Medmorf — Medical Data Transformation

Text Translation

File Translation

Drop .xlsx or .docx file here, or browse

Download example file

Detect and replace personal data (PII) in medical documents — entirely in your browser. Nothing is uploaded.

Auto-detecting best method… Checking your hardware.

Checking WebGPU…

Checking resource headroom…

Advanced settings — model selection

Active pipeline: auto. Turn on one detector, one LLM, or both.

NER / detector

CPU-friendly span detectors

LLM

WebGPU Qwen verification

How it works · models & hardware

LLM only (recommended on WebGPU devices): Qwen3 handles all PII detection in a single pass. Best accuracy, especially with the 4B or 8B model.

NER + LLM: Run a NER model first, then let the LLM verify and catch remaining PII. Useful for additional coverage or comparison.

NER only: Run just the selected NER model. Fastest option, no GPU required, but lower accuracy.
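
As a rough illustration of the NER + LLM mode, the sketch below derives the four Detection Breakdown lists (NER Found, LLM Found, NER Filtered, LLM Added Beyond NER) from two detector outputs. The `Entity` shape, the `detectionBreakdown` name, and the rule that the LLM's list is the final one are assumptions made for this example; the app's real merge logic is not shown here.

```typescript
// Hypothetical entity shape; the app's real data structures are not documented.
type Entity = { text: string; type: string };

interface Breakdown {
  nerFound: Entity[];    // everything the NER model reported
  llmFound: Entity[];    // everything the LLM reported
  nerFiltered: Entity[]; // NER hits the LLM did not confirm (treated as false positives)
  llmAdded: Entity[];    // LLM hits that the NER model missed
}

// Compare entities by trimmed, lower-cased text so "Jan Jansen" matches "jan jansen ".
const keyOf = (e: Entity): string => e.text.trim().toLowerCase();

function detectionBreakdown(ner: Entity[], llm: Entity[]): Breakdown {
  const nerKeys = new Set(ner.map(keyOf));
  const llmKeys = new Set(llm.map(keyOf));
  return {
    nerFound: ner,
    llmFound: llm,
    // Assumption: the LLM has the final say, so unconfirmed NER hits are dropped.
    nerFiltered: ner.filter((e) => !llmKeys.has(keyOf(e))),
    llmAdded: llm.filter((e) => !nerKeys.has(keyOf(e))),
  };
}
```

Matching on normalized text alone is a deliberate simplification; a real pipeline would probably also compare character offsets and entity types.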

Hardware: LLM modes need a modern GPU or Apple Silicon Mac with WebGPU (Chrome/Edge 113+, Safari 18+). The 4B model needs ~3.4 GB GPU memory; pick a smaller model on devices with limited VRAM. NER-only mode runs on CPU.

Qwen3 0.6B — ~1.4 GB VRAM. Smallest & fastest LLM. Good for quick scans.
Qwen3 1.7B (preferred default when feasible) — ~2 GB VRAM. Good balance of speed and quality.
Qwen3 4B — ~3.4 GB VRAM. Higher quality, needs more headroom.
Qwen3 8B — ~5.7 GB VRAM. Highest quality, needs a powerful GPU.
Multilingual PII NER — ~280 MB. XLM-RoBERTa, no GPU needed. Names, addresses, dates, IDs in 8+ languages.
GLiNER PII Edge — ~46 MB. Zero-shot, no GPU needed. Best for English.
Multilingual BERT NER — ~100 MB. Lighter general-purpose NER (people, places, organizations).
OpenAI Privacy Filter — ~1.5 B params (q4 ≈ 800 MB). OpenAI's bidirectional token classifier with 8 PII categories: person, email, phone, address, date, URL, account number, secret. Runs in-browser via Transformers.js + WebGPU; WASM/CPU is not supported for its quantized embedding op in this browser stack. Primarily English with multilingual robustness reported. Pick it under Advanced settings → NER Model with the NER only or NER + LLM pipeline.
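
One way to read "preferred default when feasible" is sketched below: try Qwen3 1.7B first, step down to a smaller model if it does not fit, and fall back to NER only when nothing fits. The VRAM figures are copied from the list above; the `pickDefaultModel` name, the 0.5 GB headroom, and the idea that the caller can supply an available-memory estimate are assumptions (browsers do not report free VRAM directly), so this is not the app's actual auto-detection.

```typescript
interface LlmOption {
  name: string;
  vramGb: number; // approximate VRAM need, taken from the model list
}

// Ordered from smallest to largest.
const QWEN3_MODELS: LlmOption[] = [
  { name: "Qwen3 0.6B", vramGb: 1.4 },
  { name: "Qwen3 1.7B", vramGb: 2.0 },
  { name: "Qwen3 4B", vramGb: 3.4 },
  { name: "Qwen3 8B", vramGb: 5.7 },
];

// Returns the preferred model if it fits, otherwise the next smaller one that fits.
// Returns null when no LLM fits, which a caller could treat as "use NER only".
function pickDefaultModel(
  availableGb: number,
  headroomGb = 0.5, // hypothetical safety margin
  preferred = "Qwen3 1.7B",
): LlmOption | null {
  const budget = availableGb - headroomGb;
  const start = QWEN3_MODELS.findIndex((m) => m.name === preferred);
  for (let i = start; i >= 0; i--) {
    const model = QWEN3_MODELS[i];
    if (model && model.vramGb <= budget) return model;
  }
  return null;
}
```

For example, `pickDefaultModel(8)` selects Qwen3 1.7B, while `pickDefaultModel(2)` steps down to Qwen3 0.6B. Larger models stay an explicit user choice under Advanced settings.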

The first load downloads the selected model; subsequent runs use the browser cache.

⚠️ Disclaimer: Anonymization is not guaranteed to detect all PII. Always manually review the output before sharing documents. This tool is an aid, not a replacement for human review.

Medical Document

Drop document here

.pdf, .xlsx, .docx, .txt

Mapping File (optional)

Drop mapping file

.xlsx or .json

No mapping loaded

Example document · Example mapping (.xlsx)

Or paste text directly

Anonymization Complete

Detection Breakdown

NER Found

Entity	Type

LLM Found

Entity	Type

NER Filtered (False Positives Removed by LLM)

Entity	Type

LLM Added Beyond NER

Entity	Type

Entity Mapping

Original	Type	Replacement	Actions
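
To illustrate how a mapping with Original, Type, and Replacement columns could be applied to a document, here is a minimal sketch. The `MappingRow` shape, the `applyMapping` name, and placeholder tokens such as `[PERSON_1]` are hypothetical; the app's actual replacement format is not specified here.

```typescript
interface MappingRow {
  original: string;
  type: string;
  replacement: string;
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function applyMapping(text: string, rows: MappingRow[]): string {
  // Longest originals first, so "Jan Jansen" wins over the shorter "Jan".
  const sorted = rows
    .filter((r) => r.original.length > 0)
    .sort((a, b) => b.original.length - a.original.length);
  if (sorted.length === 0) return text;

  const lookup = new Map<string, string>(
    sorted.map((r): [string, string] => [r.original, r.replacement]),
  );
  // One combined pattern means a single pass over the text, so text that
  // was already replaced is never scanned again.
  const pattern = new RegExp(sorted.map((r) => escapeRegExp(r.original)).join("|"), "g");
  return text.replace(pattern, (match) => lookup.get(match) ?? match);
}
```

Matching here is exact and case-sensitive; spelling variants of a name would need their own mapping rows.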

Preview

Hardware: Requires WebGPU (Chrome/Edge 113+, Safari 18+). Uses the same Qwen3 models as anonymization.

Generate structured reports from medical documents using WebLLM + WebGPU — entirely in your browser.

Template

LLM Model

Upload Document

Drop document here

.xlsx, .docx, .txt

Example intake interview (NL)

Or Paste Text

Offline STT: Whisper speech recognition runs entirely in your browser via WebAssembly. No audio data is sent anywhere.

Transcribe audio recordings or live microphone input using OpenAI Whisper via Transformers.js — entirely offline.

Mode

Whisper Model

Language

Model details

Whisper Tiny — ~150 MB. Fastest, basic quality.
Whisper Base — ~300 MB. Fair quality.
Whisper Small (recommended) — ~500 MB. Good quality, best for clinical use.

All models run via WebAssembly (ONNX Runtime). First use downloads the model; subsequent runs use browser cache.

Record Audio

Or Upload Audio

Drop audio file here

.mp3, .wav, .ogg, .webm, .m4a, .flac

#	Start	End	Duration	Transcription
No entries yet. Press Record to start.

Index and sort unsorted DICOM data from any directory or network path. Smart scanning uses file size, naming, and folder structure heuristics to avoid opening every file.

Source Directory

Scan Settings

Smart Scan: sample groups of similarly sized files instead of parsing each one. Batch size:

Test mode — limit scan & sort to a small number of files for quick testing
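
The size-and-folder heuristic behind Smart Scan might look roughly like the sketch below: files in the same folder whose sizes fall into the same bucket form one group, and only one sample per group is parsed. The `FileEntry` shape, the 4096-byte tolerance, and the function names are assumptions for illustration; the naming heuristic and the Batch size setting are not modeled.

```typescript
interface FileEntry {
  path: string;      // forward-slash path, e.g. "/scans/a/IM0001"
  sizeBytes: number;
}

interface ScanGroup {
  sample: FileEntry;    // the one file that would actually be opened and parsed
  members: FileEntry[]; // all files assumed to share the sample's metadata
}

// Hypothetical grouping key: parent folder plus a size bucket.
function groupKey(f: FileEntry, toleranceBytes: number): string {
  const slash = f.path.lastIndexOf("/");
  const folder = slash >= 0 ? f.path.slice(0, slash) : "";
  const bucket = Math.floor(f.sizeBytes / toleranceBytes);
  return `${folder}|${bucket}`;
}

function planSmartScan(files: FileEntry[], toleranceBytes = 4096): ScanGroup[] {
  const groups = new Map<string, ScanGroup>();
  for (const f of files) {
    const key = groupKey(f, toleranceBytes);
    const group = groups.get(key);
    if (group) {
      group.members.push(f);
    } else {
      groups.set(key, { sample: f, members: [f] });
    }
  }
  return Array.from(groups.values());
}
```

With this plan, the number of files opened equals the number of groups rather than the number of files. The trade-off is that a misgrouped file inherits the wrong metadata, which is why sampling is a heuristic and not a guarantee.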

Merge PDF documents

Combine several PDF files into one document. Everything stays on your computer — files are never uploaded anywhere.

1. Add your PDF files

Drop one or more .pdf files here, or browse

2. Put them in the right order

Use the arrows to move a file up or down. The file at the top comes first in the merged document.

3. Save the merged file

File name

Download models in advance so you can work offline when patients arrive. Inspect and manage all browser-stored data below.

Prepare for Offline Use

Download all models now while you have internet. Once cached, everything works in airplane mode.

Translation Model: NLLB-200 · ~300 MB. Translates between 200+ languages.
NER Model: Multilingual PII NER · ~280 MB. Detects names, addresses, and IDs in 8+ languages.
Anonymization LLM: Qwen3 1.7B · ~2 GB. Detects and anonymizes PII via WebGPU.
Speech Model: Whisper · ~500 MB. Offline speech-to-text transcription.
OCR Engine: Tesseract.js + English traineddata · ~25 MB. Text recognition for scanned PDFs (used by PDF anonymization).
PDF Engine: PDF.js 4.7.76 · ~1.5 MB. Reads and rasterizes PDFs (used by PDF anonymization and merge).

App Updates

Force-refresh the app code without redownloading model weights. Useful after a new version is deployed.

Note: Private/incognito browser windows discard all caches when closed — including model weights. To keep models between sessions, use a normal window.

Personal Data — Where It Goes

Your documents, text, and patient data are never written to disk or browser storage. They exist only in temporary JavaScript memory and are cleared automatically when you close or refresh the page, or after 30 minutes of inactivity.

You upload a file or paste text: the file is read into JavaScript memory (RAM) only.

↓

AI model processes data locally: translation and anonymization run inside your browser, on your own CPU/GPU.

↓

You download the result: output goes to your Downloads folder, under your control.

↓

Data is cleared from memory: automatically on page close, or when you click the button below.

Currently in memory

No personal data in memory

Auto-clears when you close or refresh this page

Auto-clears after 30 minutes of inactivity

Never written to Cache API, IndexedDB, localStorage, or cookies
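
The 30-minute inactivity rule could be implemented along the lines of the sketch below, which keeps session data in a plain in-memory map and wipes it once no activity has been recorded for the timeout period. The `SessionMemory` class and its method names are hypothetical, and the injectable clock exists only to make the behavior easy to check; closing or refreshing the page needs no code, because JavaScript memory is discarded with the page.

```typescript
class SessionMemory {
  private items = new Map<string, string>();
  private lastActivityMs: number;
  private readonly timeoutMs: number;
  private readonly now: () => number;

  constructor(timeoutMs = 30 * 60 * 1000, now: () => number = () => Date.now()) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastActivityMs = now();
  }

  // Record user activity (upload, paste, click) to restart the idle countdown.
  touch(): void {
    this.lastActivityMs = this.now();
  }

  put(key: string, value: string): void {
    this.touch();
    this.items.set(key, value);
  }

  get size(): number {
    return this.items.size;
  }

  // Manual clear, e.g. wired to a "clear memory" button.
  clear(): void {
    this.items.clear();
  }

  // Intended to be called periodically, e.g. from setInterval.
  // Returns true when the idle timeout was reached and data was wiped.
  clearIfIdle(): boolean {
    if (this.now() - this.lastActivityMs >= this.timeoutMs) {
      this.clear();
      return true;
    }
    return false;
  }
}
```

Nothing in this sketch touches the Cache API, IndexedDB, localStorage, or cookies, which matches the guarantee stated above.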

Total storage used Scanning...

Cache API entries —

IndexedDB databases —

Cache API (Model Files)

Translation and NER models are cached here. These are safe AI model weights — no personal data.

IndexedDB (LLM Model Cache)

WebLLM/MLC caches Qwen model weights here. These are safe AI model data — no personal data.

This will remove all cached models. You will need to re-download them on next use.
