Question 1

How accurate is FancyCaptions on Dutch and Flemish?

Accepted Answer

On our 2026-06-11 benchmark of 81 Dutch clips, our default engine (Scribe v2) reached a 25.3% word error rate (WER), ahead of Whisper large-v2's 28.7% on the same set and matched only by Speechmatics. Lower WER is better. That is strong Dutch and Flemish recognition relative to the common Whisper baseline. We publish the date and clip count so the figure is auditable rather than a round marketing number.

Question 2

What is word error rate (WER) and why not just quote 99%?

Accepted Answer

Word error rate is the share of words the transcript gets wrong — substitutions, deletions and insertions divided by the reference word count — so lower is better. We report WER rather than a flashy 99% accuracy figure because WER is the standard, auditable measure, and a single round percentage hides how a model behaves on accented, fast or noisy speech. We give you the real number with its date and dataset so you can judge it yourself.
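
To make the definition concrete, here is a minimal sketch of WER computed with a standard word-level edit-distance alignment. It assumes plain whitespace tokenisation and exact string matching; the function name `wer_counts` is illustrative, and the lenient scoring used in our benchmark may normalise text differently.

```python
def wer_counts(reference: str, hypothesis: str) -> dict:
    """Word-level edit distance with a substitution/deletion/insertion breakdown."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = (cost, subs, dels, ins) for aligning ref[:i] with hyp[:j]
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)          # only insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, n)              # words match, no cost
            else:
                diag = (c + 1, s + 1, d, n)      # substitution
            c, s, d, n = dp[i - 1][j]
            delete = (c + 1, s, d + 1, n)        # reference word dropped
            c, s, d, n = dp[i][j - 1]
            insert = (c + 1, s, d, n + 1)        # extra word in the transcript
            dp[i][j] = min(diag, delete, insert, key=lambda t: t[0])

    cost, subs, dels, ins = dp[-1][-1]
    return {
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
        "ref_words": len(ref),
        "wer": cost / len(ref),
    }


result = wer_counts("de kat zit op de mat", "de kat zit de mat")
print(result["deletions"], round(result["wer"], 3))  # one dropped word out of six
```

Because insertions count as errors but the denominator is the reference length, WER can exceed 100% on very poor transcripts, which is one reason it is not simply "100% minus accuracy".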

Question 3

Is FancyCaptions more accurate than Whisper on Dutch?

Accepted Answer

Yes, on our benchmark. Scribe v2 scored 25.3% WER versus Whisper large-v2's 28.7% on the same 81 Dutch clips measured 2026-06-11 — a 3.4-point improvement, almost entirely from fewer dropped words (Scribe deletes 8.6% vs Whisper's 11.9%). Switching our Dutch default from Whisper to Scribe v2 was the single biggest free accuracy win in the study.
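
The arithmetic behind that comparison can be checked in a few lines. This sketch uses only the figures quoted above and assumes the deletion rates share the same denominator as the WER figures (reference word count); the variable names are illustrative.

```python
scribe_wer, whisper_wer = 25.3, 28.7      # overall WER, percent
scribe_del, whisper_del = 8.6, 11.9       # deletion rate, percent

abs_gain = whisper_wer - scribe_wer       # improvement in WER points
rel_gain = abs_gain / whisper_wer         # improvement relative to Whisper
del_gain = whisper_del - scribe_del       # improvement from deletions alone
deletion_share = del_gain / abs_gain      # fraction of the gain due to deletions

print(f"{abs_gain:.1f} points absolute")           # 3.4 points absolute
print(f"{rel_gain:.1%} relative")                  # 11.8% relative
print(f"{deletion_share:.0%} from fewer deletions")  # 97% from fewer deletions
```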

Question 4

How accurate is FancyCaptions in English?

Accepted Answer

We benchmarked Dutch specifically, not English, so we don't publish an English WER from this study. As a rough guide, high-resource languages like English typically land around 78–81% word accuracy (roughly 19–22% WER) on real-world short-form audio, and English usually transcribes more cleanly than Dutch. The honest differentiator is Dutch and Flemish, where most tools struggle, plus the render parity, which is exact.

Question 5

What was the dataset and method?

Accepted Answer

We used 84 real short-form speech clips (81 Dutch plus 3 with translated, non-matching captions), split 42 dev / 42 test with an interleaved seed. The reference ("ground truth") was a mainstream auto-caption tool's output, scored leniently with micro-averaged WER, and no LLM correction was applied — this measures the raw ASR engine. The harness caches audio and per-config transcripts so re-runs are cheap and reproducible.
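
Two parts of that method are easy to sketch: the dev/test split and micro-averaging. The code below is an illustration under stated assumptions, not our harness. In particular, "interleaved seed" is interpreted here as a seeded shuffle followed by alternating assignment, and the function names and clip IDs are hypothetical.

```python
import random


def interleaved_split(clip_ids, seed=0):
    """Seeded shuffle, then alternate clips between dev and test (assumed scheme)."""
    ordered = sorted(clip_ids)                 # fixed starting order for reproducibility
    random.Random(seed).shuffle(ordered)
    return ordered[0::2], ordered[1::2]


def micro_wer(per_clip):
    """Pool errors and reference words over all clips, then divide once."""
    total_errors = sum(errors for errors, _ in per_clip)
    total_words = sum(words for _, words in per_clip)
    return total_errors / total_words


def macro_wer(per_clip):
    """Average of per-clip WERs, shown only for contrast."""
    return sum(errors / words for errors, words in per_clip) / len(per_clip)


dev, test = interleaved_split([f"clip_{i:03d}" for i in range(84)], seed=0)
print(len(dev), len(test))  # 42 42

# (errors, reference_words) per clip: one short clean clip, one long noisy clip
clips = [(1, 10), (30, 100)]
print(round(micro_wer(clips), 3), round(macro_wer(clips), 3))  # 0.282 0.2
```

Micro-averaging weights every word equally, so long clips influence the result more than short ones. Macro-averaging weights every clip equally, which lets a handful of very short clips swing the figure.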

Question 6

Beyond transcription, how do I know the captions render correctly?

Accepted Answer

Transcription accuracy is one half; the other is whether the on-screen caption matches what you previewed. FancyCaptions renders captions frame-for-frame matched to the leading paid caption tool, measured at zero divergence across 1,647 reference frames (552 + 1,095 across 37 styles). So the animated style you pick in the editor is exactly what exports — the preview and the export run the same render path.
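
As a rough illustration of what "zero divergence" means, the sketch below counts frames whose raw pixel data differs between a reference render and our render. It is a simplified assumption of how such a check could work, not a description of our measurement pipeline. The function name and the frames-as-bytes representation are hypothetical.

```python
def count_divergent_frames(reference_frames, rendered_frames):
    """Count frame pairs whose raw pixel bytes are not identical."""
    if len(reference_frames) != len(rendered_frames):
        raise ValueError("frame counts differ, cannot compare frame-for-frame")
    return sum(ref != out for ref, out in zip(reference_frames, rendered_frames))


# Tiny synthetic example: three 2x2 greyscale "frames" as raw bytes.
reference = [bytes([0, 0, 255, 255]), bytes([10, 20, 30, 40]), bytes([5, 5, 5, 5])]
identical = list(reference)
one_off = [reference[0], bytes([10, 20, 30, 41]), reference[2]]

print(count_divergent_frames(reference, identical))  # 0
print(count_divergent_frames(reference, one_off))    # 1

# The reference frame total quoted above is the sum of its two parts.
assert 552 + 1095 == 1647
```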

Engine / config	WER (lower is better)	Notes
Scribe v2 + keyterm glossary (ours)	25.0%	Our best config: customer glossary biasing
Scribe v2 (ours, default for NL/Flemish)	25.3%	Recommended default, temperature 0
Speechmatics (enhanced)	25.3%	Co-leader; configured alternative
Scribe v1	26.2%	~1pt behind v2
AssemblyAI Universal-2	27.0%	
Whisper large-v2	28.7%	The previous prod default; worst here

Config	Dev WER	Test WER
Scribe v2 + keyterms	22.6%	27.3%
Scribe v2	22.9%	27.6%
Speechmatics	24.4%	27.0%
Whisper	27.4%	30.0%
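
The dev and test columns above can be compared directly to see how much each config degrades on held-out clips. This is a small sketch using only the table's figures; the dictionary layout and names are illustrative.

```python
# (dev WER, test WER) in percent, copied from the table above
results = {
    "Scribe v2 + keyterms": (22.6, 27.3),
    "Scribe v2": (22.9, 27.6),
    "Speechmatics": (24.4, 27.0),
    "Whisper": (27.4, 30.0),
}

# Gap between test and dev WER in points; a larger gap means the dev score
# flatters the config more.
gaps = {name: round(test - dev, 1) for name, (dev, test) in results.items()}

for name, gap in gaps.items():
    dev, test = results[name]
    print(f"{name}: dev {dev:.1f}%, test {test:.1f}%, gap {gap:+.1f} pts")
```

Every config scores worse on test than on dev, so the gap is better read as the test clips being harder than as one engine being tuned to the dev set. The size of the gap still differs between configs, which is worth checking before treating small differences in the headline figure as meaningful.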

