Aqua Voice vs Superwhisper summary
Aqua Voice and Superwhisper are both dictation apps for Mac, Windows, and iPhone. The difference is architecture: Superwhisper runs speech models on-device for privacy and offline use, while Aqua runs its own model, Avalon, in the cloud for real-time speed and published accuracy (97.3% on technical terms in our AISpeak benchmark; #1 proprietary on the OpenASR leaderboard at its October 2025 debut). Superwhisper offers a $249.99 lifetime license; Aqua Pro is $8/month billed annually.
Why we built our own speech model
When we started dictating code and AI prompts, the off-the-shelf speech models most dictation tools build on kept mangling technical terms. They're good for everyday speech, but they weren't trained for the vocabulary we use all day.
Generic, off-the-shelf models we tested would turn "Claude Code" into "clawed code," "git checkout" into "get check out," and "kubectl" into various things.
So we built our own model, Avalon. Not because we wanted to. Training speech models is hard and expensive. But off-the-shelf models weren't good enough for how we work.
The numbers
We built AISpeak, a benchmark of real clips of developers and AI researchers talking naturally, and measured Avalon and Whisper Large v3 on it:
| Model | Accuracy |
| --- | --- |
| Avalon (ours) | 97.3% |
| Whisper Large v3 | 65.1% |
On the standard OpenASR benchmark suite, Avalon also outperforms Whisper Large v3. Whisper is one of the open models you can run on-device in Superwhisper, so the AISpeak result compares Avalon with it on the vocabulary that matters most to developers and power users.
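To make the gap in the AISpeak table easier to read, the two accuracy figures can be turned into error rates. This is a rough sketch, assuming the reported accuracy is a simple percentage so that the error rate is 100 minus accuracy; the exact AISpeak scoring method is not described here, and the function name is illustrative.

```python
# Accuracy figures copied from the AISpeak table above.
ACCURACY_PCT = {
    "Avalon": 97.3,
    "Whisper Large v3": 65.1,
}


def error_rate_pct(accuracy_pct: float) -> float:
    """Assumption: error rate is simply 100 minus the reported accuracy."""
    return round(100.0 - accuracy_pct, 1)


avalon_err = error_rate_pct(ACCURACY_PCT["Avalon"])
whisper_err = error_rate_pct(ACCURACY_PCT["Whisper Large v3"])

# How many errors Whisper makes for each error Avalon makes.
error_ratio = whisper_err / avalon_err

print(f"Avalon error rate:  {avalon_err}%")
print(f"Whisper error rate: {whisper_err}%")
print(f"Ratio of error rates: about {error_ratio:.1f}x")
```

Under that assumption, the error rates are 2.7% and 34.9%, so Whisper Large v3 makes roughly 13 errors on this benchmark for every one Avalon makes.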
What speech model does Superwhisper use?
Superwhisper doesn't run a single speech model. It lets you choose: a selection of models run on-device (including open Whisper-based models), or you can connect a cloud model with your own API key (BYOK). That flexibility is the point, and it's why your accuracy and speed depend on which model you pick and how powerful your device is.
We took the opposite path. We trained one model, Avalon, specifically for the speech developers and technical users produce daily, and we run it server-side so it can be larger and more capable than what fits on a laptop. We'll be upfront about the tradeoff: your audio is processed in the cloud rather than staying on your device.
You can see exactly how Avalon performs on the OpenASR leaderboard and access it directly through our API. When a model's benchmarks are public, you can verify the accuracy claims yourself.
Cloud real-time vs on-device privacy
Superwhisper processes audio on your device by default. That's genuinely better for privacy: your audio never leaves the machine, and dictation works with no internet connection. On-device, the larger and more accurate models run slower, so speed depends on your hardware. You can instead bring your own cloud model (BYOK), at the cost of sending audio off-device.
Aqua processes audio in the cloud. The cost is that you need an internet connection and your audio is sent to our servers (we're SOC 2 Type II certified, and Team plans add an org-wide Privacy Mode). The benefit is real-time streaming on a large model that no laptop can match, plus screen-context awareness: the app reads the active window, so it knows you're saying programming terms in your editor and casual text in Messages.
That's the trade in one line: Superwhisper keeps everything on-device; Aqua trades some of that on-device privacy for materially better accuracy and real-time speed.
Where Superwhisper has the edge
Superwhisper runs on-device by default, so your audio stays on your device and dictation works offline. If keeping audio fully local is non-negotiable, that's a real advantage Aqua doesn't offer.
It also sells a one-time lifetime license ($249.99), supports more languages (100+ vs our 49), and lets you bring your own model (BYOK). It has a devoted Mac power-user community. If any of those is what you're optimizing for, Superwhisper is the honest pick.
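If you are weighing the lifetime license against a subscription, the break-even point is simple arithmetic. The sketch below uses only the prices quoted in this comparison ($249.99 lifetime, $84.99/year for Superwhisper Pro, $96/year for Aqua Pro) and assumes those prices stay fixed over time, which may not hold. The function names are illustrative.

```python
LIFETIME_USD = 249.99            # Superwhisper lifetime license
SUPERWHISPER_YEARLY_USD = 84.99  # Superwhisper Pro, annual price
AQUA_YEARLY_USD = 96.00          # Aqua Pro, $8/month billed annually


def break_even_years(one_time: float, yearly: float) -> float:
    """Years of subscription payments that add up to the one-time price."""
    return one_time / yearly


def cumulative_cost(yearly: float, years: int) -> float:
    """Total paid on a yearly subscription after a whole number of years."""
    return round(yearly * years, 2)


vs_superwhisper = break_even_years(LIFETIME_USD, SUPERWHISPER_YEARLY_USD)
vs_aqua = break_even_years(LIFETIME_USD, AQUA_YEARLY_USD)

print(f"Lifetime vs Superwhisper Pro yearly: {vs_superwhisper:.1f} years")
print(f"Lifetime vs Aqua Pro yearly:         {vs_aqua:.1f} years")
print(f"Three years of Aqua Pro:             ${cumulative_cost(AQUA_YEARLY_USD, 3):.2f}")
```

At these prices the lifetime license costs about as much as 2.9 years of Superwhisper Pro or 2.6 years of Aqua Pro, so it pays off mainly if you expect to use the same tool for three years or more.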
Where we're better
Published, verifiable accuracy
Avalon ranked #1 among proprietary systems on the public OpenASR leaderboard at its October 2025 debut (#6 overall), and scores 97.3% on our AISpeak benchmark. You can check the numbers yourself. Superwhisper's accuracy depends on the model you pick and isn't published.
Real-time speed
Avalon runs server-side, so it streams text as you speak on a model larger than a laptop can run. On-device tools are gated by your device's hardware: the more accurate the local model, the slower it runs.
Screen-context awareness
The desktop app reads the active window to match what you're doing: code-aware in your editor, casual in Messages. On-device dictation treats every app the same.
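One way to picture context-aware dictation, purely as an illustration: look up the frontmost app and pick a formatting profile for it. Nothing below comes from Aqua's implementation. The profile fields, the app list, and the fallback rule are all hypothetical, and the real app may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictationProfile:
    """Hypothetical formatting preferences for one kind of app."""
    name: str
    keep_technical_casing: bool  # e.g. leave "kubectl" and "git checkout" as-is
    auto_punctuate: bool         # add sentence punctuation automatically


CODE_PROFILE = DictationProfile("code", keep_technical_casing=True, auto_punctuate=False)
CASUAL_PROFILE = DictationProfile("casual", keep_technical_casing=False, auto_punctuate=True)

# Hypothetical mapping from the active app's name to a profile.
APP_PROFILES = {
    "Visual Studio Code": CODE_PROFILE,
    "Terminal": CODE_PROFILE,
    "Messages": CASUAL_PROFILE,
}


def profile_for(active_app: str) -> DictationProfile:
    """Pick a profile for the active app, falling back to casual text."""
    return APP_PROFILES.get(active_app, CASUAL_PROFILE)


for app in ("Terminal", "Messages", "Notes"):
    print(app, "->", profile_for(app).name)
```

The point of the sketch is only that the same spoken words can be formatted differently depending on where the text will land.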
Technical and AI vocabulary
If you dictate code, prompts, terminal commands, or framework names, Avalon was trained for exactly that. It's the core of why we built our own model.
A model developers can build on
The same model is available through the Avalon API, which is OpenAI-compatible and costs $0.39 per hour of audio. Superwhisper relies on whichever model you choose or bring rather than shipping one of its own.
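For budgeting, the per-hour rate converts directly into a cost per recording. This is a minimal sketch that assumes billing is strictly proportional to audio length, with no minimum charge or rounding to whole minutes; only the $0.39/hour figure comes from this page, and the function name is illustrative.

```python
AVALON_USD_PER_AUDIO_HOUR = 0.39  # rate quoted above


def avalon_api_cost(audio_seconds: float) -> float:
    """Estimated cost in USD, assuming strictly pro-rata billing."""
    hours = audio_seconds / 3600
    return round(hours * AVALON_USD_PER_AUDIO_HOUR, 4)


ninety_minutes = avalon_api_cost(90 * 60)
hundred_hours = avalon_api_cost(100 * 3600)

print(f"90 minutes of audio: ${ninety_minutes:.3f}")
print(f"100 hours of audio:  ${hundred_hours:.2f}")
```

Under that assumption, a 90-minute recording costs about $0.59 and 100 hours of audio costs $39.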
Aqua Voice vs Superwhisper: Feature-by-feature comparison
| Feature | Aqua Voice | Superwhisper |
| --- | --- | --- |
| Speech model | Avalon (own, purpose-trained) | Your choice (on-device or BYOK cloud) |
| Processing | Cloud (server-side) | On-device by default |
| Public benchmark | #1 proprietary on OpenASR (Oct 2025 debut) | Not published |
| Technical-term accuracy | 97.3% (AISpeak, our benchmark) | Depends on chosen model |
| Real-time output | ✅ Streams as you speak | On-device, hardware-dependent |
| Screen-context awareness | ✅ | ❌ |
| Works offline | ❌ (cloud) | ✅ (on-device) |
| Mac/Windows/iPhone | ✅ | ✅ |
| Languages | 49 | 100+ |
| Lifetime license | ❌ | ✅ $249.99 |
| Pro price | $8/mo billed annually ($96/yr) | $8.49/mo or $84.99/yr |
| Developer API | Avalon API ($0.39/hr) | BYOK only (no first-party model) |
| Security model | Cloud, SOC 2 Type II | On-device (audio stays local) |
| Best for | Accuracy, real-time, technical/AI users | On-device privacy, lifetime pricing |
How to decide
Pick Aqua Voice if: You dictate code, AI prompts, or technical terms; you want accuracy you can verify against public benchmarks; you want real-time streaming and screen-context awareness on a model bigger than your laptop can run.
Pick Superwhisper if: Keeping your audio fully on-device is non-negotiable; you want to work offline; you'd rather pay once for a lifetime license; you need a language outside our 49; or you want to bring your own model.
Try both. They're both free to start.
If you work primarily with code or AI tools and care about accuracy you can verify, you'll notice Avalon quickly, especially on technical terms and real-time streaming.
If keeping your audio fully on-device matters more than anything, or you'd rather pay once for a lifetime license, Superwhisper is the better fit.
Spend 10 minutes with each, dictating what you actually type day-to-day. The right choice will be obvious.
Frequently asked questions
Is Aqua Voice better than Superwhisper?
For accuracy and speed, yes. Aqua Voice runs its own model, Avalon, in the cloud with real-time streaming, screen-context awareness, and published accuracy benchmarks you can verify. Superwhisper runs general models on-device, which is great for offline privacy but leaves accuracy and speed up to your hardware and the model you pick. If you dictate code, AI prompts, or technical terms, Aqua Voice is the stronger choice.
What's the difference between Aqua Voice and Superwhisper?
Aqua Voice trained and runs its own speech model, Avalon, in the cloud, which enables real-time streaming, screen-context awareness, and a larger, more accurate model than most laptops can run locally. Superwhisper runs a choice of general models on your device. Aqua optimizes for accuracy and speed you can verify; Superwhisper optimizes for keeping audio on-device. Both work on Mac, Windows, and iPhone.
Is Aqua Voice more accurate than Superwhisper?
Aqua Voice publishes its accuracy and Superwhisper doesn't. Avalon ranked #1 among proprietary systems on the public OpenASR leaderboard at its October 2025 debut (#6 overall) and scores 97.3% on AISpeak, our benchmark of AI and technical terms. Superwhisper's accuracy depends on whichever model you run and isn't published, so when you need accuracy you can verify, especially on code, commands, and AI terms, Aqua Voice is the safer bet.
What speech model does Aqua Voice use?
Aqua Voice runs Avalon, a model we trained ourselves for technical and AI vocabulary, instead of bundling a general off-the-shelf model. It runs server-side, so it can be larger and more capable than a model that fits on a laptop, and we publish its results on the public OpenASR leaderboard. Developers can use the same model through the Avalon API.
Is Aqua Voice private and secure?
Yes, with one caveat about where audio is processed. Aqua Voice is SOC 2 Type II certified, so our security practices are independently audited, and Team plans add an org-wide Privacy Mode. Audio is processed in the cloud, not on your device, because that is where the Avalon model runs. If you need audio to never leave your machine at all, a fully on-device tool is the alternative.
How much does Aqua Voice cost?
Aqua Voice is free to start with 1,000 words. Pro is $8/month billed annually ($96/year) for unlimited dictation and the Avalon model, Team is $12/seat/month, and Enterprise is custom. Students get 70% off with a .edu email, about $2.40/month. It's a competitive price for a purpose-built model with published accuracy benchmarks.
Does Aqua Voice work on Windows and iPhone?
Yes. Aqua Voice runs on Mac, Windows, and iPhone, and one subscription covers all your devices. On iPhone it installs as a custom keyboard, so you can dictate directly into any app. Aqua Voice doesn't support Android.
Is Aqua Voice good for coding and AI prompting?
Yes, it's what Aqua Voice is built for. Avalon was trained on the speech developers and AI users actually produce, and it reaches 97.3% accuracy on technical terms such as programming keywords, framework names, and CLI commands (measured on AISpeak, our benchmark). The desktop app also reads the active window for context (code-aware in your editor, casual in Messages), and the same model is available to developers through the Avalon API.
