Whisper Local
WhisperKit on-device
100% private. Audio never leaves your Mac. Works without an internet connection.
- Local
- Free
- Offline
Trigger recording from anywhere, transcribe with Apple, WhisperKit, Google Cloud Chirp 3, Deepgram, or ElevenLabs Scribe Realtime v2, then let Gemini polish the result before it pastes itself into whatever you're writing.
Local, cloud, and realtime engines share one workflow. Stay local for privacy, swap to the cloud for accuracy, or stream in real time — without leaving the app.
WhisperKit on-device
100% private. Audio never leaves your Mac. Works without an internet connection.
Online · Apple servers
Free and ready to use. Sends audio to Apple's recognition service when online.
Bring your own GCP project
Maximum precision. Uses Google Cloud Speech-to-Text with your own credentials.
Batch or real-time streaming
Nova-3 for high-accuracy batch transcription, or Flux Live for low-latency streaming with WAV backup.
Scribe batch or WebSocket
Use Scribe v2 for accurate batch transcription, or Scribe Realtime v2 for committed-text streaming.
SapoWhisper runs each transcript through Gemini 3.1 Flash-Lite on Vertex AI. Pick a mode, plug in your project context, and get a clean message instead of a raw dictation dump.
Create editable modes for Codex, Claude Code, Slack, issues, or any dictation flow.
Add product names, commands and frequent corrections so Deepgram and Gemini stop guessing.
One profile that explains your role and tools — applied to every AI mode without inventing details.
Raw dictation: "uh send to the team like the latest mockups before lunch i think and also push the release branch to github and we can talk after standup"
Polished by gemini-3.1-flash-lite: "Sending the latest mockups to the team before lunch. Also pushing the release branch to GitHub. Let's catch up after standup."
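To make the idea concrete, here is a minimal sketch of how a mode, a custom vocabulary, and a personal profile could be folded into a single polishing prompt. It is an illustration only, not SapoWhisper's actual implementation: the names `Mode`, `PolishContext`, and `build_polish_prompt` are hypothetical, and the prompt wording is assumed. No model is called; the sketch only assembles the text that would be sent.

```python
from dataclasses import dataclass, field


@dataclass
class Mode:
    """Hypothetical editable mode, e.g. one for Slack or for issues."""
    name: str
    instruction: str


@dataclass
class PolishContext:
    """Hypothetical container for the custom vocabulary and the user profile."""
    vocabulary: list[str] = field(default_factory=list)
    profile: str = ""


def build_polish_prompt(raw: str, mode: Mode, ctx: PolishContext) -> str:
    """Assemble one text prompt from the mode, the context, and the raw transcript."""
    parts = [
        f"Mode: {mode.name}",
        mode.instruction,
        # Assumed guard-rail wording; it mirrors the "without inventing details" promise.
        "Do not add details that are not in the transcript.",
    ]
    if ctx.profile:
        parts.append(f"About the speaker: {ctx.profile}")
    if ctx.vocabulary:
        parts.append("Preferred spellings: " + ", ".join(ctx.vocabulary))
    parts.append(f"Transcript:\n{raw.strip()}")
    return "\n\n".join(parts)


# Example usage with made-up values.
slack = Mode("Slack", "Rewrite the dictation as a short, clear team message.")
ctx = PolishContext(vocabulary=["GitHub", "standup"],
                    profile="Designer on a small product team")
print(build_polish_prompt("uh send to the team like the latest mockups", slack, ctx))
```

Keeping the profile and vocabulary separate from the mode means the same context can be reused across every mode, which matches the "one profile, applied to every AI mode" idea above.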
Engines and AI handle the transcript. These are the everyday touches that make SapoWhisper feel like part of your Mac, not a separate tool.
Press ⌥ + Space (or your own shortcut) from any app — the recording overlay appears without stealing focus.
The polished transcript lands in your clipboard and drops directly where your cursor is, so dictation feels like typing.
Preferred microphone sync, mic test, gain controls, audio cues and auto-ducking that lowers system volume while you talk.
Switch the interface and the transcription language between Spanish, English, or Auto, and recognise both languages without restarting.
SapoWhisper keeps working after the paste. Search past transcripts, replay the original audio, pin important entries, and run the same clip through another engine when you want a better result.
Find older transcripts quickly instead of dictating the same thing twice.
Listen back before sharing, editing, or comparing transcription quality.
Re-process the same recording when accuracy matters more than speed.
Keep the original audio, compare engines, and export the result you actually want to keep.
Trigger, capture, transcribe, refine, paste. SapoWhisper hides the complexity so you can stay in your editor.
⌥ + Space (or your own shortcut) brings up the recording overlay from any app.
Preferred mic, gain control, auto-ducking and pause / resume when you need it.
Apple, WhisperKit, Google Cloud Chirp 3, Deepgram Nova-3 / Flux Live, or ElevenLabs Scribe Realtime v2.
Gemini 3.1 Flash-Lite on Vertex AI cleans up the transcript without inventing details.
Auto-paste lands the result where your cursor is. Everything stays searchable.
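The steps above can be sketched as a small pipeline with swappable engines. This is a rough illustration under stated assumptions, not the app's real code: `DictationPipeline`, `stand_in_polish`, and the stub engines are hypothetical, and the real engines and the Gemini polish step are replaced by plain functions so the example runs on its own.

```python
from typing import Callable, Dict, List

# An engine turns recorded audio into raw text; a polisher turns raw text into clean text.
Engine = Callable[[bytes], str]
Polisher = Callable[[str], str]


class DictationPipeline:
    """Hypothetical pipeline: transcribe, refine, keep a searchable history."""

    def __init__(self, engines: Dict[str, Engine], polish: Polisher) -> None:
        self.engines = engines
        self.polish = polish
        self.history: List[dict] = []

    def run(self, audio: bytes, engine_name: str) -> str:
        if engine_name not in self.engines:
            raise KeyError(f"unknown engine: {engine_name}")
        raw = self.engines[engine_name](audio)   # transcribe
        clean = self.polish(raw)                 # refine
        # Keep the original audio so the clip can be re-run through another engine.
        self.history.append(
            {"engine": engine_name, "audio": audio, "raw": raw, "clean": clean}
        )
        return clean  # the caller would paste this at the cursor

    def search(self, query: str) -> List[dict]:
        """Case-insensitive substring search over polished transcripts."""
        q = query.lower()
        return [entry for entry in self.history if q in entry["clean"].lower()]


def stand_in_polish(raw: str) -> str:
    """Stand-in for the AI polish step: drops fillers, capitalises, adds a period."""
    words = [w for w in raw.split() if w not in {"uh", "um"}]
    text = " ".join(words)
    return text[:1].upper() + text[1:] + "."


# Stub engines that ignore the audio and return fixed text.
pipeline = DictationPipeline(
    engines={
        "local": lambda audio: "uh push the release branch",
        "cloud": lambda audio: "push the release branch to github",
    },
    polish=stand_in_polish,
)
clip = b"\x00\x01"
print(pipeline.run(clip, "local"))
print(pipeline.run(clip, "cloud"))  # same clip, different engine
```

Because every engine shares one function signature, re-processing a recording is just another call to `run` with a different engine name.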
Pick the transcription engine that fits the moment, let Gemini polish the result, and start writing with your voice from anywhere on macOS.