Transcription time estimator
Expected wall-clock time to transcribe on-device using WhisperKit Large v3 Turbo (or SpeechAnalyzer on macOS 26+, which is roughly 40% faster again). Numbers assume steady-state, thermally nominal conditions.
| Chip | 15 min source | 30 min source | 60 min source | 120 min source | 240 min source | Throughput (× real-time) |
|---|---|---|---|---|---|---|
| M1 / M1 Pro | 2 min | 4 min | 8 min | 15 min | 30 min | 8× |
| M1 Max / Ultra | 1 min | 3 min | 5 min | 11 min | 22 min | 11× |
| M2 / M2 Pro | 1 min | 2 min | 5 min | 9 min | 18 min | 13× |
| M2 Max / Ultra | 1 min | 2 min | 4 min | 8 min | 15 min | 16× |
| M3 / M3 Pro | 1 min | 2 min | 4 min | 7 min | 14 min | 17× |
| M3 Max | 1 min | 1 min | 3 min | 5 min | 11 min | 22× |
| M4 / M4 Pro | 1 min | 1 min | 2 min | 5 min | 10 min | 25× |
| M4 Max | <1 min | 1 min | 2 min | 4 min | 8 min | 32× |
| M5 Pro | <1 min | 1 min | 2 min | 3 min | 6 min | 40× |
Reading this table: a 60-minute podcast on an M2 Pro takes about 4 minutes 37 seconds (60 ÷ 13), because the chip transcribes 13 hours of audio per real-time hour.
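For durations not in the table, the arithmetic is just source length divided by the chip's real-time multiple. Here is a minimal Swift sketch of that calculation; the `throughput` dictionary and `estimatedTranscriptionMinutes` function are hypothetical illustrations built from the figures above, not part of WhisperKit's API:

```swift
import Foundation

// Hypothetical lookup of the real-time multiples from the table above.
let throughput: [String: Double] = [
    "M1": 8, "M1 Pro": 8, "M1 Max": 11, "M1 Ultra": 11,
    "M2": 13, "M2 Pro": 13, "M2 Max": 16, "M2 Ultra": 16,
    "M3": 17, "M3 Pro": 17, "M3 Max": 22,
    "M4": 25, "M4 Pro": 25, "M4 Max": 32,
    "M5 Pro": 40,
]

/// Wall-clock estimate in minutes: source duration divided by the chip's
/// real-time multiple. Assumes steady-state, thermally nominal conditions.
func estimatedTranscriptionMinutes(sourceMinutes: Double, chip: String) -> Double? {
    guard let multiple = throughput[chip] else { return nil }
    return sourceMinutes / multiple
}

// A 60-minute podcast on an M2 Pro: 60 / 13 ≈ 4.6 minutes.
if let minutes = estimatedTranscriptionMinutes(sourceMinutes: 60, chip: "M2 Pro") {
    let whole = Int(minutes)
    let seconds = Int(((minutes - Double(whole)) * 60).rounded())
    print("\(whole):\(String(format: "%02d", seconds))")  // "4:37"
}
```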
SpeechAnalyzer bonus: on macOS 26+, enable the system framework and expect roughly 40% higher throughput. An M4 Max that runs WhisperKit at 32× effectively runs SpeechAnalyzer at ~45×.
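In the sketch above, that bonus is a single multiplier applied to throughput, not to elapsed time; the 1.4 factor here is simply the ~40% figure from this section:

```swift
// Hypothetical: apply the ~40% SpeechAnalyzer bonus as a throughput multiplier.
let whisperKitMultiple = 32.0                          // M4 Max, from the table
let speechAnalyzerMultiple = whisperKitMultiple * 1.4  // ≈ 45× real time
print(240.0 / speechAnalyzerMultiple)                  // 4-hour source: ≈ 5.4 min
```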
Thermal note: fanless hardware (MacBook Air, iPad Pro) throttles under sustained transcription load. A 4-hour podcast on an M1 Air will take longer than the table suggests once the chip hits 80°C.
Want the full cost comparison vs cloud transcription APIs? Read "Why we picked WhisperKit over OpenAI's API."