Transcription time estimator

Expected wall-clock time to transcribe on-device using WhisperKit Large v3 Turbo (or SpeechAnalyzer on macOS 26+, which is roughly 40% faster still). Figures assume steady-state, thermally nominal conditions.

| Chip | 15 min source | 30 min source | 60 min source | 120 min source | 240 min source | Throughput |
|---|---|---|---|---|---|---|
| M1 / M1 Pro | 2 min | 4 min | 8 min | 15 min | 30 min | 8× |
| M1 Max / Ultra | 1 min | 3 min | 5 min | 11 min | 22 min | 11× |
| M2 / M2 Pro | 1 min | 2 min | 5 min | 9 min | 18 min | 13× |
| M2 Max / Ultra | 1 min | 2 min | 4 min | 8 min | 15 min | 16× |
| M3 / M3 Pro | 1 min | 2 min | 4 min | 7 min | 14 min | 17× |
| M3 Max | 1 min | 1 min | 3 min | 5 min | 11 min | 22× |
| M4 / M4 Pro | 1 min | 1 min | 2 min | 5 min | 10 min | 25× |
| M4 Max | <1 min | 1 min | 2 min | 4 min | 8 min | 32× |
| M5 Pro | <1 min | 1 min | 2 min | 3 min | 6 min | 40× |

Reading this table: a 60-minute podcast on an M2 Pro takes about 4 minutes 37 seconds, because the chip transcribes 13 hours of audio per real-time hour (60 ÷ 13 ≈ 4.6 minutes).
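The table's arithmetic can be sketched as a tiny estimator: divide the source duration by the chip's real-time throughput multiple. This is an illustrative helper, not part of WhisperKit; the `THROUGHPUT` values are copied from the table above.

```python
# Throughput multiples from the table above (WhisperKit Large v3 Turbo).
THROUGHPUT = {
    "M1": 8, "M1 Max": 11, "M2": 13, "M2 Max": 16,
    "M3": 17, "M3 Max": 22, "M4": 25, "M4 Max": 32, "M5 Pro": 40,
}

def estimate(source_minutes: float, throughput: float) -> str:
    """Estimated wall-clock transcription time, formatted m:ss."""
    total_seconds = round(source_minutes * 60 / throughput)
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"

print(estimate(60, THROUGHPUT["M2"]))   # 60-minute podcast on an M2 Pro -> 4:37
print(estimate(240, THROUGHPUT["M1"]))  # 4-hour podcast on an M1 -> 30:00
```

The same formula reproduces every cell in the table once the result is rounded to whole minutes.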

SpeechAnalyzer bonus: on macOS 26+, enable the system framework and expect roughly 40% higher throughput. An M4 Max that runs WhisperKit at 32× effectively runs SpeechAnalyzer at ~45×.
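The ~45× figure is just the table value scaled by the assumed 40% speedup; a quick sanity check:

```python
# Assumed ~40% speedup for SpeechAnalyzer over WhisperKit (from the text).
whisperkit_multiple = 32                              # M4 Max, from the table
speechanalyzer_multiple = whisperkit_multiple * 1.4   # 44.8
print(f"~{speechanalyzer_multiple:.0f}x")             # prints "~45x"
```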

Thermal note: fanless devices (MacBook Air, iPad Pro) throttle under sustained transcription. A 4-hour podcast on an M1 MacBook Air will take longer than the table suggests once the chip reaches ~80°C.

Want the full cost comparison vs. cloud transcription APIs? Read "Why we picked WhisperKit over OpenAI's API."