FWIW, AssemblyAI has great transcript quality in my experience, and they support...

BasilPH · on March 1, 2023

We're using AssemblyAI too, and I agree that their transcription quality is good. But as soon as Whisper supports word-level timestamps, I think we'll seriously consider switching, as the price difference is large ($0.36 per hour vs $0.90 per hour).
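To put the gap in numbers, here's a rough sketch of the arithmetic. It assumes, from context, that the $0.36 rate is Whisper and the $0.90 rate is AssemblyAI; the function names and the 1,000-hour volume are made up for illustration.

```python
# Hourly rates quoted in the comment above. Which rate belongs to which
# service is an assumption read from context, not stated explicitly.
WHISPER_PER_HOUR = 0.36
ASSEMBLYAI_PER_HOUR = 0.90


def savings(hours: float) -> float:
    """Dollars saved by transcribing `hours` of audio at the cheaper rate."""
    return round((ASSEMBLYAI_PER_HOUR - WHISPER_PER_HOUR) * hours, 2)


def price_ratio() -> float:
    """How many times more expensive the pricier rate is."""
    return round(ASSEMBLYAI_PER_HOUR / WHISPER_PER_HOUR, 2)


if __name__ == "__main__":
    hours = 1000  # hypothetical monthly volume
    print(f"Savings over {hours} hours: ${savings(hours):.2f}")
    print(f"Price ratio: {price_ratio()}x")
```

At that hypothetical volume the difference comes to a few hundred dollars a month, so whether it matters depends heavily on how much audio you process.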

sebzim4500 · on March 1, 2023

Both of those prices strike me as quite high, given that Whisper can be run relatively quickly on commodity hardware. It's not like the bandwidth is significant either; it's just audio.

dgacmu · on March 2, 2023

It's pretty great from my perspective. I've been creating little supplemental ~10 minute videos for my class (using Descript; I should probably switch to OBS), and the built-in transcription is both wonderful (that it has it at all and is easy to fix) and horrible (the number of errors is very high). I'd happily pay a dime to have a higher quality starting transcription that saves me 5 minutes of fixing...

graderjs · on March 2, 2023

Try my app: https://apps.apple.com/app/wisprnote/id1671480366

It has great quality transcription from video and audio (English only, sorry if that's not you!). It uses Whisper.cpp plus VAD to skip silent / non-speech sections, which normally introduce errors. Give it a try and let me know what you think! :)
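For anyone curious what "VAD to skip silent sections" can look like, here's a minimal sketch of a simple energy-threshold detector. This is not how the app above or Whisper.cpp does it; real VADs are usually model-based. The function name, frame length, and threshold are all assumptions chosen for illustration.

```python
from math import sqrt


def speech_segments(samples, frame_len=160, threshold=0.02):
    """Return (start, end) sample-index pairs for runs of loud frames.

    A frame counts as speech when its RMS energy exceeds `threshold`.
    `samples` is a list of floats in [-1.0, 1.0]; both defaults are
    illustrative guesses, not tuned values.
    """
    segments = []
    start = None  # index where the current loud run began, if any
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = sqrt(sum(x * x for x in frame) / len(frame))
        if rms > threshold:
            if start is None:
                start = i
        elif start is not None:
            # Loud run just ended at the start of this quiet frame.
            segments.append((start, i))
            start = None
    if start is not None:
        # Audio ended while still inside a loud run.
        segments.append((start, len(samples)))
    return segments


if __name__ == "__main__":
    # Synthetic audio: two quiet frames, two loud frames, one quiet frame.
    audio = [0.0] * 320 + [0.5] * 320 + [0.0] * 160
    print(speech_segments(audio))
```

Only the returned segments would then be handed to the transcriber, so it never sees the silent stretches.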

ronyfadel · on March 2, 2023

A plug here, but check out https://vidcap.app/

It’s based on a fine-tuned Whisper, and you’d get unlimited transcriptions for $4.99/month.

graderjs · on March 2, 2023

Why do you need word-level timestamps? I don't understand what that's for...