I haven’t used faster-whisper so I can’t compare performance, but whisper.cpp does support CUDA via cuBLAS, and it’s noticeably faster than the CPU version. I used it earlier this year to generate subtitles for six seasons of an old TV show I backed up from DVD, since the discs didn’t include subtitles.
FWIW, decent acceleration works on any AVX2-capable CPU. I get realtime speed for everything but the large models on a recent Ryzen system. Apple silicon is good, but not as special as folks think!
If you have NVIDIA hardware, the CTranslate2-based faster-whisper is very, very fast: https://github.com/guillaumekln/faster-whisper
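For anyone who wants to try the subtitle workflow mentioned above with faster-whisper, here is a rough sketch rather than a tested recipe. It assumes the `faster-whisper` package is installed and a CUDA-capable GPU is available. The helper names (`format_srt_timestamp`, `segments_to_srt`, `transcribe_to_srt`), the file names, and the choice of `large-v2` with `float16` are my own assumptions, not something from the posts above.

```python
def format_srt_timestamp(seconds):
    """Convert seconds (float) to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from an iterable of (start, end, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, model_size="large-v2",
                      device="cuda", compute_type="float16"):
    """Transcribe one file with faster-whisper and return SRT text.

    The defaults are assumptions; use device="cpu" and
    compute_type="int8" on a machine without an NVIDIA GPU.
    """
    # Imported here so the SRT helpers work without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    segments, _info = model.transcribe(audio_path)
    # transcribe() yields segments lazily; each has .start, .end and .text.
    return segments_to_srt((s.start, s.end, s.text) for s in segments)


if __name__ == "__main__":
    # "episode01.wav" is a placeholder file name.
    with open("episode01.srt", "w", encoding="utf-8") as fh:
        fh.write(transcribe_to_srt("episode01.wav"))
```

Looping this over a directory of episodes would cover the multi-season case; whether it beats whisper.cpp with cuBLAS on your hardware is something you would have to measure yourself.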