In the same vein: most (all?) YouTube LLM integrations just scrape the transcript. I -think- Google's AI Studio does more, but I'm unsure.
I mean, I get it, bulk video processing would be crazy expensive, but at least mention you're only analyzing the transcript, especially if you're a paid product.
Whisper does do speech-to-text, but yes, nearly all just read off the subtitles. There's a video by f4mi on YouTube where she tricked the summarizing bots with off-screen captions filled with nonsense.