
I’ve found the experience pretty underwhelming so far. Maybe they’re under heavy load right now, but nearly every time I try to use it, it takes about 20 seconds before it automatically switches from 2.5 Pro to 2.5 Flash due to delays. Unfortunately, the Flash output just isn’t good enough.





I think it was the Anthropic guys on Dwarkesh's podcast, though it could have been any of the other tech podcasts with a big-name AI guest. Either way, they were talking about how orgs need to make big decisions about compute allocation.

If you need to do research, pre-training, RLHF, and inference for 5-10 different models across 20 different products, how do you optimally allocate your very finite compute? Weight toward research and training for better future models, or toward serving output for happier consumers in the moment?

It would make sense that every project in DeepMind is in a constant war for TPU cycles.
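The trade-off above is easy to picture as a toy model: a fixed chip budget split proportionally by priority weight. This is purely illustrative (the function, workload names, and numbers are all hypothetical, not anything any lab actually does):

```python
# Toy sketch (hypothetical, not any lab's real policy): split a fixed
# TPU budget across workloads in proportion to each workload's weight.

def allocate_compute(total_chips: int, weights: dict[str, float]) -> dict[str, int]:
    """Allocate total_chips proportionally to each workload's weight."""
    total_weight = sum(weights.values())
    alloc = {k: int(total_chips * w / total_weight) for k, w in weights.items()}
    # Hand any chips lost to integer rounding to the highest-weight workload.
    leftover = total_chips - sum(alloc.values())
    top = max(weights, key=weights.get)
    alloc[top] += leftover
    return alloc

# "Weighting toward future models vs. current users" is just turning a knob:
research_heavy = allocate_compute(10_000, {"pretraining": 5, "rlhf": 2, "inference": 3})
serving_heavy = allocate_compute(10_000, {"pretraining": 2, "rlhf": 1, "inference": 7})
```

With the research-heavy weights, half the budget goes to pre-training; flip the weights and inference gets 70% instead, which is the whole tension in one line.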




