
Normally, I don't think 1000 tokens/s is that much more useful than 50 tokens/s.

However, given that chain-of-thought (CoT) reasoning makes models a lot smarter, I think Cerebras chips will be in huge demand from now on. You can fit a lot more CoT runs into the same amount of time when inference is 20x faster.
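To put rough numbers on that, here is a back-of-the-envelope sketch. The 50 and 1000 tokens/s figures are the ones from this comment; the 2000 tokens per CoT run and the 60-second budget are made-up assumptions purely for illustration, and the function name is hypothetical.

```python
def runs_within_budget(tokens_per_second: int, tokens_per_run: int, budget_seconds: int) -> int:
    """Number of complete, sequential CoT runs that fit in a time budget."""
    # Total tokens that can be generated in the budget, divided by the tokens
    # one run needs. Integer division drops any partially finished run.
    return (tokens_per_second * budget_seconds) // tokens_per_run


# Assumed values (not from any benchmark): 2000 tokens per run, 60 s budget.
TOKENS_PER_RUN = 2000
BUDGET_SECONDS = 60

slow = runs_within_budget(50, TOKENS_PER_RUN, BUDGET_SECONDS)
fast = runs_within_budget(1000, TOKENS_PER_RUN, BUDGET_SECONDS)
print(f"50 tok/s:   {slow} run(s)")
print(f"1000 tok/s: {fast} run(s)")
```

Under those assumptions, the slower rate finishes a single run in a minute while the faster one finishes thirty, which is the gap between one answer and a sizeable pool of samples to vote over or pick from.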

Also, I assume financial applications such as hedge funds would be buying these things in bulk now.




> Also, I assume financial applications such as hedge funds would be buying these things in bulk now.

Could you elaborate? Why would that be?


I'm assuming hedge funds use LLMs to extract information from company news and SEC filings as quickly as possible, then make trading decisions based on it. Faster inference would be a huge advantage there.



