
Apparently Google is capable of doing cool things. DeepMind’s speculative sampling achieves 2–2.5x decoding speedups for LLMs. That brings inference cost down significantly without degrading output quality.
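For anyone wondering why quality doesn't degrade, here is a rough sketch of the accept/reject step at the core of speculative sampling, as I understand it. A cheap draft model proposes a token, the large target model accepts it with probability min(1, p/q), and on rejection a token is resampled from the normalized residual max(0, p - q). The function names and the toy three-token distributions are my own assumptions for illustration, not DeepMind's code, and a real decoder drafts several tokens at once and verifies them in a single target-model pass.

```python
import random


def speculative_step(p, q, rng):
    """One accept/reject step over a shared vocabulary.

    p: target-model probabilities, q: draft-model probabilities
    (hypothetical toy lists; real models produce these per position).
    Returns (token, accepted).
    """
    vocab = range(len(p))
    # The draft model proposes a token.
    x = rng.choices(vocab, weights=q)[0]
    # Accept the proposal with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    # On rejection, resample from the normalized residual max(0, p - q).
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    return rng.choices(vocab, weights=residual)[0], False


def empirical_distribution(p, q, n, seed=0):
    """Run n independent steps and return observed token frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(p)
    for _ in range(n):
        token, _accepted = speculative_step(p, q, rng)
        counts[token] += 1
    return [c / n for c in counts]


if __name__ == "__main__":
    target = [0.6, 0.3, 0.1]  # toy target distribution (assumed)
    draft = [0.3, 0.3, 0.4]   # toy draft distribution (assumed)
    print(empirical_distribution(target, draft, 50_000))
```

The observed frequencies should land close to the target distribution even though every proposal comes from the draft model. That is the sense in which the speedup comes without a quality loss: the output is distributed as if the target model had sampled it directly.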

