Hacker News | AlexCoventry's comments

Mixture-of-Experts models benefit from economies of scale: they process queries in parallel, and different queries can be expected to hit different experts at a given layer. This leads to higher utilization of GPU resources. So unless your application is already getting a lot of use, you're probably under-utilizing your hardware.
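To put rough numbers on the utilization point, here is a toy sketch. It assumes each token independently picks `top_k` distinct experts uniformly at random, which real learned routers do not do (they are often imbalanced), and the function names and the 8-expert, top-2 configuration are made up for illustration. It estimates how many experts in a layer get any work at a given batch size, and roughly how many tokens share each expert that does.

```python
# Toy model of MoE routing in one layer. Assumption (not from any real router):
# every token independently selects top_k distinct experts uniformly at random.


def expected_active_fraction(num_experts: int, top_k: int, batch_tokens: int) -> float:
    """Expected fraction of experts that receive at least one token."""
    # Probability that a single token does NOT select a given expert.
    p_miss = 1.0 - top_k / num_experts
    # The expert stays idle only if every token in the batch misses it.
    return 1.0 - p_miss ** batch_tokens


def tokens_per_active_expert(num_experts: int, top_k: int, batch_tokens: int) -> float:
    """Rough average number of tokens sharing each expert that has work.

    This is a ratio of expectations, so treat it as an approximation.
    """
    active_experts = num_experts * expected_active_fraction(
        num_experts, top_k, batch_tokens
    )
    return batch_tokens * top_k / active_experts


if __name__ == "__main__":
    # Hypothetical configuration: 8 experts, top-2 routing.
    for batch in (1, 4, 16, 64, 256):
        frac = expected_active_fraction(8, 2, batch)
        load = tokens_per_active_expert(8, 2, batch)
        print(f"batch={batch:4d}  active={frac:.3f}  tokens/expert={load:.1f}")
```

Under these assumptions, a single token leaves most experts idle, and the few it touches each have their weights loaded to serve just that one token. With a large batch nearly every expert is busy and each weight load is amortized over many tokens, which is where the economies of scale come from.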

Interestingly, I can't get ChatGPT to help me find a video showing how to disable the cellular modem on my 2024 Subaru Crosstrek. Time to do some old-fashioned research, I guess...

https://chatgpt.com/share/692cde57-0930-800e-b45f-7a41ca5c8e...


Who cares about what ChatGPT can't do? It can't make me a sandwich either.

"Process-oriented" verification has been a thing for a while in mathematical reasoning CoT. Google had a paper about it last year [1]. The key term to look for is "Process-reward model." I particularly like RL Tango [2].

[1] https://arxiv.org/abs/2406.06592

[2] https://arxiv.org/abs/2505.15034


I recommend reading his nephew's biography, "Prof". He makes a strong case for why it was probably suicide.


That's partly because they are getting more reliable, though, just as WP did.


I think this is cool, but some performance benchmarks would really help to sell it.


> figuring how to get good product out of them

What have you figured out so far, apart from explicit up-front design?


Why is reading code harder than writing it?


I think it has to do with mental models. If you already know what to write and it is reasonably complex, you'll have a mental model ready and can quickly write it down (now even faster, as LLMs autocomplete 3-4 lines at a time). While reading someone else's code, you have to constantly map the code as written onto the model in your mind, and then also assess quality, security, and other issues.


Yeah, it's exactly this. Having to create a mental model from the code is much harder than having one and just writing it out.


I just tend to find LLM code output extremely easy to read, I guess. It tends to be verbose and do a lot of unnecessary stuff, but I can always get the point easily and edit accordingly.


I'd say just reading your own code from a few years back will be as hard as reading someone else's.


Manicheanimation


He should cite John Ousterhout, IMO. He's clearly influenced by Ousterhout's (excellent) work.


And I did it a few times :) He knows about the article, we talked about it.


Lucky to have an opportunity to chat with him! Did he have any specific feedback on your essay?



Heh, apologies. Should have Ctrl+F'd.

