
A lot of the "science" we do is experimenting on bunches of humans, giving them surveys, and treating the result as objective. How many places can we do much better by surveying a specific AI?

It may not be objective, but at least it's consistent, and it reflects something about the default human position.

For example, there are no good ways of measuring the amount of technical debt in a codebase. It's such a fuzzy question that only subjective measures work. But what if we show the AI one file at a time, ask "Rate, 1-10, the comprehensibility, complexity, and malleability of this code," and then average across the codebase? Then we get a measure of tech debt, which we can compare over time to see whether it's rising or falling. The AI makes subjective measurements consistent.
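For concreteness, a minimal sketch of that loop in Python, assuming the OpenAI SDK and a placeholder model name (my choices, not anything the comment specifies); the prompt, file glob, and JSON output format are likewise illustrative. Pinning the model and using temperature 0 keeps the "judge" as fixed as the provider allows:

    import json
    import statistics
    from pathlib import Path

    from openai import OpenAI  # assumption: OpenAI SDK; any model API would do

    client = OpenAI()
    DIMENSIONS = ("comprehensibility", "complexity", "malleability")

    def rate_file(path: Path) -> dict:
        # One file per request; pinned model and temperature 0 for repeatability.
        prompt = (
            "Rate, 1-10, the comprehensibility, complexity, and malleability "
            "of this code. Reply only with JSON using those three keys.\n\n"
            + path.read_text(errors="replace")
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            temperature=0,
            messages=[{"role": "user", "content": prompt}],
        )
        return json.loads(resp.choices[0].message.content)

    def tech_debt_score(repo: Path) -> dict:
        # Average each dimension across every file, giving one number per axis
        # that can be recomputed later and compared over time.
        ratings = [rate_file(p) for p in repo.rglob("*.py")]
        return {d: statistics.mean(r[d] for r in ratings) for d in DIMENSIONS}

The repeatability caveat is the whole game: as the replies below point out, this only tracks a trend if the model, prompt, and sampling settings stay frozen between runs.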

This essay gives such a cool new idea, while only scratching the surface.



> it reflects something about the default human position

No it doesn't. Nothing that comes out of an LLM reflects anything except the corpus it was trained on and the sampling method used. That's definitionally true, since those are the very things it is a product of.

You get NO subjective or objective insight from asking the AI about "technical debt"; you only get an opaque statistical metric that you can't explain.


If you knew that the model never changed, it might be very helpful, but most of the big providers constantly mess with their models.


Even if you used a local copy of a model, it would still just be a semi-quantitative version of “everyone knows ‹thing-you-don't-have-a-grounded-argument-for›”


Their performance also varies depending on load (concurrent users).


Dear god does it really? That’s very funny.


Why are you surprised? It’s a computational thing, after all.


It’s not that crazy; what’s impressive is the serving architecture you’d need, swapping between differently quantized models and so on, to do that.


The models are the same; it's the surrounding processing, like the number of "thinking" iterations, that gets adjusted.


That only works for LRMs, no? Not for traditional LLM inference.



