
> The core point in this article is that the LLM wants to report _something_, and so it tends to exaggerate. It’s not very good at saying “no” or not as good as a programmer would hope.

umm, it seems to me that the core point is this (from tfa):

     But I would nevertheless like to submit, based off of internal
     benchmarks, and my own and colleagues' perceptions using these models,
     that whatever gains these companies are reporting to the public, they
     are not reflective of economic usefulness or generality.

and then, a couple of lines down from the above statement, we have this:

     So maybe there's no mystery: The AI lab companies are lying, and when
     they improve benchmark results it's because they have seen the answers
     before and are writing them down.


[this went way outside the edit-window and hence a separate comment] imho, the state of varying experiences with LLMs can be aptly summed up in this poem by Mr. Longfellow:

     There was a little girl,
        Who had a little curl,
     Right in the middle of her forehead.
        When she was good,
        She was very good indeed,
     But when she was bad she was horrid.



