Hacker News

> I have encountered other issues where a slight change in phrasing leads to totally different types of output, despite the meaning of the two phrasings being the same to a normal person.

I feel like a human might do the same thing: the words might have slightly different connotations, which makes them think of different ideas. We can't test that with a human because we can't reset them to a previous state, whereas with an LLM you can.
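That "reset" is just reusing the same random seed: with the seed fixed, the same prompt reproduces the same output, so any difference you see must come from the phrasing itself. A minimal sketch of the idea, using a toy weighted sampler as a stand-in for a real model (the function and reply list here are made up for illustration, not any real LLM API):

```python
import hashlib
import random

REPLIES = ["a list of steps", "a short paragraph", "a code sketch", "a table"]

def sample_reply(prompt: str, seed: int = 0) -> str:
    """Toy stand-in for an LLM: a weighted random choice of reply styles.
    Seeding the RNG from (seed, prompt) makes generation repeatable."""
    digest = hashlib.sha256(f"{seed}:{prompt}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # Weighted choice stands in for token-by-token sampling.
    return rng.choices(REPLIES, weights=[4, 3, 2, 1])[0]

# "Resetting the model" = reusing the same seed: identical output every time.
assert sample_reply("summarize the report", 42) == sample_reply("summarize the report", 42)

# A paraphrase re-rolls the sampling path even with the same seed, so the
# reply style can flip while nothing else changed.
print(sample_reply("summarize the report", 42))
print(sample_reply("give me a summary of the report", 42))
```

With a human you can't rerun the "same seed" twice, which is the asymmetry the comment is pointing at.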



You can do statistical analysis of groups of humans, look for divergences, and then construct a test covering all of the divergent criteria. A human might respond out of the norm on a few questions, but would fit closer to the human group overall.
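One simple way to operationalize that: for each question, look at how common the respondent's answer is within the group, flag the rare ones, and only call someone an outlier if they diverge on a large fraction of questions rather than just a few. A toy sketch (the data, thresholds, and function names are all invented for illustration):

```python
# Toy group statistics: for each question, answer -> observed frequency.
group = [
    {"yes": 0.8, "no": 0.2},
    {"cat": 0.6, "dog": 0.4},
    {"red": 0.5, "blue": 0.5},
    {"tea": 0.9, "coffee": 0.1},
]

def divergence_rate(answers, group, rare=0.15):
    """Fraction of questions where the answer is rare for the group."""
    flagged = sum(1 for a, dist in zip(answers, group) if dist.get(a, 0.0) < rare)
    return flagged / len(answers)

def fits_group(answers, group, max_rate=0.25):
    """Tolerate a few out-of-norm answers; reject broad divergence."""
    return divergence_rate(answers, group) <= max_rate

human = ["yes", "dog", "blue", "tea"]      # ordinary answers throughout
outlier = ["no", "x", "green", "coffee"]   # rare or unseen on most questions

print(fits_group(human, group))    # True
print(fits_group(outlier, group))  # False
```

The per-answer rarity cutoff and the overall divergence budget are the knobs: loosen them and quirky-but-human respondents pass; tighten them and only answers near the group mode do.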



