
That's not a bad idea, but it's very expensive, and you end up with a pretty useless model in most regards.

A lot of the trusted benchmarks today are somewhat dynamic or have a hidden set.






That could happen. One would have to accept that risk to take the approach. However, if it were trained on legally obtained data, there might be a market for it among those who don't want to risk copyright infringement. Think FairlyTrained.org.

"somewhat dynamic or have a hidden set"

Are there example inputs and outputs for the dynamic ones online? And are the hidden sets online? (I haven't looked at benchmark internals in a while.)



