
I can't fathom why they didn't just require the models' training data itself to be made available. Sure, you might need to fork over some cash so they can ship you hard drives, but surely being audited by someone, anyone, is better than not being audited at all.



Training data may be licensed from third parties which don't allow redistribution.


Training data for a medical diagnosis model would likely include enough info (age, sex, zip code, descriptions) to de-anonymize some participants. I'm not sure what the answer should be, but I'm uncomfortable with medical training data being provided freely to the world.


If you give up your training data, you don’t have a product anymore.



