The Importance of a Data Acquisition Team (laszlo.substack.com)
1 point by xLaszlo on Oct 19, 2020 | 1 comment



Organising labelling is a core part of productionised ML, and the Domain Team has the most knowledge to do it. But are they the best suited for the job?

Our experience shows that, because incentives differ between production and offline labelling, the Domain Team might create a biased (subjective) dataset in Human-In-The-Loop settings.

Adding an extra team, trained by the Domain Team but able to influence end-to-end performance only through the deployed model, helped keep the training dataset objective while improving the productionised pipeline and supporting the Domain Team.

First-time HN poster, so please share your recommendations in the comments. We are working on a "Machine Learning Product Manual" ebook about best practices for productionising and operating ML from a product perspective. Follow me on Twitter at https://twitter.com/xLaszlo for updates.



