Data diversity: Preserving variety in data sets should aid machine learning (news.mit.edu)
50 points by upen on Dec 18, 2016 | 5 comments



I believe they are using MCMC (Markov chain Monte Carlo) at the core. This might be useful if you are wondering what it is: http://mlwhiz.com/blog/2015/08/19/MCMC_Algorithms_Beta_Distr...
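For anyone who wants the idea in code first, here is a minimal sketch of MCMC, assuming a plain random-walk Metropolis sampler targeting a Beta distribution. The function names, step size, and burn-in length are illustrative choices of mine, not taken from the article or the linked post.

```python
import math
import random


def beta_log_density(x, a, b):
    """Log of the unnormalized Beta(a, b) density; -inf outside (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return -math.inf
    return (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def metropolis_beta(a, b, n_samples, step=0.2, burn_in=1000, seed=0):
    """Draw correlated samples from Beta(a, b) with random-walk Metropolis.

    step, burn_in and seed are hypothetical tuning knobs for this sketch.
    """
    rng = random.Random(seed)
    x = 0.5  # start inside the support
    log_p = beta_log_density(x, a, b)
    samples = []
    for i in range(n_samples + burn_in):
        # Symmetric Gaussian proposal around the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = beta_log_density(proposal, a, b)
        # Accept with probability min(1, p_new / p_old).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        if i >= burn_in:
            samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis_beta(2.0, 5.0, 20000)
    print("sample mean:", sum(draws) / len(draws))  # Beta(2, 5) has mean 2/7
```

The only thing the sampler needs is the density up to a constant, which is why this style of method is attractive when the normalizer is expensive to compute.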


I think this is the paper, which oddly isn't linked in the article:

https://arxiv.org/abs/1509.01618


That seems to be a precursor to the work mentioned in the article. This is the one that was presented at NIPS this year: https://papers.nips.cc/paper/6182-fast-mixing-markov-chains-...


The article has a link in a sidebar:

https://arxiv.org/abs/1608.01008v2


Is there a more detailed paper describing the algorithm? The description is very vague in the article. When they pick the two points, is there an evaluation on how much "diversity" increases w/r/t each of the three possible operations, and that's how they choose?

edit: thanks @q_revert for linking the paper
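To make the question concrete, here is a rough sketch of a generic Metropolis chain over subsets with add, remove, and swap moves. This is not the paper's algorithm. It assumes that "diversity" of a subset is scored by the determinant of the matching submatrix of a similarity kernel, and every name here (`diversity`, `sample_diverse_subset`, the 50/50 split between move types) is hypothetical. In this sketch a single move is proposed at random and then accepted or rejected by the ratio of scores, rather than all three moves being evaluated and compared.

```python
import random


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    n = len(matrix)
    if n == 0:
        return 1.0  # convention: the empty matrix has determinant 1
    a = [row[:] for row in matrix]
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def diversity(kernel, subset):
    """Assumed diversity score: determinant of the kernel restricted to subset."""
    idx = sorted(subset)
    return det([[kernel[i][j] for j in idx] for i in idx])


def sample_diverse_subset(kernel, n_steps=5000, seed=0):
    """Run the chain and return the final subset of item indices."""
    rng = random.Random(seed)
    n = len(kernel)
    current = set()
    score = diversity(kernel, current)
    for _ in range(n_steps):
        if rng.random() < 0.5:
            # Toggle one item: an "add" if it is outside, a "remove" if inside.
            proposal = current ^ {rng.randrange(n)}
        else:
            # Swap one item inside the subset for one outside it.
            outside = [i for i in range(n) if i not in current]
            if not current or not outside:
                continue
            leaving = rng.choice(sorted(current))
            proposal = (current - {leaving}) | {rng.choice(outside)}
        new_score = diversity(kernel, proposal)
        # Both proposals are symmetric, so the plain Metropolis ratio applies.
        if rng.random() < new_score / score:
            current, score = proposal, new_score
    return current
```

With a kernel in which two items are exact duplicates, any subset holding both scores zero, so the chain never accepts a move into such a subset. That is the sense in which a determinant-based score rewards variety.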



