Data diversity: Preserving variety in data sets should aid machine learning

rokosbasilisk · on Dec 18, 2016

I believe they are using MCMC (Markov chain Monte Carlo) at the core. This might be useful if you are wondering what it is: http://mlwhiz.com/blog/2015/08/19/MCMC_Algorithms_Beta_Distr...
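
For anyone who wants the gist without clicking through, here is a minimal sketch of a random-walk Metropolis sampler, the simplest MCMC variant. It targets a Beta(a, b) distribution because the link above appears (going by its URL) to use a Beta example. The function names, step size, and burn-in length are my own arbitrary choices, not taken from that post or from the article.

```python
import math
import random


def beta_log_density(x, a, b):
    """Unnormalized log density of Beta(a, b); -inf outside (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return float("-inf")
    return (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def metropolis_beta(a, b, n_samples, step=0.2, burn_in=1000, seed=0):
    """Random-walk Metropolis chain whose stationary law is Beta(a, b).

    step and burn_in are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    x = 0.5  # any starting point inside (0, 1) works
    samples = []
    for i in range(n_samples + burn_in):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_ratio = beta_log_density(proposal, a, b) - beta_log_density(x, a, b)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        if i >= burn_in:
            samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis_beta(2.0, 5.0, 20000)
    # The exact mean of Beta(2, 5) is 2/7; the sample mean should be close.
    print(sum(draws) / len(draws))
```

The point is that you only need the target density up to a constant: the normalizer cancels in the acceptance ratio, which is why this kind of sampler is attractive when the normalizer is expensive.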

q_revert · on Dec 18, 2016

I think this is the paper, which oddly isn't linked in the article:

https://arxiv.org/abs/1509.01618

sidrajaram · on Dec 18, 2016

That seems to be a precursor to the work mentioned in the article. This is the one that was presented at NIPS this year: https://papers.nips.cc/paper/6182-fast-mixing-markov-chains-...

ultrafilter · on Dec 18, 2016

The article has a link in a sidebar:

https://arxiv.org/abs/1608.01008v2

opaqe · on Dec 18, 2016

Is there a more detailed paper describing the algorithm? The description in the article is very vague. When they pick the two points, is there an evaluation of how much "diversity" increases w/r/t each of the three possible operations, and is that how they choose?

edit: thanks @q_revert for linking the paper
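
To make the question concrete, here is a rough sketch of how a Metropolis-style chain would typically handle it. This is not taken from the linked papers. It assumes the "diversity" score of a subset S is det(L_S) for a similarity kernel L (a determinantal point process), and that the three operations are add, remove, and swap. All names are hypothetical. In this setup the chain does not score all three operations and pick the best one. It proposes a single random move and accepts it with probability min(1, det(L_T) / det(L_S)).

```python
import random


def det(matrix):
    """Determinant by Gaussian elimination; the empty matrix has det 1."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def subset_det(L, subset):
    """Assumed diversity score: determinant of L restricted to subset."""
    idx = sorted(subset)
    return det([[L[i][j] for j in idx] for i in idx])


def sample_dpp_chain(L, n_steps, seed=0):
    """Add/remove/swap Metropolis chain targeting pi(S) proportional to det(L_S)."""
    rng = random.Random(seed)
    n = len(L)
    current = frozenset()  # the empty set has score 1, a valid start
    current_score = 1.0
    visited = []
    for _ in range(n_steps):
        if rng.random() < 0.5:
            # Flip one item: an add if it is absent, a remove if present.
            proposal = current ^ {rng.randrange(n)}
        else:
            # Swap one selected item for one unselected item, if possible.
            inside = sorted(current)
            outside = sorted(set(range(n)) - current)
            if inside and outside:
                proposal = (current - {rng.choice(inside)}) | {rng.choice(outside)}
            else:
                proposal = current
        score = subset_det(L, proposal)
        # Both proposals are symmetric, so the plain score ratio suffices.
        if rng.random() < min(1.0, score / current_score):
            current, current_score = frozenset(proposal), score
        visited.append(current)
    return visited
```

With a kernel where items 0 and 1 are near-duplicates, e.g. `[[1.0, 0.99, 0.0], [0.99, 1.0, 0.0], [0.0, 0.0, 1.0]]`, the chain should rarely hold both at once, since det of that pair is about 0.02 versus 1 for a dissimilar pair.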