
My point is that they all generalize better from larger datasets. "Large" is relative, and some techniques need less data than others: linear regression, for instance, can work quite well with far less data than a neural net. It just depends on the complexity of the problem.
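As a rough illustration (my own toy example, nothing more): when the underlying relationship really is linear, an ordinary least-squares fit from a handful of points already recovers it well, whereas a high-capacity model would typically need far more data to do the same.

    import numpy as np

    rng = np.random.default_rng(0)
    x_train = np.linspace(0, 1, 8)                       # only 8 training points
    y_train = 3.0 * x_train + 1.0 + rng.normal(0, 0.05, x_train.size)

    # Ordinary least squares via numpy; returns [slope, intercept].
    slope, intercept = np.polyfit(x_train, y_train, deg=1)

    x_test = np.linspace(0, 2, 5)                        # includes extrapolation
    y_pred = slope * x_test + intercept
    print(np.max(np.abs(y_pred - (3.0 * x_test + 1.0))))  # error stays small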



>> My point is that they all generalize better from larger datasets.

Like I say, this is not the case. There are learning algorithms that generalise so well from few data that their performance can improve only marginally with increasing amounts of data, or not at all.

I appreciate that you probably have no idea what I'm talking about. I certainly don't mean linear regression.


> Like I say, this is not the case. There are learning algorithms that generalise so well from few data that their performance can improve only marginally with increasing amounts of data, or not at all.

Erm, no. Not unless they are solving the problem perfectly.

> I appreciate that you probably have no idea what I'm talking about. I certainly don't mean linear regression.

I work in the field. I'm quite certain I'm familiar with whatever it is that you think you're talking about.

The category of algorithms that attempt to learn from few examples is called 'one-shot learning'. It's usually discussed in the context of image classification, but it applies equally well elsewhere. These algorithms still learn better from more data.
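To make the "still learn better from more data" point concrete, here's a toy sketch (purely illustrative, not any particular one-shot method): a nearest-neighbour classifier in a fixed feature space, evaluated as the number of labelled examples per class grows.

    import numpy as np

    rng = np.random.default_rng(1)

    def sample_class(mean, n):
        # Draw n 2-D points around the given class mean.
        return rng.normal(loc=mean, scale=1.0, size=(n, 2))

    def nn_accuracy(n_support):
        # Labelled "support" examples: n_support per class.
        support = np.vstack([sample_class((0.0, 0.0), n_support),
                             sample_class((2.0, 2.0), n_support)])
        labels = np.array([0] * n_support + [1] * n_support)
        # Held-out queries with known ground truth.
        queries = np.vstack([sample_class((0.0, 0.0), 200),
                             sample_class((2.0, 2.0), 200)])
        truth = np.array([0] * 200 + [1] * 200)
        # Classify each query by its nearest labelled example.
        dists = np.linalg.norm(queries[:, None, :] - support[None, :, :], axis=2)
        preds = labels[np.argmin(dists, axis=1)]
        return (preds == truth).mean()

    for k in (1, 5, 20):
        print(k, nn_accuracy(k))   # accuracy tends to rise with more examples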

Do feel free to share an example of an algorithm that generalizes better from less data. I'll wait.


>> Erm, no. Not unless they are solving the problem perfectly.

Well, yes, that's what I mean.

I gave an example here a while ago, of how a Meta-Interpretive Learning algorithm, Metagol, can learn the aⁿbⁿ grammar perfectly from 4 positive examples:

https://news.ycombinator.com/item?id=17837055

That's typical of Metagol, as well as other algorithms in Inductive Logic Programming, the broader sub-field of machine learning that MIL belongs to.
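For a flavour of why a handful of positive examples can suffice, here is a deliberately simplified sketch (not Metagol, and not the program from the linked comment): with a small, strongly biased hypothesis space, picking the most specific hypothesis consistent with four positive examples already pins down aⁿbⁿ. The bias does the work, so further examples would add little.

    import re

    def is_anbn(s):
        # Target concept: n >= 1 'a's followed by exactly n 'b's.
        n = len(s) // 2
        return n >= 1 and s == "a" * n + "b" * n

    # A tiny, hand-picked hypothesis space, ordered most specific first.
    HYPOTHESES = [
        ("a^n b^n",               is_anbn),
        ("a+ b+ (any counts)",    lambda s: re.fullmatch(r"a+b+", s) is not None),
        ("any string over {a,b}", lambda s: re.fullmatch(r"[ab]*", s) is not None),
    ]

    def learn(positives):
        # Return the most specific hypothesis covering every positive example.
        for name, h in HYPOTHESES:
            if all(h(x) for x in positives):
                return name, h
        return None

    name, h = learn(["ab", "aabb", "aaabbb", "aaaabbbb"])   # 4 positive examples
    print(name)                         # a^n b^n
    print(h("aaaaabbbbb"), h("aabbb"))  # True False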

>> Do feel free to share an example of an algorithm that generalizes better from less data. I'll wait.

To clarify, my claim is that there are algorithms that learn adequately from few data and therefore don't "need" more data. Not that less data is better.

That said, there are theoretical results that suggest that a larger hypothesis space increases the chance of the learner overfitting to noise. So what is really needed in order to improve generalisation is not more data, but more relevant data. Then again, that is the subject of my current PhD so I might just be interpreting everything through the lens of my research (as is typical for PhD students).
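One well-known bound in this spirit (the standard PAC sample-complexity result for a consistent learner over a finite hypothesis space H, in the realizable setting; I'm citing it from memory, not from my own work) is that

    m \geq \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)

examples suffice for error at most epsilon with probability at least 1 - delta, so the data requirement grows with ln|H|.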

You work in the field? What do you do?



