
It remains to be asked: why exactly can't this causal, counterfactual, logical reasoning emerge in a sufficiently scaled-up model trained on sufficiently diverse real-world data?

As far as we can see, the scaling hypothesis (https://www.gwern.net/Scaling-hypothesis) continues to hold, and critics have had to move their goalposts every year or two.




Neural networks, at the end of the day, are still advanced forms of data compression. Since they are Turing-complete, given enough data they can in principle learn anything, but only if there is data for it. What we haven't solved is reasoning without data, i.e. without learning: a network can't take a new problem that never appeared in its dataset and solve it deterministically, even with pretrained weights and the like. I do think we're pretty close, but we haven't found the right way to frame the question and combine the tools we already have. The tools themselves seem to be there: optimizing over the space of programs is possible, learning a symbol space is possible; what's missing is a symbolic representation that is rigorous and practically applicable.
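
As a toy illustration of what "optimizing over the space of programs" can mean (the DSL and the examples below are invented, and real program synthesis is vastly harder), here is a brute-force search for a small program consistent with a few input/output pairs:

    from itertools import product

    # A made-up DSL of primitive functions on integers.
    PRIMS = {
        "x":   lambda x: x,
        "x+1": lambda x: x + 1,
        "x*2": lambda x: x * 2,
        "x*x": lambda x: x * x,
    }

    def compose(names):
        """Build a program by composing primitives left to right."""
        def prog(x):
            for n in names:
                x = PRIMS[n](x)
            return x
        return prog

    def search(examples, max_depth=3):
        """Return the first composition consistent with all (input, output) pairs."""
        for depth in range(1, max_depth + 1):
            for names in product(PRIMS, repeat=depth):
                if all(compose(names)(i) == o for i, o in examples):
                    return names
        return None

    # Target behaviour f(x) = (x + 1) * 2, specified only by examples.
    print(search([(0, 2), (1, 4), (3, 8)]))  # ('x+1', 'x*2')

The search is over discrete programs rather than continuous weights, which is exactly where the deterministic, data-frugal behaviour comes from; making it scale is the unsolved part.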


I do think we underestimate compressionism [1], especially in the practically achievable limit.

Sequence prediction is closely related to optimal compression (toy sketch at the end of this comment), and both basically require the system to model the ever-wider context of the "data-generating process" in ever-finer detail. In the limit, such a system has to start computing a close-enough approximation of the largest data-generating domains known to us: history, societies and persons, discourse and ideas, perhaps even some shadow of our physical reality.

In the practical limit it should boil down to exquisite modeling of the person prompting the AI to do X, given the minimum amount of data possible. Perhaps even the very X you had in mind when you wrote your comment.

1. http://ceur-ws.org/Vol-1419/paper0045.pdf
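
To make that prediction-compression link concrete: under an idealized arithmetic coder each symbol costs about -log2 p(symbol | context) bits, so a better predictor is, quite literally, a better compressor. A toy sketch (the text and both models are made up for illustration):

    import math
    from collections import Counter, defaultdict

    text = "abababababcabababab"

    def codelength(text, prob):
        """Total bits to encode `text` when each symbol costs -log2 p(symbol | context)."""
        return sum(-math.log2(prob(ch, text[:i])) for i, ch in enumerate(text))

    # Model 1: no modelling at all, uniform over the alphabet.
    alphabet = sorted(set(text))
    uniform = lambda ch, ctx: 1.0 / len(alphabet)

    # Model 2: order-1 predictor with add-one smoothing, estimated on the text
    # itself (cheating a little, but enough to show the effect).
    bigrams = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        bigrams[a][b] += 1

    def order1(ch, ctx):
        if not ctx:
            return 1.0 / len(alphabet)
        counts = bigrams[ctx[-1]]
        return (counts[ch] + 1) / (sum(counts.values()) + len(alphabet))

    print("uniform model :", round(codelength(text, uniform), 1), "bits")
    print("order-1 model :", round(codelength(text, order1), 1), "bits")  # far fewer

The better the conditional predictions, the fewer bits the sequence costs, which is why pushing prediction quality eventually forces the model to absorb whatever generated the data.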


Data isn't necessarily a problem for training agents. A sufficiently complex, stochastic environment is effectively a data generator, e.g. AlphaGo Zero.
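
Roughly, that loop is: roll the current policy out in the environment, harvest the trajectories, train on them, repeat. A minimal sketch (the 1-D random-walk environment, the reward, and the random policy are all invented for illustration; AlphaGo Zero's actual self-play is of course far more elaborate):

    import random

    def rollout(policy, length=10):
        """Play one episode; return a list of (state, action, reward) transitions."""
        state, transitions = 0, []
        for _ in range(length):
            action = policy(state)                   # -1 or +1
            noise = random.choice([-1, 0, 1])        # stochastic dynamics
            next_state = state + action + noise
            reward = 1 if next_state > state else 0  # toy reward: move right
            transitions.append((state, action, reward))
            state = next_state
        return transitions

    random_policy = lambda s: random.choice([-1, 1])

    # Endless fresh training data, no fixed dataset required.
    dataset = [t for _ in range(1000) for t in rollout(random_policy)]
    print(len(dataset), dataset[:3])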


Good point. This gets us into the territory of not just "explainable" models, but also the ability to feed "states" into those models in a deterministic way. To my mind that is a merger of statistical and symbolic methods, and we have no way to achieve it today.


Why shouldn't we be able to just prompt for it, if our system models natural language well enough?

...

And anyway, this problem of structured knowledge IO has been more or less solved recently: https://arxiv.org/abs/2110.07178
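
If "just prompting for it" works, the plumbing on our side could be as thin as this sketch (the prompt is made up, and `complete` is a stand-in for whatever LM API you use, not a real library call):

    import json

    def complete(prompt):
        # Stand-in for a real LM call; returns a canned answer so the sketch runs.
        return '[["AlphaGo Zero", "trained_by", "self-play"]]'

    PROMPT = (
        'Extract the facts in the sentence as JSON triples.\n'
        'Sentence: "AlphaGo Zero was trained purely by self-play."\n'
        'Answer (a JSON list of [subject, relation, object] triples):'
    )

    triples = json.loads(complete(PROMPT))
    print(triples)  # [['AlphaGo Zero', 'trained_by', 'self-play']]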



