
You're close, but there’s an important nuance. The process isn't about "learning how to solve problems in general" in the broad sense. It's more specific: the neural network is trained to mimic the step-by-step process demonstrated by humans solving a specific problem.

The distinction is that the software doesn't autonomously derive general problem-solving heuristics from scratch. Instead, it observes examples of how humans solve problems procedurally and uses that to replicate similar reasoning. This is crucial because the step-by-step demonstrations give the model structure and guidance, which is different from learning a generalizable strategy for solving any kind of problem without those examples.

In essence, it's like a neural net learning to follow a recipe by watching a chef cook—rather than inventing its own recipes entirely from first principles.
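
To make that concrete, here is a minimal sketch (plain Python, with a made-up toy example) of what "mimicking the demonstration" looks like as training data: the human-written steps are themselves the prediction target, so the model is rewarded for reproducing the procedure rather than deriving it.

  # Toy sketch: the human demonstration, steps included, becomes the
  # next-token prediction target. The prompt and solution text here are
  # made up for illustration.
  prompt = "Q: A book costs $12 and a pen costs $3. What do 2 books and 1 pen cost?"
  demonstration = (
      "Step 1: two books cost 2 * 12 = 24 dollars. "
      "Step 2: adding one pen gives 24 + 3 = 27 dollars. "
      "Answer: 27"
  )

  # The model never sees an abstract rule like "multiply, then add"; it sees
  # next-token targets carved out of this one worked example.
  tokens = (prompt + " " + demonstration).split()
  training_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

  for context, target in training_pairs[-3:]:
      print("...", " ".join(context[-4:]), "->", repr(target))

The point is just that the structure of the demonstration (the intermediate steps) is part of what gets imitated, not only the final answer.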




Yes, except that I'm not so sure there is a clear distinction between following general instructions and generating new heuristics. It's just a difference in the level of abstraction, and probably not even a discrete one; more like a continuum.

(Current) models may of course lack sufficient training data to act at a meta-level ("be creative problem solvers"), or they may lack representations deep enough to act efficiently in a more creative way. (And those two may or may not be more or less the same thing.)


Up to a point, general instructions can be generated from a collection of specific examples by abstracting away what differs between them, but it is not clear to me that abstraction is all you need to come up with novel methods.

This seems consistent with the main points in this paper: one to-the-point statement of the answer to a factual question is all you need [1], while, if you don't have an example of a chain of reasoning in which all of the parameters are the same as those in the prompt, more than one example will be needed.

The authors write, "We falsify the hypothesis that the correlations are caused by the fact that the reasoning questions are superficially similar to each other, by using a set of control queries that are also superficially similar but do not require any reasoning and repeating the entire experiment. For the control queries we mostly do not observe a correlation." In the examples of control queries that they give, however, this just amounts to embedding the specific answer to the question asked into language that resembles an example of reasoning to a solution (and in the first example, there is very little of the latter). The result, in such cases, is that there is much less correlation than with genuine examples of reasoning to a solution, but it is not yet clear to me how this fact justifies the claim quoted at the start of this paragraph: if the training set contains the answer stated as a fact, is it surprising that the LLM treats it as such?

[1] One caveat: if the answer to a factual question is widely disputed within the training data, there will likely be many to-the-point statements presented as the one correct answer (or, much less likely, I think, a general agreement that no definitive answer can be given). The examples given in Figure 1 are not like this, however, and it would be interesting to know whether the significance of individual documents extends to such cases.


It's exactly how we learn: many examples, then general principles. If you start with general principles, everybody drops out.


Not "exactly" how we learn. Humans learn through a combination of reinforcement learning (which is costly/risky/painful) and through observation of existing patterns and norms.

Better observation-based learning is a less expensive way of improving existing corpus-based approaches than trial and error and participation in an environment.


Except that the careful observation comes late in the curriculum. Children don't learn if you start out with the Stern-Gerlach experiment; they sing the ABCs.


The parent of any young child can tell you that they learn through lots of exploration and reinforcement - often to the worry and chagrin of caregivers. Indeed, much of our job is to guide exploration away from excessively dangerous “research” activities (e.g., locking away cleaning products).


As an ideal parent, you should give your kids access to activities that seem dangerous without actually being all that dangerous.

Kids seem to have an internal dial for their desired level of perceived danger, and they get up to weird stuff if they don't get enough of it.


Crucially, this is what MacIntyre's narrativity thesis is talking about:

If a university professor is giving a lecture on decentralized finance and forks into a recipe for chocolate chip cookies: crack two eggs, add a cup of flour, and fold in brown sugar prior to baking, it would break linearity.

A generalizable strategy for synthesizing LLMs differentiated by their training parameters is a tokenization is isolating data sets and then establishing a lattice in uniformity within the field of technics.


> A generalizable strategy for synthesizing LLMs differentiated by their training parameters is a tokenization is isolating data sets and then establishing a lattice in uniformity within the field of technics.

This comment appears to be incoherent and likely AI-generated text. Let me break down why:

1. While it uses technical-sounding terms related to machine learning (LLMs, tokenization, data sets), the way they're strung together doesn't make logical sense.

2. The grammar is incorrect: "a tokenization is isolating" is not grammatically valid; the sentence structure breaks down in the middle with two "is" statements; and the phrase "establishing a lattice in uniformity within the field of technics" is meaningless jargon.

3. If we try to interpret what it might be attempting to say about LLMs (Large Language Models), the ideas don't connect in any meaningful way. "Synthesizing LLMs differentiated by their training parameters" could be trying to discuss creating different LLMs with varying parameters, but the rest doesn't follow logically.

4. The term "field of technics" is particularly suspicious - while "technics" is a real word, it's rarely used in AI/ML discussions and seems thrown in to sound technical.

This text shows common hallmarks of AI-generated content that's trying to sound technical but lacks real meaning - it uses domain-specific vocabulary but combines it in ways that don't make semantic sense, similar to how AI models can sometimes generate plausible-looking but ultimately meaningless technical text.


And that analysis is also LLM-generated. It's turtles all the way down, folks.


Spoken eerily similarly to how ChatGPT would put it :) https://chatgpt.com/share/674cd11d-a30c-8005-90a3-023d0c9c18...


> In essence, it's like a neural net learning to follow a recipe by watching a chef cook—rather than inventing its own recipes entirely from first principles.

Just like how a chef learns


A chef also learns through trial and error, not just by reading how others have cooked in the past and then copying their motions.

This is exemplified by how altitude has a meaningful impact on cooking but isn't discussed in a given recipe.


A text LLM isn't going to learn by trial and error; it hasn't been given that sort of freedom. RLHF would be the LLM version of trial and error, but it's as if the chef is only allowed to do that for a few days after years of chef school, and from then on he has to stick to what he has already learnt.


Why isn't LLM pre-training based on next token prediction considered "trial and error"? It seems to fit that description pretty well to me.


Pre-training is based on a proxy for the desired output, not the desired output itself. It's not in the form of responses to a prompt, and 1:1 reproduction of copyrighted works in production would be bad.

It’s the difference between a painter copying some work and a painter making an original piece and then getting feedback on it. We consider the second trial and error because the full process is being tested, not just the technique.


A chef doesn't get feedback on his meal after picking up the spoon. He gets feedback when he or somebody else tastes the meal, partway through and at the end.


In reality there is more than one correct answer; LLM pre-training just trains the model to respond the same way the text did.

Imagine if school only marked you correct when you used exactly the same words as the book; that is not "trial and error".
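
For what it's worth, here is a tiny numeric sketch (with made-up probabilities) of why the objective works that way: the cross-entropy loss at each position only credits the exact token that appeared in the training text, so an equally sensible alternative continuation still counts as an error.

  import math

  # Hypothetical model distribution for the next word after
  # "Paris is home to the ___":
  probs = {"Louvre": 0.45, "Eiffel": 0.40, "Seine": 0.10, "banana": 0.05}

  # Loss if the training document happened to say "Eiffel":
  print(-math.log(probs["Eiffel"]))   # ~0.92

  # The same prediction scored against a document that said "Louvre":
  print(-math.log(probs["Louvre"]))   # ~0.80
  # Only the book's exact wording is "correct" at each position.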


I can tell you haven't been in a school in a while. That is actually a pretty accurate description of what schools are like nowadays.


Pretty accurate != always, which is the point.


> it observes

"Observe" implies sentience, which, without question, a neural net simply does not possess. "It" certainly 'records', or more specifically it 'maps', but there is no observer in sight (npi).

> mimic

LLMs do not mimic. The magic is mathematical and happens in high-dimensional space. If there are intrinsic underlying patterns and semantic affinities between process X (used in training) and process Y (used in application), it is very likely that both share proximity, possibly form, in some dimensions of the high-dimensional model.
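
If it helps, here is a rough sketch of what "proximity in the high-dimensional model" means geometrically, with made-up 4-dimensional vectors standing in for the model's far higher-dimensional representations:

  import numpy as np

  def cosine(a, b):
      # Cosine similarity: near 1.0 means the same direction, near 0 means unrelated.
      return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

  process_x = np.array([0.9, 0.1, 0.4, 0.0])   # representation seen in training
  process_y = np.array([0.8, 0.2, 0.5, 0.1])   # representation at application time
  unrelated = np.array([0.0, 0.9, -0.7, 0.3])

  print(cosine(process_x, process_y))  # high: shared "form" in some dimensions
  print(cosine(process_x, unrelated))  # low: little shared structure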


Define "observation". If it's just sensory and information processing then no, it does not require nor simply sentience.


There is a word for that: a 'recording'. There is no observer, thus no observation.



