I saw an LLM having this kind of problem when I was doing some testing a ways back. I asked it to order three fruits from largest to smallest. I think it was orange, blueberry and grapefruit. It could do that easily with a simple prompt. When the prompting included something to the effect of “think step by step”, it would try to talk through the problem and it would usually get it wrong.
How much does this align with how we learn math? We kind of instinctively learn the answers to simple math questions. We can even, at some point, develop an intuition for things like integrals and derivatives. But the moment we are asked to explain why, or worse, provide a proof, things become a lot harder, even though the initial answer may be correct.
I definitely don’t learn math by means of gradient descent.
We could say math is not so much learned as mental models of abstraction are developed. How? We dunno, but what we do know is that we don’t learn by figuring out the common features of all previously seen equations only to guess at them later…
The mind operates on higher and higher levels of abstraction, each building on the last in a fascinating way, very often not with words but with structure and images.
Of course there are people with aphantasia, but I really fail to see how any reasoning happens at a purely language level. Someone on this forum also noted that in order to reason one needs an ontology to facilitate the reasoning process. LLMs don’t do ontologies…
And finally, not least, LLM and ML people in general seem to equate intuition with some sort of biased.random(). Well, intuition is not random, and it is hard to describe in words. So are awe and inspiration. And these ARE part of (a precondition to, fuel for) humanity’s thought process more than we like to admit.
The fact that it (is suggested / we are led to believe / was recently implied) that neurons can be explained as doing something like this at the underlying layer still says little about the process of forming the ontological context needed for any kind of syllogism.
Humans learn skills like basic mathematics by reasoning about their environment and building internal models of problems they’re trying to solve. LLMs do not reason and they cannot model their environment.
>It's not thinking
>it compressed the internet into a clever, lossy format with nice interface and it retrieves stuff from there.
Humans do both, so why can't LLMs?
>Chain of thought is like trying to improve JPG quality by re-compressing it several times. If it's not there it's not there.
More like pulling out a deep-fried meme, looking for context, then searching Google Images until you find the most "original" JPG representation with the fewest artifacts.
There is more data it can add confidently; it just has to re-think the problem with a renewed perspective and a higher-level, abstracted-away context/attention mechanism.
> Chain of thought is like trying to improve JPG quality by re-compressing it several times. If it's not there it's not there.
Empirically speaking, I have a set of evals with an objective pass/fail result and a prompt. I'm doing codegen, so I'm using syntax linting, tests passing, etc. to determine success. With chain-of-thought included in the prompting, the evals pass at a significantly higher rate. A lot of research has been done demonstrating the same in various domains.
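For concreteness, here's roughly the shape of such a harness as a minimal sketch, not my actual setup: generate() is a stand-in for whatever model call you use, and each task ships its own pytest file.

    # Minimal sketch of the eval loop, with placeholder names (generate, tasks).
    # Pass/fail is objective: the generated file must compile and its tests must pass.
    import os
    import subprocess
    import tempfile

    COT_HINT = "\n\nThink step by step before writing the final code."

    def passes(code, test_code):
        with tempfile.TemporaryDirectory() as d:
            with open(os.path.join(d, "solution.py"), "w") as f:
                f.write(code)
            with open(os.path.join(d, "test_solution.py"), "w") as f:
                f.write(test_code)
            # Syntax check stands in for linting; pytest stands in for the test suite.
            if subprocess.run(["python", "-m", "py_compile", os.path.join(d, "solution.py")]).returncode != 0:
                return False
            return subprocess.run(["python", "-m", "pytest", "-q"], cwd=d).returncode == 0

    def pass_rate(tasks, generate, use_cot=False):
        # tasks: list of (prompt, test_code) pairs; generate: prompt -> code string
        results = [passes(generate(p + COT_HINT if use_cot else p), t) for p, t in tasks]
        return sum(results) / len(results)

Running pass_rate with use_cot=True vs. False over the same task list is the whole comparison, and the CoT variant passes at a significantly higher rate.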
If chain-of-thought can't improve quality, how do you explain the empirical results which appear to contradict you?
The paper is interesting because CoT has been so widely demonstrated as effective. The point is that it "can" hurt performance on a subset of tasks, not that CoT doesn't work at all.
It's literally in the second line of the abstract: "While CoT has been shown to improve performance across many tasks..."
I have no idea how accurate it actually is, but I've had the process used by LLMs described to me as follows: "Think of it like a form of UV mapping, applied to language constructs rather than 3D points in space, and the limitations and approximations you experience are similar to those emerging when having to project a 2D image over a 3D surface."
These kinds of reductive, thought-terminating clichés are not helpful. You are using a tautology (it doesn't think because it is retrieving data, and retrieving data is not thinking) without addressing the why (why does this preclude thinking?) or the how (is it doing anything else to generate results?).
There is nothing in the LLM that would have the capability to create new information by reasoning, when the existing information does not already include what we need.
There is logic and useful thought in the comment, but you choose not to see it because you disagree with the conclusion. That is not useful.
It would be interesting to think about how it got it wrong. My hunch is that in the "think step by step" section it made an early and incorrect conclusion (maybe even a subtly inferred conclusion) and LLMs are terrible at walking back mistakes so it made an internally consistent conclusion that was incorrect.
A lot of CoT to me is just slowing the LLM down and keeping it from making that premature conclusion... but it can backfire when it then accidentally makes a conclusion early on, often in a worse context than it would use without the CoT.
I always found it interesting how sorting problems can get different results when you add additional qualifiers like colors or smells or locations, etc.
Naively, I understand these to influence the probability space enough to weaken the emergent patterns we frequently overestimate.
The model has likely already seen the exact phrase from its last iteration. Adding variation generalizes the inference away from over-trained quotes.
Every model has the model before it, and its academic papers, in its training data.
Changing the qualifiers pulls the inference far away from quoting over-trained data, and back to generalization.
I am sure it has picked up on this mesa-optimization along the way, especially if I can summarize it.
Wonder why it isn't more generally intelligent yet.
I'll rank those three fruits from largest to smallest:
1. Grapefruit
2. Orange
3. Blueberry
The grapefruit is definitely the largest of these three fruits - they're typically around 4-6 inches in diameter. Oranges are usually 2-3 inches in diameter, and blueberries are the smallest at roughly 0.5 inches in diameter.