It seems obvious to me that LLMs wouldn't be able to find examples of every single problem posed to them in training data. There wouldn't be enough examples for the factual lookup needed in an information-retrieval-style search. I can believe that they're doing some form of extrapolation to create novel solutions to posed problems.
It's interesting that this paper doesn't contradict the conclusions of the Apple LLM paper [0], where prompts were corrupted to force the LLM into making errors. I can also believe that LLMs can only make small deviations from existing example solutions when creating these novel solutions.
I hate that we're using the term "reasoning" for this solution-generation process. It's a term coined by LLM companies to evoke an almost emotional response and shape how we talk about this technology. However, it does appear that we are capable of instructing machines to follow a series of steps using natural language, with some degree of ambiguity. That in and of itself is a huge stride forward.

[0] https://machinelearning.apple.com/research/gsm-symbolic
I very much agree with the perspective that LLMs are not suited for “reasoning” in the sense of creative problem solving or application of logic. I think that the real potential in this domain is having them act as a sort of “compiler” layer that bridges the gap between natural language - which is imprecise - and formal languages (sql, prolog, python, lean, etc) that are more suited for solving these types of problems. And then maybe synthesizing the results / outputs of the formal language layer. Basically “agents”.
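Roughly the shape I have in mind, as a sketch only (call_llm is a purely hypothetical stand-in for whatever model API you'd actually use, and sqlite just stands in for the formal layer):

    import sqlite3

    def call_llm(prompt: str) -> str:
        """Hypothetical stand-in for a real model API call; not any specific SDK."""
        raise NotImplementedError

    def answer(question: str, db: sqlite3.Connection) -> str:
        # 1. "Compile" the imprecise natural-language question into a formal language.
        sql = call_llm(f"Translate this question into a single SQLite query: {question}")
        # 2. Let the formal layer do the actual logic/computation.
        rows = db.execute(sql).fetchall()
        # 3. Synthesize the formal output back into natural language.
        return call_llm(f"Question: {question}\nResult rows: {rows}\nAnswer in one sentence.")

In practice you'd want to validate the generated SQL before executing it, but that's the basic loop.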
That being said, I do think that LLMs are capable of “verbal reasoning” operations. I don’t have a good sense of the boundaries that distinguish the different kinds of reasoning - verbal, qualitative, quantitative. What comes to mind are the verbal sections of standardized tests.
> I think that the real potential in this domain is having them act as a sort of “compiler” layer that bridges the gap between natural language - which is imprecise - and formal languages (sql, prolog, python, lean, etc) that are more suited for solving these types of problems. And then maybe synthesizing the results / outputs of the formal language layer. Basically “agents”.
Well, if you do all that, would you say that the system as a whole has 'reasoned'? (I think ChatGPT can already call out to Python.)
> I can believe that they're doing some form of extrapolation to create novel solutions to posed problems
You can believe it, but what sort of evidence are you using for this belief?
Edit: Also, the abstract of the Apple paper hardly says "corruption" (implying something tricky); it says that they changed the initial numerical values.
> It's a term coined by LLM companies to evoke an almost emotional response and shape how we talk about this technology.
Anthropomorphizing computers has been happening since long before ChatGPT. No one thinks their computer is actually eating their homework when they say that to refer to the fact that their computer crashed and their document wasn't saved; it's just an easy way to refer to the thing it just did. Before LLMs, "the computer is thinking" wasn't an unuttered sentence. Math terms aren't well known to everybody, so if I say Claude is dot-producting an essay for me, or that I had ChatGPT dot-product that letter to my boss, no one knows what a dot product is; so even if that's a more technically accurate verb, who's gonna use it? So while AI companies haven't done anything to promote usage of different terms than "thinking" and "reasoning", it's also because those are the handiest terms. It "thinks" there are two R's in strawberries. It dot-products there are two R's in strawberries. It also matrix-multiplies, occasionally softmaxes, convolves. But most people aren't Terence Tao and don't have a feel for when something's softmaxing, because what does that even mean?
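To be fair, those verbs are literal; a toy sketch of the arithmetic in question (plain numpy, not tied to any particular model or layer):

    import numpy as np

    # A "query" vector and a few "key" vectors: toy stand-ins for token representations.
    query = np.array([0.2, 0.7, 0.1])
    keys = np.array([[0.9, 0.1, 0.0],
                     [0.3, 0.8, 0.2],
                     [0.1, 0.2, 0.9]])

    # "Dot-producting": similarity scores between the query and each key.
    scores = keys @ query

    # "Softmaxing": turn the scores into a probability distribution.
    weights = np.exp(scores) / np.exp(scores).sum()

    print(scores)   # raw similarities
    print(weights)  # sums to 1.0; this is the vocabulary the machine actually works in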
Totally, these companies are pushing to showcase their AI models as self-thinking, reasoning AI, when really they are just trained on a huge amount of data in dataset form, which they extrapolate from to find the right answer.
They still can't think outside the box of their datasets.