So it turns out that AI is just like another function: inputs and outputs, and the better you design your input (prompt), the better the output (intelligence). Got it.
The Bitter Lesson claimed that the best approach is to throw more data and compute at the model to make it more generally capable, rather than adding human-comprehensible structure to it. But a lot of LLM applications seem to add exactly that missing domain structure until the LLM does what is wanted.
The Bitter Lesson states that you can overcome the weaknesses of your current model by baking in priors (i.e., specific knowledge about the problem, as is done here), but that you will get better long-term results by having the model learn those priors itself.
That seems to have been the case: compare the prompting tricks people had to use with GPT-3 to how Claude Sonnet 3.6 performs today.
The Bitter Lesson pertains to the long term. Even if it holds, it may take decades to be proven correct in this case. In the short term, imparting some human intuition gets us useful results faster than waiting around for "enough" computation and data.
Improving model capability with more and more data is what model developers do, over months. Structure and prompting improvements can be done by the end user, today.
Not trying to nitpick, but the phrase "AI is just like another function" is too charitable in my opinion. A function, in mathematics as well as in programming, transforms a given input into a specific output in the codomain. Per the Wikipedia definition:
In mathematics, a function from a set X to a set Y assigns to each element of X exactly one element of Y. The set X is called the domain of the function and the set Y is called the codomain of the function.
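Or, restated in symbols (just the same Wikipedia definition, nothing added):

    f \colon X \to Y, \qquad \forall x \in X \;\; \exists!\, y \in Y \ \text{ such that } \ f(x) = y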
Not to call you out specifically, but a lot of people seem to misunderstand AI as being just like any other piece of code. The problem is that, unlike most of the code and functions we write, it's not simply another function; worse, it's usually not deterministic. If we both give a function the same input, we should expect the same output. But this isn't the case when we paste text into ChatGPT or something similar.
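To make that concrete, here's a rough sketch using the OpenAI Python SDK; the model name and prompt are just placeholders, and it assumes an API key in the environment. With a nonzero temperature, the same input will typically produce two different completions:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(prompt: str) -> str:
        # Same input both times; temperature > 0 means tokens are sampled.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,
        )
        return resp.choices[0].message.content

    print(ask("Explain the Bitter Lesson in one sentence."))
    print(ask("Explain the Bitter Lesson in one sentence."))  # usually differs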
LLMs are literally a deterministic function from a bunch of numbers to a bunch of numbers. The non-determinism only comes in when you randomly sample a token from the probability distribution the model (deterministically) computed.
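A toy sketch of that split (made-up numbers and a fake 5-token vocabulary, nothing like a real model): the forward pass is a plain deterministic function from numbers to numbers, and randomness only shows up if you sample from its output instead of taking the argmax.

    import numpy as np

    # Frozen "model parameters": a fixed matrix mapping a 4-dim context
    # vector to logits over a tiny 5-token vocabulary (made-up values).
    W = np.arange(20, dtype=float).reshape(4, 5) / 10.0

    def forward(context: np.ndarray) -> np.ndarray:
        """Deterministic: same context + same weights -> same distribution."""
        logits = context @ W
        exp = np.exp(logits - logits.max())  # softmax
        return exp / exp.sum()

    context = np.array([0.1, -0.3, 0.7, 0.2])
    probs = forward(context)                 # identical on every call

    print(int(np.argmax(probs)))             # greedy decoding: deterministic end to end

    rng = np.random.default_rng()            # unseeded on purpose
    print(int(rng.choice(len(probs), p=probs)))  # sampling: the only place randomness enters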