It's not going to go away, though. If anything, the future looks like lower quality and higher quantity. The usefulness of ChatGPT's (and its successors') answers also depends on their inputs. In the limit, if no humans were producing new material, there would be nothing new that ChatGPT version N could tell you.
There doesn't need to be new material. There just needs to be a mechanism to separate the good from the bad, and that will always exist as long as humans are interested.
LLMs use reinforcement learning once they have a base understanding of words. It's like how modern chess engines don't analyze human games; they just play against themselves.
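To make the self-play point concrete, here's a toy sketch in Python. Everything in it is invented for illustration (the game is Nim, the learner is a plain value table); real chess engines and LLM training are far more sophisticated, but the closed-loop shape is the same: two copies of one policy play each other and learn from nothing but the win/loss signal.

    import random
    from collections import defaultdict

    # Two copies of the same policy play Nim (take 1-3 stones, taking the
    # last stone wins); the win/loss outcome updates a shared value table.
    Q = defaultdict(float)   # (stones_left, action) -> estimated value
    EPSILON = 0.1            # exploration rate
    ALPHA = 0.05             # learning rate

    def choose(stones):
        actions = [a for a in (1, 2, 3) if a <= stones]
        if random.random() < EPSILON:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(stones, a)])

    def play_one_game(start=15):
        history = [[], []]   # moves made by player 0 and player 1
        stones, player = start, 0
        while stones > 0:
            a = choose(stones)
            history[player].append((stones, a))
            stones -= a
            player ^= 1
        winner = player ^ 1  # whoever took the last stone
        for p in (0, 1):
            reward = 1.0 if p == winner else -1.0
            for state_action in history[p]:
                Q[state_action] += ALPHA * (reward - Q[state_action])

    for _ in range(50_000):
        play_one_game()

    # The learned policy should prefer leaving the opponent a multiple of 4.
    print({s: max([a for a in (1, 2, 3) if a <= s], key=lambda a: Q[(s, a)])
           for s in range(2, 8)})

No annotated games go in, yet the table converges on the known optimal strategy purely from self-play.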
LLMs go even further: they train a model to judge what people would deem high quality, so it's another layer.
> LLMs use reinforcement learning
> LLMs go even further: they train a model to judge what people would deem high quality, so it's another layer
You're describing RLHF (Reinforcement Learning from Human Feedback) as used by ChatGPT, right?
I wouldn't say that LLMs use it; rather, RLHF is used to create a higher-level model on top of an LLM.
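For anyone curious what that higher-level model looks like in practice, here's a minimal sketch of the reward-model step. The pairwise (Bradley-Terry) loss shown is the standard objective for RLHF reward models, but the featurizer, data, and linear model are invented stand-ins; real reward models are themselves large transformers scoring full LLM outputs.

    import numpy as np

    def featurize(text):
        # Invented stand-in featurizer; real systems use the LLM's own
        # representations, not hand-picked features like these.
        return np.array([len(text) / 100.0, text.lower().count("please")])

    # Hypothetical human preference data: (preferred, rejected) answer pairs.
    pairs = [
        ("Sure, here is a step-by-step answer. Please ask if stuck.", "No."),
        ("Happy to help! Please see the explanation below.", "Figure it out."),
    ]

    w = np.zeros(2)  # linear reward model: reward(x) = w @ featurize(x)

    # Pairwise logistic (Bradley-Terry) loss: push reward(chosen) above
    # reward(rejected) for every human-labeled pair.
    for _ in range(500):
        for chosen, rejected in pairs:
            diff = featurize(chosen) - featurize(rejected)
            margin = w @ diff
            w += 0.1 * (1 - 1 / (1 + np.exp(-margin))) * diff

    # The trained scorer can now rank fresh samples; the RL step proper
    # (e.g. PPO) would use these scores as rewards for the LLM.
    for cand in ["Please find a detailed answer below.", "dunno"]:
        print(cand, "->", w @ featurize(cand))

So the "layer on top" is literally a second model trained to predict which of two answers a human would prefer, and its score is what the base LLM is then optimized against.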
That works because chess is a closed world. Without input from outside, two LLMs training against each other would most likely become raving lunatics -- just as two humans locked in a dark cell together would do.
Something something 100 monkeys with a typewriter? I'm expecting the margin of error introduced into these models to produce a surreal fever-dream era of AI-generated content on the internet.