Anyone making the argument that computers/LLMs can only create mediocre content, and can’t (or it will take a long time to) create content that humans will find exceptional, needs to go back and read the commentary re: chess bots and go bots over the past ten or twenty years.
We went from “computers can’t beat humans” to “okay, computers can beat humans, but they play like computers” to “computers are coming up with ideas humans never thought of that we can learn from” in about twenty years for chess, and less than five years for go.
That’s not a guarantee that writing, music, art, and video will follow a similar trajectory. But I don’t know of a valid reason to say they won’t.
Does anyone here have an argument to distinguish the creative endeavor of, say, writing from that of playing go?
Go is a game with an obvious score function which can be used to construct a loss, well-defined moves, and total visibility of the board. It is less obvious how to write a score function for creativity in art or music, and neither has well-defined bounds on what counts as a legitimate construct. Just because computing hardware lets you multiply matrices faster does not mean we have the means to solve all problems.
> go is a game with an obvious score function which can be used to construct a loss, well defined moves, and total visibility of the board
This is literally the opposite of true, and it is the main reason go computers were getting destroyed by humans for almost twenty years after Deep Blue took down Kasparov.
Yes, technically. But the broader point is true. Go is a game with well-defined win and loss conditions that can be automatically evaluated.
This is critical for game-clock-eons of unsupervised self-play, which by most accounts is how AlphaGo (and other systems like AlphaZero) made the leap to superhuman levels of play.
But it is entirely different from subjective endeavors like writing, music, and art. How do you score one automatically generated composition vs another? Where is the loss function?
Stipulating up front that this is a question for a lead scientist at OpenAI: I could see a scoring function looking at essays in the New York Times vs. the National Enquirer and finding a way to generalize from there. Similarly for the top 40 hit songs vs <everything else>.
Completely backwards. There is no obvious score function for Go. That's how AlphaGo broke through: it was able to figure out a scoring method that accurately gauged how well it was doing, so it could learn and improve.
If what you are saying is true, how does anyone know the game is over? There is a clear win condition. I know that, just like with chess, knowing the current value of the board is difficult, but win or loss is clear.
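To make the distinction in this sub-thread concrete, here is a minimal sketch, assuming tic-tac-toe as a stand-in (it is not Go, and this is not how AlphaGo works). The `winner` function is the "clear win condition": trivial to evaluate automatically once the game ends. The `estimate_value` function is the hard part people are arguing about, judging a position mid-game; here it is approximated crudely by playing uniformly random games to the end and counting wins. All names are hypothetical and chosen for illustration.

```python
import random

# All eight winning lines on a 3x3 board with cells indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Terminal check: 'X' or 'O' for three in a row, 'draw' if full, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'draw' if ' ' not in board else None


def rollout(board, to_move, rng):
    """Play uniformly random legal moves until the game ends; return the result."""
    board = list(board)  # copy so the caller's position is untouched
    result = winner(board)
    while result is None:
        empty = [i for i, cell in enumerate(board) if cell == ' ']
        board[rng.choice(empty)] = to_move
        to_move = 'O' if to_move == 'X' else 'X'
        result = winner(board)
    return result


def estimate_value(board, to_move, player, n=2000, seed=0):
    """Fraction of n random playouts from this position that `player` wins."""
    rng = random.Random(seed)
    wins = sum(rollout(board, to_move, rng) == player for _ in range(n))
    return wins / n
```

Calling `estimate_value(' ' * 9, 'X', 'X')` gives a rough estimate of how often the first player wins under random play. The point is only that the end-of-game check is a few lines, while the mid-game number is an estimate whose quality depends entirely on how the playouts are done. For writing or music there is no agreed equivalent of `winner` to bootstrap from.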
It is often the case that beginning go players don’t know when to quit. The question becomes more subtle as the players increase in skill. It is never as simple as “checkmate” — think more in terms of “mate in three”, except it’s more like “mate in 5-10” in several locations across the board.
Is there anything to the notion that in Go, success and failure are concrete, objective, and more or less easy to measure (or at least measured along the same kind of rules)? While it is computationally intractable to iterate through future moves to an end state, it’s still relatively easy to understand how well you’re doing at any point, and you measure that in basically the same way every game.
For some parts of language, that’s true: there’s grammar, there’s syntax, there’s patois, there’s argot—all these things seem accountable to words’ collective frequency within articulable groups of speakers, more-or-less-fully knowable on their own, and with success metrics that evolve but that do so through collective processes that models can measure and calibrate to. And indeed the models are great at those aspects of language.
“Succeeding” at writing is more than just “saying it well,” it’s also “having something worth saying” and “being worth listening to.” The second point is where things seem to get hazier for computable models. For sure there’s a set of facts that are more or less constant about the world, and well-reported. Science, repackaging history that’s already been done, lurid tales of crime—the stuff podcasts are made of! Not to mention the vast sea of data that sensor networks and automated research can produce—vast reservoirs of subtle truth that humans struggle to begin to mine for insight! It makes complete sense that this is computable stuff, and that computed writing might well be worth learning from.
But important writing—classically, anyway—seems to involve communicating new or idiosyncratic knowledge, and often reveals some of the process of developing it. The podcast Serial, for example, was a smash hit specifically because it didn’t rely on things that were part of the record—and because it reminded people how contingent memory and “truth” are. Bob Woodward writes things that are shamelessly tinted with Bob-Woodward-worldview, but people reveal important and true things only to Bob Woodward because they trust who he is and how he’s behaved for a lifetime (prominent longtime investigative journalist in the US, on the national security beat). Nassim Taleb seems to come up around here: in something like Antifragile his project wasn’t necessarily about new facts but about interpreting them in contrarian fashion and grouping those contrarian insights to synthesize a new theory.
Which brings us to the third component: “being worth listening to.” Writing is an act of communication: the writer matters. A parent hangs their child’s crayon drawing on the fridge not because it’s “authentic to the style of the kids’-crayon-drawing mode of visual art,” not because it’s novel or informative or even true-to-life, but because it came from a person they love. A “Dear John” letter devastates a soldier because it comes from a person with an outsized part in their life and identity. Chinese publishers’ booths at trade shows are wall-to-wall translations of The Governance of China because it’s politically unwise not to. My favorite writers feel fresh: you feel elements of their personality come through. People have a special fetish for true crime: not that there’s any lack of fictitious crime to read about, but the fact that it happened to real humans potentiates the drama for these readers. It’s this aspect that I have a hard time understanding as computable (or commoditizable, I guess… are those similar phenomena?).
Already we seem to be drawing these distinctions in our collective reaction to LLM-stuff. We can’t wait to get hallucinations under control so we can chuck in gigantic boring contracts and internal wikis and financial reports, and get out comprehensible insight—but we roll our eyes at the tsunami of empty slop that’s overtaken Google results. We giggle at AI ventriloquism like this Neuro character [0], but die a little inside every time we read anodyne LLM-ish promotional copy and sameish AI art. First-level customer support seems like a perfect role for a chatbot—“turn it off and on again,” but nicely!—but people on the receiving end hate it [1] even for that task well-suited to it.
I’m only a layperson of course, but I wonder if any of those distinctions might be fruitful? Some of it I guess sums up to the old writing advice “show, don’t tell”—are there examples of machine writing showing promise in that way?
> it’s still relatively easy to understand how well you’re doing at any point
This isn’t true, and is actually a large part of why go computers were still getting destroyed by humans almost twenty years after Deep Blue took down Kasparov. There were articles as recent as about 2012 despairing that computers would ever “get” go.
That said, relative to grading an essay, I’d tend to agree that go is easier. Still, if the goal is to find the edge, so to speak (to figure out what “mundane” is and then go a bit beyond), that seems eminently possible for a computer to do.