Borges is one of my favourite writers. Having some stranger try to think like him to conceptualize something for public consumption rubs me the wrong way. If Borges were around, I'd love to hear what he had to say, but invoking his name and works like this feels off.
The obvious parallel between Borges and AI is his Quixote story: a character endeavors to write Don Quixote, but spontaneously, word for word, without copying; to become Cervantes in a sense, and to train their style so that they can produce it almost by accident. This sort of makes the work their own, and it's a typical argument used by AI enthusiasts when AI regurgitates copyrighted work--it isn't storing the work, it's becoming a thing that can spontaneously produce it. But IMHO this cheapens the story, and the parallel isn't as strong as it could be.
One of the first things I did when someone showed me a pre-Chat version of GPT-3 was try to get it to speak in the voice of Borges. I had the same feeling of interesting but inappropriate.
I was really happy to see this pop up, and I'm glad someone went through this as a thought experiment, but you make a good point that didn't initially occur to me.
I feel like Borges' secular, scifi-adjacent mysticism loses a lot of what makes it most meaningful when it's imitated or dissected academically.
That said, it does feel like Borges would probably be into the idea of being imitated.
I also tried this as soon as I got access to GPT-4 at Shopify's expense. "You are the celebrated magical realist writer Jorge Luis Borges. Write an essay about Shopify".
What it produced was more of an essay written by a talented undergraduate about Borges, rather than Borges.
I'm sure Borges would've found LLMs fascinating and one can only dream what stories and essays would've been written.
Collected Fictions[0] is a wonderful group of Borges stories that includes the ones mentioned in this article.
There are some amazing short story collections out there if you are the kind of reader that has a hard time staying with an entire novel. Ted Chiang has a couple[1][2] collections of stories that feel very Borges-like.
Thanks, I will probably read that collection. Nice to see it includes the stories The Zahir and The Book of Sand which feel just as relevant as the ones mentioned in the paper.
It has been a long time since I read Borges, but I vaguely remember a story about a man about to die by firing squad, who pauses time and lives in an eternal moment by looking at a bee or something? I'm probably way off, but I really liked it.
My favourite Ted Chiang story is the Tower of Babel story, what a rich world and satisfying pay-off.
OK, so the paper presents a central metaphor for reasoning about LLMs.
The metaphor: a book containing all possible conversations/human writings ever; a [good] LLM finds the spot in the book that exactly matches the context and reads from the book as a response.
Certainly if you've experimented with a model that hasn't been fine-tuned (e.g. via RLHF), this metaphor will be resonant.
Is it useful?
(How does it help me understand LLMs with different capabilities? How does it help me understand models with different fine tunings?)
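One way I try to answer that: make the metaphor concrete. Below is a minimal sketch in Python (the probs function is a hypothetical stand-in for the trained network, not any real API) of the "reading from the book" loop the paper gestures at: find the page matching the context so far, read off a distribution over next words, sample one, repeat.

    import random

    def probs(context):
        # Hypothetical stand-in for the trained network: given the story so
        # far, return a distribution over possible next tokens. In the book
        # metaphor, this is locating the page that matches the context.
        vocab = ["the", "library", "of", "babel", "contains", "everything"]
        return {tok: 1.0 / len(vocab) for tok in vocab}  # toy: uniform

    def continue_story(context, n_tokens=6):
        # Autoregressive decoding: repeatedly condition on the growing
        # context and append one sampled token.
        for _ in range(n_tokens):
            dist = probs(context)
            tokens, weights = zip(*dist.items())
            context += " " + random.choices(tokens, weights=weights)[0]
        return context

    print(continue_story("in the beginning"))

On this reading, fine-tuning like RLHF doesn't change the loop at all; it reweights which pages of the book are likely to be opened, which is at least a partial answer to my own question.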
I would have thought the most relevant Borges story to LLMs would be "Funes the Memorious."
Funes can remember everything perfectly, every detail that he has ever seen or heard in his life. However, he cannot really think or understand what he has seen, because understanding requires forgetting; it requires generalization.
LLMs can only think insofar as they generalize over their inputs. If a model is just a memorizing parrot, it is not a good LLM.
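To make the contrast concrete, here is a toy sketch (hypothetical data) of the memorizing-parrot end of the spectrum: a "Funes" model with perfect recall of exact contexts and nothing whatsoever to say about contexts it hasn't literally seen.

    # A "Funes" model: perfect recall, zero generalization.
    training = {
        "the cat sat on the": "mat",
        "to be or not to": "be",
    }

    def funes(context):
        # Every detail remembered exactly; nothing abstracted away.
        return training.get(context)

    print(funes("the cat sat on the"))  # "mat": flawless recall
    print(funes("the dog sat on the"))  # None: no generalization, no thought

A good LLM is interesting precisely to the extent that it behaves unlike this lookup table on unseen inputs.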
No, it doesn't contradict what you explicitly said. It is intended to contradict whatever motivation you had to post a comment that wasn't actually productive. Data is valuable without needing an attached argument.
At the risk of self-promotion, this is my take on Borges and AI... which is a song/video made by AI (Jukebox and various previous-generation image generators), and based on Borges' fantastic essay "A New Refutation of Time."
The authors' own intention is to "understand... LLMs and their connection to AI through the imagery of Jorge Luis Borges, a master of 20th century literature, forerunner of magical realism, and precursor to postmodern literature." In that spirit, I throw my hat into the ring too!
That was my first thought as well. An LLM seems more like a search engine for the library, only instead of indexing by topic it tries to find understandable nonsense in the sea of nonsense.
The difficulty with them is that they can only find things similar to what they've been trained on.
Lacking a curated index, it's not going to find the information you need or want; it's going to find things that seem most like what others have seen or wanted.
I thought of https://en.wikipedia.org/wiki/On_Exactitude_in_Science and its relation to language, which Baudrillard discusses in Simulacra and Simulation. Large Language Models are the efforts of mapmakers to encapsulate reality in a way that is ultimately futile.
Interesting how this work received an homage in The Name of the Rose, up to the name of the librarian, Jorge de Burgos. (I hated the way they pronounced the name in the movie, /'iorge/ instead of the correct Spanish way /'xorxe/)
Sorry to veer slightly off-topic, but can anyone familiar with academia explain why such a literary exercise gets published on arXiv as a research paper?
What is scientific or research-driven about it? How is this different from a long-form opinion or literary essay except for the fact that it's written with a paper-like style and voice?
I'm baffled. Is this just because humanities professors need to show they're published as well and they need to get a score for tenure, or something like that?
Leon Bottou isn't a humanities professor, but a ML researcher. In fact, not just any ML researcher but arguably one of the ML researchers who most anticipated the current DL scaling era.
Which is not to say that he necessarily has anything worthwhile to say about 'Borges and AI' but I'm going to at least give it a read to see if there's something I might want to know 20 years from now. :)
Well the point stands, though, considering that the two papers you linked clearly read like papers.
My question was mostly candid, naive, and quite honest: what makes a paper a paper when the content is merely akin to a literary essay?
I guess the answer is "the author". :D
arXiv is for researchers in different science and mathematics subcommunities to post pre-prints, surveys, reviews, lecture notes, manuscripts, documentation, etc... but also historical/archival research, philosophy papers, and meta-essays, as long as they are relevantly targeted to the subcommunity.
> Neither truth nor intention plays a role in the operation of a perfect language model. The machine merely follows the narrative demands of the evolving story. As the dialogue between the human and the machine progresses, these demands are coloured by the convictions and the aspirations of the human, the only visible dialog participant who possesses agency.
This is a really good way of thinking about these models. It reminds me of the recent-ish story where a reporter got really creeped out by Bing's OpenAI-powered chatbot (https://www.nytimes.com/2023/02/16/technology/bing-chatbot-m...). Reading that, I had thought the bot was relatively easily led into a narrative the reporter had been setting up. In a conversation between actual people who have their own will and agency, you don't get to see one leading the other around by the nose so completely.
Reframing the problem as one of picking through the many threads of potential fictions to evolve a story makes it easier to explain what happened in that particular case.
This is interesting but such a shame to miss/skip "Funes the Memorious." It will prove, I think, to be quite resonant in the future. But it is a damning parable, and probably people just don't want to hear that right now...
I just don't understand how people can attribute consciousness of some sort to the LLM but then with that belief not feel absolutely terrible for it and for what we do to it! I just think of poor poor Funes...
There is a reason Borges's Library of Babel contained all combinatorially possible texts, with almost all of them being pure gibberish. Borges was wise enough to understand that the following is meaningless, even for a story about a magic library:
"Imagine a collection that does not only contain all the texts produced by humans, but, well beyond what has already been physically written, also encompasses all the texts that a human could read and at least superficially comprehend."
To be clear, this is a horrifically dishonest metaphor for LLMs. IMO the most glaring flaw in the technology is that they can't handle new ideas which don't appear in the training set. It is true that ChatGPT doesn't deal with this use case very often, because it mostly handles trivialities. But it does mean that this entire argument is navel-gazing speculation.
The bigger problem is that the entire idea of "all texts a human could superficially comprehend" is meaningless, and the paper proceeds to reason based off this utter fallacy. The beauty of Borges's Library of Babel was that he realized that humans are capable of "superficially comprehending" any text, even if it was created by a uniform random ASCII generator. This is the basis of numerology, and why Borges's story included superstitious cult behavior of people destroying and/or sanctifying "meaningful" gibberish. If we have a good enough reason to find meaning in text, we'll find it.

Humans don't actually rely on symbolic reasoning; we just use that for communication and organization: give us the symbols and we will reason about them, using cognition which is far too squishy to fit in a book. It's especially dangerous when the symbols obey human grammar and imitate social tones of authoritativeness, mysticism, etc.
And then there's...this:
"The invention of a machine that can not only write stories but also all their variations is thus a significant milestone in human history."
I am not a writer. But speaking as a Homo sapiens, it is genuinely insulting to call ChatGPT a machine that can write "all variations" of a story. This paper needed to be reviewed by a serious writer or philosopher before being put on the arXiv.
What are some examples of "new ideas"? I'm having a hard time imagining an idea that can't be expressed as a combination of existing concepts.
Better concepts can arise when we make discoveries about reality (which takes experimentation), but there's a lot more juice to squeeze from the concepts we currently have.
"an idea that can't be expressed as a combination of existing concepts."
The problem is that if an LLM hasn't been pretrained on the specific idea, it won't have a grasp of what the correct concepts are to make the combination. It will be liable to substitute more "statistically likely" concepts, but since that statistic is based on a training set where the concept didn't exist, its estimate of "likely" is flawed.
One good example is patents: https://nitter.net/mihirmahajan/status/1731844283207229796 LLMs can imitate appropriate prose, but really struggle to maintain semantic consistency when handling new patents for inventions that, by definition, wouldn't have appeared in the training set. But this extends to almost any writing: if you are making especially sophisticated or nuanced arguments, LLMs will struggle to rephrase them accurately.
(Note that GPT-4 is still extremely bad at document summarization, even for uninteresting documents: your Q3 PnL number is not something that appeared in the training set, and GPT-4 is liable to screw it up by substituting a "statistically likely" number.)
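A contrived illustration of that substitution failure (toy counts, made-up numbers): if candidate continuations are scored by how familiar they are from training, the figure that is true for your document loses to the figure that is merely common.

    from collections import Counter

    # Toy corpus statistics: how often each numeric string was seen in training.
    training_counts = Counter({"1.2M": 900, "3.4M": 80})

    def most_likely(candidates):
        # Frequency-based scoring prefers familiar numbers over correct ones.
        return max(candidates, key=lambda c: training_counts[c])

    # The document's actual Q3 figure never occurred in training (count 0),
    # so a "statistically likely" round number beats it.
    print(most_likely(["7,316,402", "1.2M", "3.4M"]))  # -> "1.2M"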
In my experience GPT-3.5 is extremely bad at F#: although it can do simple tasks like "define a datatype that works for such-and-such," it is much less proficient at basic functional programming in F# than it is at Haskell: far more likely to make mistakes, or even to identifiably plagiarize from specific GitHub repos (even my own). That's because there's a ton of Functional Programming 101 tutorials in Haskell, but very few in F#. I am not sure about GPT-4. It does seem better, but I haven't tested it as extensively.
> The problem is that if an LLM hasn't been pretrained on the specific idea, it won't have a grasp of what the correct concepts are to make the combination.
They cannot be expected to produce useful new ideas, because those ideas sit in lacunae in their probabilities: even when a novel combination of existing ideas is possible (and combination isn't the only route to new ideas: neologisms exist), the LLM has never seen it, and so will (probabilistically) never produce it, because to the model it is equivalent to nonsense.
The exception to this is if the new ideas are somehow present in the structure of language and are internalized and/or presented in an emergent form.
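Here is a minimal sketch of that lacuna, using an unsmoothed count-based bigram model; real LLMs interpolate over these gaps with learned representations, but the shape of the problem is easiest to see in the counting case: a combination absent from training gets probability exactly zero, i.e. it is treated as nonsense.

    from collections import Counter

    corpus = "the library contains every book the library contains".split()
    bigrams = Counter(zip(corpus, corpus[1:]))

    def p_next(prev, word):
        # Maximum-likelihood estimate with no smoothing: a pairing the model
        # never saw sits in a lacuna with probability exactly zero.
        total = sum(n for (a, _), n in bigrams.items() if a == prev)
        return bigrams[(prev, word)] / total if total else 0.0

    print(p_next("library", "contains"))  # seen in training: 1.0
    print(p_next("library", "dreams"))    # novel combination: 0.0

The emergent-structure exception above is exactly the hope that the learned version of p_next generalizes sensibly into those zeros.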
I've been eyeing some Borges books at a bookstore that is about 50 meters from me right now, coincidentally.
The company that maintains my favorite technology is named after a Borges story. I tried reading Borges a few years ago and found it insufferably uninteresting, but maybe now, with a pinch of external motivation, I'll enjoy him more.
I don't see the connection. I was a heavy B. reader in my day. But I remember him mentioning a Chesterton story where the machine eats its master. B. introduces me to Chesterton, to Sartor Resartus, to the Bible, to Cthulhu --- back when I could barely read enough English. Now, long after I've made B.'s break --- that it's all right, it's necessary --- I see how great his influence is in almost everything, because culture is not a package of flooring you can buy one piece at a time. It's a big warehouse of everything you can't buy, except as one lot.
Kafka wrote a little story like that; I won't quote it. They were given the choice between being kings or being the kings' messengers. Because they were children, they all chose to be messengers for the kings, and now they were running all over the world carrying messages that nobody understood. Well, that was the Internet, wasn't it?
> “it can kill you based on its own desires and there’s nothing you can do about it”
Haha, that's an original take, but it makes sense after The Terminator and HAL.
Wondering if these movies have caused untold external consequences to humanity in its adoption of AI, just to sell a few tickets.
To make a parallel: anti-vaxxers did their damage and caused many lives to be lost; similarly, these stories, which are no better, can give people a bad start with AI and sabotage their futures, or stall the benefits of AI for everyone else.
I have been in "AI" since 1998, when I was writing A* route planning for NPCs in this cool new engine called Unreal.
The only thing that has been consistent in all these years is that nobody thinks it's AI unless it's literally like Arnold Schwarzenegger in The Terminator. I mean, I'm not even exaggerating; it's so ridiculously predictable that the goalposts for AI move the second a given technology becomes ubiquitous.
So, for example, HOG, SIFT, SURF, etc., along with localization algorithms like SLAM-type systems, were so thoroughly in research when I started that they were considered a pillar of the field of AI. Now literally no one would consider those AI, because they do not use deep convolutional networks.
So, just like Marvin Minsky said, AI is a suitcase term that doesn't fucking mean anything. As somebody who's been doing it for so long, I'm used to it, but it's still annoying.
So I'm just building the Terminator and the counter-Terminator so we can move on.
> Tesler's Theorem (ca. 1970). My formulation of what others have since called the “AI Effect”. As commonly quoted: “Artificial Intelligence is whatever hasn't been done yet”. What I actually said was: “Intelligence is whatever machines haven't done yet”. Many people define humanity partly by our allegedly unique intelligence. Whatever a machine—or an animal—can do must (those people say) be something other than intelligence.
> Wondering if these movies have caused untold external consequences to humanity in its adoption of AI, just to sell a few tickets.
What are you saying? This isn't like when The Simpsons made fun of nuclear power and depicted it as doing impossible things. AGI is a hypothetical technology and we don't yet know what it could be capable of or even if it's feasible.
> To make a parallel: anti-vaxxers did their damage and caused many lives to be lost; similarly, these stories, which are no better, can give people a bad start with AI and sabotage their futures, or stall the benefits of AI for everyone else.
Any idea can change a person's mind in one direction or another. Yours is an argument against the exchange of ideas in general. "Since hearing an idea could cause a person to $DO_BAD_THING, exchanging ideas (for example, by talking to people with $WRONG_OPINION, or by consuming fiction) is bad."
I thought "surely, when their lives are at stake, people will do the prudent thing and trust doctors", but no, we were not that smart. Some ideas amount to self-inflicted harm and keep getting support even in the face of grave consequences.
But you're not arguing against any particular idea. You're arguing against ideas based on how they change people's minds. But a person's mind is changed by an idea depending on their personality and on the contents of their mind when they hear it, so in principle any idea can change a person's mind in any direction. Some people hear "vaccines cause autism" and conclude "I should not vaccinate my children", while others conclude "this country needs better education". Some people reach "I should not vaccinate my children" after hearing some other idea. Some people see The Terminator and think "I should work to prevent the advancement of AI", while others think "I should go into AI to prevent this from happening", and yet others think "ha, what a silly movie". So, like I said, your argument is one against culture as a whole. If the fact that hearing an idea will convince some people of opinion X is a good reason to stop the spread of that idea, then it's also a good reason to stop the spread of all ideas.
Eh, when words lose functionality they either fall out of use or change meaning.
AI basically means things brains and computers both do, but this is only a useful term when brains do those things better than computers. Usually once computers definitively surpass brains we've moved on to just calling that computing.
Maybe that won't be the case and the term "AI" will either solidify as a broad category, or fall out of use, but it also might continue to refer to that-which-is-left-to-do, the things we're still better at than computers.
Thanks for all the downvotes