Claude recently recommended me a great-sounding book, with a citation of course. The only trouble was that the book did not exist.
To be fair, I also made up a citation in 11th grade to pad out the citations for an essay I had to write. This was back before it was easy to double-check things online.
> I also made up a citation in 11th grade to pad out the citations for an essay I had to write. This was back before it was easy to double-check things online.
I love this comment. I also suspect that even if it were easy for your 11th grade teacher to check, they probably were not interested enough to do so.
Story Time: When I was in 4th grade back in the '70s, I had to write a book report. The book was a novel about astronauts traveling through space.
In my report, I lied about the plot because there was a romantic subplot between two of the astronauts... and my 4th grade brain didn't want to discuss anything so "disgusting."
I handed in my report and then spent the next two weeks in terror thinking that my teacher would read the book and realize that I lied about the plot.
Obviously, my 4th grade teacher had no interest in reading a space-travel book targeted to grade schoolers, so my lies went undetected.
Google Search's AI Overview just the other day confidently mis-summarized a source so badly that it came to the exact opposite conclusion from what the source actually said.
Yes, AI Overview is a pretty weak model, but it somehow got "yes, that photo is AI" from an article explaining "not only is that photo not AI, here is the reporter who took the photo."
The other thing is that it is often hard to tell whether a model is talking about a source because the surrounding system has run a search and injected it into the prompt, or whether it's just freestyling based on its training data.
That’s because LLMs generally don’t cite their sources. Web search is a tool outside of the LLM. Depending on the particular chat interface, there are all manner of tools in place to augment LLM capabilities and outputs, and they are constantly changing.
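To make that concrete, here is a rough sketch of the pattern (made-up function names, not any particular vendor’s API): the surrounding system runs the search tool, pastes the results into the prompt, and the model only ever writes over that injected text.

```python
# Hypothetical sketch of a chat frontend bolting search-backed "citations" onto
# an LLM that has no notion of sources itself. Function names are invented.

def web_search(query: str) -> list[dict]:
    """Placeholder for whatever search API the platform actually calls."""
    return [
        {"title": "Example source", "url": "https://example.com/article", "snippet": "..."},
    ]

def call_llm(prompt: str) -> str:
    """Placeholder for the actual model call."""
    return "According to [1], ..."

def build_prompt(question: str, results: list[dict]) -> str:
    # The retrieved snippets are just more text in the context window.
    sources = "\n".join(
        f"[{i + 1}] {r['title']} ({r['url']}): {r['snippet']}"
        for i, r in enumerate(results)
    )
    return (
        "Answer the question using only the sources below and cite them by number.\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

def answer(question: str) -> str:
    results = web_search(question)            # tool call, outside the model
    prompt = build_prompt(question, results)  # injected into the prompt
    return call_llm(prompt)                   # the model free-writes, [n] markers included

print(answer("Is that photo AI-generated?"))
```

Whether the [n] markers in the answer actually match what the listed pages say is still entirely up to the model, which is how you end up with a summary that contradicts its own source.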
If one is trying to make an argument about the usefulness of LLMs, it’s irrelevant whether LLMs on their own can cite sources. If they can be trivially put into a system that can cite sources, that is a better measure of their usefulness.
I mean, it’s not trivial. There is a lot of work involved with enabling tool use at scale so that it works most of the time. Hiding that work makes it worse for the common user, because they aren’t necessarily going to understand the difference between platforms.
I agree that this is mostly OpenAI’s fault, though I also think people posting strong claims about LLMs online have a responsibility to know slightly more than the average user.
And at best it's the same as me asking my smart friend and copy/pasting their response to you, as if them citing sources puts the onus on you rather than me to check the citations.
Except they regularly make up quotes and sources. Once ChatGPT gave me a "quote" from the Qt6 docs to support a particular claim; however, I was sceptical and looked at the link. Not only had ChatGPT made up the quote, it actually said the opposite of what the linked docs say. Not to mention that sometimes the links themselves are just hallucinations.
As I said, sometimes, especially if you ask some simple question whose answer is an easily verifiable fact on any search engine. Claude gave me nonsense links all summer after some update, and nothing says ChatGPT won’t do the same after some future “improvement”. Besides, the more you veer towards questions that are not so clear-cut (“I want to make an LLM application that mimics Brazilian sounds, running on an open source model; how many parameters does it need, what model should I use, and should I use React or Svelte for the frontend?”), the fuzzier the results. And the longer the chat gets, the tighter its context window becomes and the more it hallucinates.
Point being: no, you cannot trust it without double-checking its information elsewhere. Same as with anything else.
The whole point of a cited source is that you read the source to verify the claim. Amazing how many people in this thread seem to not let this little detail get in the way of their AI hate.
> The whole point of a cited source is that you read the source to verify the claim. Amazing how many people in this thread seem to not let this little detail get in the way of their AI hate.
I like that you read all the citations in your concrete example of how good ChatGPT is at citations and chose not to mention that one of them was made up.
Like you either would have seen it and consciously chose not to disclose that information, or you asked a bot a question, got a response that seemed right, and then trusted that the sources were correct and posted it. But there’s no chance of the latter happening, because you specifically just stated that that’s not how you use language models.
On an unrelated note what are your thoughts on people using plausible-sounding LLM-generated garbage text backed by fake citations to lend credibility to their existing opinions as an existential threat to the concept of truth or authoritativeness on the internet?
I use LLMs all the time and have since they first became available, so I don’t hate them. But I do know they are just tools with limitations. I am happy that ChatGPT has better citations these days, but I still do not trust it with anything important without double-checking several places. Besides, the citation itself can be some AI-generated blog post with completely wrong information.
These tools have limitations. The sooner we accept that, the sooner we learn to use them better.
Says “Page Not Found”. From a technical standpoint, how do you think that happened? Personally I think it is either the result of a hallucination, or the chatbot actually did a web search, found a valid page, and then modified the URL in a way that broke it before sending it to you.
At best, the sources cited by an LLM system would be a listing of the items used for RAG, or other external data sources that were merged into the prompt for the LLM. These items would ideally be appended to the response by a governing system around the LLM itself. I don't know of any major providers that do this right now.
The median case is having the LLM itself generate the text for the citation section, in which case there really is no mechanism tying the content of a citation to the other content generated. If you're lucky and within the bounds the LLM was trained on, then the citation may be relevant, but the links are generated by the same token-prediction mechanism as the rest of the response.
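For contrast, here is a sketch of the "best case" described above (hypothetical names, not any provider's actual pipeline): the governing system builds the citation list from the documents it actually retrieved for RAG and appends it after the model's answer, so the links themselves cannot be hallucinated, even though the answer can still misrepresent them.

```python
# Hypothetical wrapper where citations come from the retrieval step,
# not from the model's token prediction. Names are invented for illustration.

from dataclasses import dataclass

@dataclass
class RetrievedDoc:
    title: str
    url: str
    text: str

def generate_answer(prompt: str) -> str:
    """Placeholder for the LLM call."""
    return "Summary of the retrieved material..."

def answer_with_citations(question: str, docs: list[RetrievedDoc]) -> str:
    context = "\n\n".join(d.text for d in docs)
    body = generate_answer(f"{context}\n\nQuestion: {question}")
    # Appended outside the LLM: these URLs are exactly what was retrieved.
    refs = "\n".join(f"- {d.title}: {d.url}" for d in docs)
    return f"{body}\n\nSources:\n{refs}"

docs = [RetrievedDoc("Example article", "https://example.com/article", "Full text here.")]
print(answer_with_citations("What does the article conclude?", docs))
```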
> Can you please at least look at any of the major offerings of the past three years before being both snarky and wrong?
All of the examples on that website are from the last three years.
Can you clarify how I’m wrong about LLMs not reliably citing sources? Are the 490 examples of made-up sources appearing in court filings not valid? Is the link you posted, where you asked ChatGPT how many people there are (which included a broken link in the sources), valid?
Except when they cite sources that do not say what they attribute to the source, which happens more often than not when I go to investigate the sources.
I have never myself seen a case where a cited source on Wikipedia failed to back up the claim and that fact hadn't already been noticed and called out by someone else. With LLMs, it is a frequent and common occurrence.