Yes, it’s particularly bad when the information found on the web is flawed.
For example, I’m not a domain expert, but I was looking for an RC motor for a toy project and OpenAI’s Deep Research happily tried to source a few. The only problem: the best candidate it picked contained an obvious typo in the motor spec (68 grams instead of 680 grams), a weight that’s simply impossible for a motor of the specified dimensions.
Right, but LLMs are also consuming AWS product documentation and the Terraform language docs, both of which I’ve read a lot of, and they’re often badly wrong about things from those domains in ways that are really easy for me to spot.
This isn’t just “shit in, shit out”. Hallucination is real and still problematic.
I had it generate a baseball lineup the other day. It printed out a list of the 13 kids’ names, then said “(12 players)”. It just straight up miscounted what it was doing, which threw a wrench into everything else it did beyond that point.