It does, quite often, and not only in the way you describe. But it does.
For example, I asked it what my most cited paper is, and it made up a plausible-sounding but non-existent paper, along with fabricated Google Scholar citation counts. Totally unhelpful.
Right, I think it's a question of how to use this tool in its current state, including prompting practice and learning its strengths. It can certainly be wrong sometimes, but man, it's already a game changer for writing, coding, and I'm sure other disciplines.
If you're a robot researcher, maybe try getting it to whip up some Verilog circuits or something? I don't know much about your field or what you do specifically, but it is absolutely brilliant at tasks like regular expressions or specific code syntax, whatever the equivalent of that is in hardware. I've only ever replaced capacitors and wired some guitar pickups.
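(To give a concrete, hypothetical example of the kind of fiddly-syntax regex task I mean, say pulling version strings out of a log line:)

```python
import re

# Hypothetical example: extract semantic-version strings (e.g. "1.2.3")
# from a blob of log text -- exactly the sort of task it nails.
log = "booted fw 2.14.0, rolled back from 3.0.1 after a panic"
versions = re.findall(r"\b\d+\.\d+\.\d+\b", log)
print(versions)  # -> ['2.14.0', '3.0.1']
```

Tedious to write by hand, trivial to describe in English, and it gets it right nearly every time.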
> it made up a plausible-sounding but non-existent paper, along with fabricated Google Scholar citation counts
I ran into a similar issue: I asked it for codebases of ROM hacks similar to a project I'm doing, and it provided made-up GitHub repos, crediting ROM hacks that do actually exist to completely unrelated authors, non-existent hyperlinks and everything.
Now, studying the differences between GPT generations, it seems like more horsepower and more data solve a lot of GPT's problems and produce emergent capabilities with the same or similar architecture and code. The current data points to this trend continuing. I find it both super exciting and super concerning.
This seems like the perfect test, because it's something there is information about on the internet - but not infinite information - and you know precisely what is wrong about the answer.
It can also produce very useful things.