I have the same observation. I've been able to improve things I just didn't have the energy to do for a while. But if you're gonna be lazy, it will multiply the bad.
Trying to get some third-party hardware working with a raspi.
The vendor ships 2 separate code bases with separate documentation but only supports the latest one.
I literally had to force-feed the newer code base into ChatGPT, and then feed in working example code to get it going, otherwise it constantly referenced the wrong methods.
If I had just kept looping code / output / repeat it might eventually have stumbled on the answer, but it was way off.
This is one of several shortcomings I’ve encountered in all major LLMs. The LLM has consumed multiple versions of SDKs from the same manufacturer and cannot tell them apart, mixing up APIs, methods, macros, etc. Worse, for more esoteric code with fewer samples, or with more wrong than right answers in the corpus, you always get broken code. I had this issue working on some NXP embedded code.
Human sites are also really bad about mixing content from old and new versions. To this day SO still does not have a version field you can use to filter results or target questions more precisely.
I see those as carefully applied bandaids. But maybe that’s how we need to use AI for now. I mean we’re burning a lot of tokens to undo mistakes in the weights. That can’t be the right solution because it doesn’t scale. IMO.
Yesterday I searched how to fix some Windows issue, and Google's AI told me to create a new registry key as a 32-bit value and write a certain string in there. Which is impossible: a 32-bit DWORD value can't hold a string.
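For anyone wondering why that advice is self-contradictory: a registry "32-bit value" (REG_DWORD) is a 4-byte unsigned integer, so there is physically nowhere to put a string; text goes in a REG_SZ value instead. A quick sketch of the size constraint (the value name in the comment is made up for illustration):

```python
import struct

# A REG_DWORD is a 4-byte unsigned integer; pack() shows exactly how
# much data one can carry.
packed = struct.pack("<I", 0xDEADBEEF)
print(len(packed))  # 4 bytes, period

# Treating arbitrary text (e.g. a hypothetical "EnableSomeFeature"
# setting string) as a 32-bit integer fails immediately:
try:
    struct.pack("<I", int("EnableSomeFeature"))
except ValueError:
    print("not a 32-bit integer; this belongs in a REG_SZ value")
```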