They used GPT-3.5, so these findings are irrelevant.

lolinder · on July 7, 2024

The AI community is worse than JavaScript for acting like everything needs to be thrown out every six months.

Aside from the very important fact that GPT-3.5 is still far and away the most frequently used LLM, it's not like GPT-4 has a completely different architecture with completely different characteristics. It's clearly better, but much of what they describe should generalize to LLMs as a whole (for example, knowledge cutoff dates matter a lot, and these things have likely memorized a lot more than we thought they did).

p1esk · on July 7, 2024

It doesn’t matter which model is the most commonly used. If you want to make claims about LLM capabilities, you'd better be using the best model available.