Yeah, I've been wondering about this too. Word on the street is that GPT-4 is several times the size of GPT-3.5, yet it definitely doesn't feel several times as good.
Apparently there are diminishing returns to just making the model ever larger.
I believe the claim was that GPT-4 is an ensemble-style model composed of eight GPT-3.5-scale models, though that may have since changed or turned out not to be true.