There is only one thing about GPT that is mysterious: what parts of the model don't match a pattern we expect to be meaningful? What patterns did GPT find that we were not already hoping it would find?
And that's the least exciting possible mystery: any surprise behavior is categorized by us as a failure. If GPT's model has boundaries that don't make sense to us, we consider them noise. They are not useful behavior, and our goal is to minimize them.
And that's the least exciting possible mystery: any surprise behavior is categorized by us as a failure. If GPT's model has boundaries that don't make sense to us, we consider them noise. They are not useful behavior, and our goal is to minimize them.