The point of a summarization model is that if you have a thousand-line change, it helps to have a one-sentence explanation of what it does. The demo videos the author used here really don't do a good job of communicating that, because the summary GPT-3 wrote for his one-line commit was longer than the commit itself.
Right, and even if GPT-3 could summarize the thousand-line diff in a sensible way, without introducing any falsehoods, it would still be strictly worse than the developer writing a sentence explaining what they think they've accomplished with the commit.
It's just the same thing as with comments and "self-documenting code". The code tells you what (and, if written carefully, it may even be somewhat effective at that). It can't tell you why. Neither can a GPT-3 summary of it.
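A contrived sketch of that distinction (the function and names are invented, not from the thread): a summarizer could paraphrase the slicing easily enough, but only the comment carries the intent behind it.

    def average_latency(samples):
        # Drop the first sample: it includes one-time connection setup cost
        # and would skew the result -- that's the "why" no summary of the
        # code itself can recover.
        usable = samples[1:]
        return sum(usable) / len(usable) if usable else 0.0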
I've had ChatGPT find some pretty esoteric bugs in ways that shocked me.
Like semi-jokingly asking it to "improve" some code, thinking it'd come up with some non-obvious control flow... then instead having it immediately point out a subtle bug along the lines of "the code sets flag A instead of setting flag B on <insert line>". Flag B wasn't even unused, so it's not like a simple unused-variable heuristic would have caught that.
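For what it's worth, a minimal, hypothetical reconstruction of that kind of bug (all names invented): the intent is to set needs_retry on a server error, but the code sets log_verbose instead, and since log_verbose is genuinely read a few lines later, nothing looks "unused".

    class Request:
        def __init__(self):
            self.needs_retry = False
            self.log_verbose = False

    def handle_response(req, status):
        if status >= 500:
            req.log_verbose = True   # bug: should be req.needs_retry = True
        if req.log_verbose:          # log_verbose is read here, so an
            print("server error", status)  # unused-variable check stays quiet
        return req.needs_retry

    req = Request()
    print(handle_response(req, 503))  # False -- the retry silently never happens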