I don't really ever want to read answers from GPT to questions that I didn't knowingly ask GPT myself. If GPT can write a commit message for you, then don't write one at all and let me ask GPT myself if that's what I want. It may be a positive for you to spend a few seconds less on commit messages, but it's a net negative for the world to become polluted with vast amounts of flatly incorrect text with no indication of its provenance. I'd rather have no commit message than one where I can't trust whether it was written by the same human who wrote the code.
Put another way, you asking GPT for stuff that it learned from Stack Overflow: good. Using it to post to Stack Overflow: bad.
For me the point of this demo is that even a good commit message is often redundant information.
As programmers we learn that adding a comment like:
// The limit is 16
to
const SOME_LIMIT = 16
is bad because it's redundant information that serves no purpose to the reader and can easily fall out of sync with the code in the future.
So what's a good commit message for changing this limit? Ideally we want to describe why we changed it, but that information isn't always available. So even when we avoid redundant comments, we often write redundant commit messages like "increased SOME_LIMIT" to make browsing through the history easier for others.
As we don't need to provide this information ourselves (it is already in the code), it seems like a reasonable idea to let an AI provide it for us.
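For instance, both of these invented messages (the numbers and ticket are made up purely to illustrate) describe the same change:

increased SOME_LIMIT

increased SOME_LIMIT from 16 to 32; the importer kept hitting the old cap (ticket #1234)

The first is the redundant kind that an AI can produce from the diff alone; the second needs information only the author has.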
I don't think the situation is comparable. The comment is redundant because typically you see the commented code right next to it, so reading the code is about as much effort as reading the comments.
In contrast, commit messages often stand alone: if you browse the history, you only see the messages, but you see a large number of them at once; and if a commit changes more than one file, the message has to sum up the changes across all files.
In all those contexts, a simple, high-level description of what has changed can be enormously helpful.
> In all those contexts, a simple, high-level description of what has changed can be enormously helpful.
Sure, but you are leaving out the point of the original reply -- the GPT-written commit messages are not trustworthy. They will look convincing, but they are likely to have errors.
> Ideally we want to describe why we've changed it but this information isn't always available
I struggle to imagine a situation in which this is the case. Surely, even in the worst case of being told to make a particular change with no explanation given, you can at least drop an "increased from 5 at the request of ${name of your boss}" or "increased from 5, see ticket #${ticket number}" into a comment and/or a commit message.
Something that's standard at the company I work for is that commit messages always start with the ticket number, and that helps figure out why something was changed far more than the commit message alone.
An example of a recent change I saw, anonymized a bit:
PROJ-12345: Added preview flag to video player
PROJ-12345 in Jira:
When a preview of a video is playing in the persistent player, preroll ads will display on app launch.
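A convention like that is easy to automate, too. Here's a minimal sketch of a commit-msg hook in Node/TypeScript that prepends the ticket ID taken from the branch name; the branch naming scheme (feature/PROJ-12345-description) and the regex are illustrative assumptions, not how any particular company does it:

    // commit-msg hook: prepend the ticket ID found in the branch name.
    // Run via ts-node (or compile to JS); the hook file must be executable.
    import { execSync } from "child_process";
    import { readFileSync, writeFileSync } from "fs";

    const msgFile = process.argv[2]; // git passes the message file path as $1
    const branch = execSync("git rev-parse --abbrev-ref HEAD").toString().trim();
    const ticket = branch.match(/[A-Z][A-Z0-9]*-\d+/)?.[0]; // e.g. "PROJ-12345"

    const msg = readFileSync(msgFile, "utf8");
    if (ticket && !msg.startsWith(ticket)) {
      writeFileSync(msgFile, `${ticket}: ${msg}`);
    }

The same idea works as a prepare-commit-msg hook if you'd rather have the ticket ID pre-filled before the editor opens.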
In general I’m pretty skeptical of the ability to get anything deep out of these chat bots, but I think it is wrong to say that the generated commit message is worse than none. The programmer still read the generated message and OK’d it. So it tells us something about their intent, in the sense that they thought the message summarized it sufficiently. (Or they could just OK it without reading, but that’s just lying with extra steps; they weren’t trustworthy in the first place.)
> The programmer still read the generated message and OK’d it.
You think? For some programmers, writing commit messages is like... I don't know, because I'm not one of them... some kind of torture? I bet the kind of person who likes this service would otherwise put in blank commit messages, or at best ticket IDs.
GPT-assisted commit messages are fine if the user takes responsibility and gets consequences if they publish bad data, in proportion to the volume of bad data they publish.
Except for startups when commit messages are more like "asdf", "aoeu", "quick fix", or "demo" because some investor barged in and demanded a demo before they would wire funds.
If ChatGPT could change that to something like "disable current limits" or "disable safety checks" or whatever, that might be marginally better.
A ChatGPT-generated message, pasted without editing, is a purely functional transformation of the code, adding zero information. This means I could just as well run it on your diff myself, if I thought it would be useful. More than that, when I do it a year or two after you made your commit, the then-current ChatGPT will likely do a much better job of summarizing the change. So perhaps it's best to leave auto-summarization to (interested) readers, and write a commit message with some actual information instead.
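To make the "run it on your diff myself" part concrete, reader-side summarization is a few lines of glue. A minimal sketch in Node/TypeScript (Node 18+ for the built-in fetch); the summarization endpoint and request shape are hypothetical placeholders, not any real API:

    // summarize-commit.ts: the *reader* asks a model to summarize a commit,
    // instead of the author freezing machine-written text into history.
    // Usage: ts-node summarize-commit.ts <commit-sha>
    import { execSync } from "child_process";

    const sha = process.argv[2];
    // The full patch: the same sole input GPT would have had at commit time.
    const diff = execSync(`git show ${sha}`).toString();

    // Hypothetical endpoint: swap in whatever model you trust *today*.
    fetch("https://llm.example.invalid/v1/summarize", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: `Summarize this change:\n\n${diff}` }),
    })
      .then((res) => res.text())
      .then(console.log);

Since the summary is a pure function of the diff, any reader can regenerate it on demand with whatever model is current, which is exactly the point above.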
If the programmer checks and OKs the message, then it still conveys information that you don’t have a year or two down the line. ChatGPT is guessing what their intent was, it could guess wrong, but if it guesses right and they validate that guess, then their intent has been summarized.
A programmer checking and OKing the message only tells you they didn't think it was bad enough to be worth the effort of correcting.
ChatGPT can't correctly guess what the author's intent was, because that information is not contained in the code (exception: if the code includes a comment explaining the intent).
It's not. The diff, which is the sole input to GPT-3 here, does not carry the causal context - that is, why the change was made. Nor does it carry the broader understanding necessary to summarize the what succinctly and accurately - things like high-level design terms that mostly exist on diagrams or in issue trackers, but not in the code itself. By adding those details in a commit message, the author can add extra information.
And yes, technically they could do it in comments, which would allow GPT-3 to process it. Except, I think it's highly unlikely for a coder using ChatGPT to write commit messages for them to actually author useful comments on their own. If they write any at all, they'll probably use ChatGPT (or Copilot) for that too.
You know what? You do that. I'm gonna start a SaaS business providing AI generators for code, commit messages and documentation. Models that take into account all the context to deliver more accurate, better results. All I need is a live feed from your JIRA, Confluence, Teams/Slack, Zoom, Exchange, SharePoint or whatever alternatives you use. This will guarantee you the biggest bang for your buck!
Elsewhere in this thread someone was wondering if sending a change diff to a fly-by-night third party SaaS could be leaking company IP. They were thinking too small.
But cynicism aside - giving the model access to all that contextual info would definitely increase the chances it would generate useful commit summaries. It would also increase chances it would generate much more convincing bullshit, full of just the right phrases to confuse your programmers, PMs and architects alike.
> asdf is still better than having lies sprinkled in randomly.
I think this is the core of my argument, yeah. If a _reader_ needs better, they can run GPT themselves. But the _writer_ using it is worse than useless; it's actively harmful.
This indicates commit quality. Why lose this info? If you only have time to put "aoeu" into a commit message, would you really take the time to correct ChatGPT's output? ;)
Startups tend to be a "do a rush job so the business won't die, worry about fixing it later" kind of deal. I don't envy those working on the original codebase after the startup is no longer racing its own runway.
I've experienced messages barely better than this in products that were under no immediate threat, and let me tell you this: having to figure out why some changes were made, three years earlier, in a bunch of badly-described commits whose author already left for another job, with no old documentation hinting at the purpose of the changes - this is one of the few things in this job that make me want to shout someone's ear off.