I don't really ever want to read answers from GPT to questions I didn't knowingly ask GPT myself. If GPT can write a commit message for you, don't write it at all and let me ask GPT for one if that's what I want. It may be a positive for you to spend a few seconds less on commit messages, but it's a net negative for the world for it to become polluted with vast amounts of flatly incorrect text with no knowledge of its provenance. I'd rather have no commit message than one where I can't trust whether it was written by the same human that wrote the code.
Put another way, you asking GPT for stuff that it learned from Stack Overflow: good. Using it to post to Stack Overflow: bad.
For me the point of this demo is that even a good commit message is often redundant information.
As programmers we learn that adding a comment like:
// The limit is 16
to
const SOME_LIMIT = 16
is bad because it is redundant information that serves no purpose to the reader and can easily drift out of sync in the future.
So what's a good commit message for changing this limit? Ideally we want to describe why we've changed it, but this information isn't always available, so even when we're avoiding redundant comments we often use redundant commit messages like "increased SOME_LIMIT" to make browsing through history easier for others.
As we do not need to provide this information (it is already in the code), it seems like a reasonable idea for an AI to help us provide it.
I don't think the situation is comparable. The comment is redundant because typically you see the commented code right next to it, so reading the code is about as much effort as reading the comments.
In contrast, commit messages often stand alone: if you browse the history, you see only the messages, but a large number of them at once; and if a commit changes more than one file, the commit message has to sum up the changes from all the files.
In all those contexts, a simple, high-level description of what has changed can be enormously helpful.
> In all those contexts, a simple, high-level description of what has changed can be enormously helpful.
Sure, but you are leaving out the point of the original reply -- the GPT-written commit messages are not trustworthy. They will look convincing, but they are likely to have errors.
> Ideally we want to describe why we've changed it but this information isn't always available
I struggle to imagine a situation in which this is the case. Surely, even in the worst case of being told to make a particular change with no explanation given, you can at least drop an "increased from 5 at the request of ${name of your boss}" or "increased from 5, see ticket #${ticket number}" in a comment and/or a commit message.
Something that's standard at the company I work for is that commit messages always have the ticket number at the start, and that helps figure out why something was changed far more than the commit message alone.
An example of a recent change I saw, anonymized a bit:
PROJ-12345: Added preview flag to video player
PROJ-12345 in Jira:
When a preview of a video is playing in the persistent player, preroll ads will display on app launch.
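A convention like this can also be enforced mechanically. Here's a minimal sketch of a `commit-msg` git hook in Python; the `PROJ`-style pattern and the error message are illustrative, not taken from the commenter's actual setup:

```python
#!/usr/bin/env python3
"""Sketch of a commit-msg hook that rejects messages without a ticket prefix.

Save as .git/hooks/commit-msg (and make it executable); git invokes it with
the path to the commit message file as its first argument.
"""
import re
import sys


def has_ticket_prefix(message: str) -> bool:
    """Return True if the first line starts with a Jira-style ID like 'PROJ-12345: '."""
    first_line = message.splitlines()[0] if message else ""
    return re.match(r"^[A-Z]+-\d+: ", first_line) is not None


def main(msg_path: str) -> None:
    with open(msg_path) as f:
        msg = f.read()
    if not has_ticket_prefix(msg):
        sys.exit("commit message must start with a ticket ID, e.g. 'PROJ-12345: ...'")


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

With this in place, "PROJ-12345: Added preview flag to video player" passes, while "quick fix" is rejected before it ever reaches the history.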
In general I’m pretty skeptical of the ability to get anything deep out of these chat bots, but I think it is wrong to say that the generated commit message is worse than none. The programmer still read the generated message and OK’d it. So, it tells us something about their intent, in the sense that they thought the message summarized it sufficiently (or, they could just OK without reading it, but that’s just lying with extra steps, they weren’t trustworthy in the first place).
> The programmer still read the generated message and OK’d it.
You think? For some programmers, writing commit messages is like... I don't know, because I'm not one of them... some kind of torture? I bet the kind of person who likes this service would otherwise put in blank commit messages or, at best, ticket IDs.
GPT-assisted commit messages are fine if the user takes responsibility and gets consequences if they publish bad data, in proportion to the volume of bad data they publish.
Except at startups, where commit messages are more like "asdf", "aoeu", "quick fix", or "demo" because some investor barged in and demanded a demo before they would wire funds.
If ChatGPT could change that to something like "disable current limits" or "disable safety checks" or whatever that might be marginally better.
A ChatGPT-generated message, pasted without editing, is a purely functional transformation of the code, adding zero information. This means I could just as well run it on your diff myself, if I thought it would be useful. More than that, when I do it a year or two after you made your commit, the then-current ChatGPT will likely do a much better job of summarizing the change. So perhaps it's best to leave auto-summarization to (interested) readers, and write a commit message with some actual information instead.
If the programmer checks and OKs the message, then it still conveys information that you don’t have a year or two down the line. ChatGPT is guessing what their intent was, it could guess wrong, but if it guesses right and they validate that guess, then their intent has been summarized.
A programmer checking and OKing the message only tells you they didn't think it was bad enough to expend effort correcting it.
ChatGPT can't correctly guess what the author's intent was, because that information is not contained in the code (exception: if the code includes a comment explaining the intent).
It's not. The diff, which is the sole input to GPT-3 here, does not carry the causal context - that is, why the change was made. Nor does it carry the broader understanding necessary to summarize the what succinctly and accurately - things like high-level design terms that mostly exist on diagrams or in issue trackers, but not in the code itself. By adding those details in a commit message, the author can add extra information.
And yes, technically they could do it in comments, which would allow GPT-3 to process it. Except, I think it's highly unlikely for a coder using ChatGPT to write commit messages for them to actually author useful comments on their own. If they write any at all, they'll probably use ChatGPT (or Copilot) for that too.
You know what? You do that. I'm gonna start a SaaS business providing AI generators for code, commit messages and documentation. Models that take into account all the context to deliver more accurate, better results. All I need is a live feed from your JIRA, Confluence, Teams/Slack, Zoom, Exchange, SharePoint or whatever alternatives you use. This will guarantee you the biggest bang for your buck!
Elsewhere in this thread someone was wondering if sending a change diff to a fly-by-night third party SaaS could be leaking company IP. They were thinking too small.
But cynicism aside - giving the model access to all that contextual info would definitely increase the chances it would generate useful commit summaries. It would also increase chances it would generate much more convincing bullshit, full of just the right phrases to confuse your programmers, PMs and architects alike.
> asdf is still better than having lies sprinkled in randomly.
I think this is the core of my argument, yeah. If a _reader_ needs better than they can run GPT themselves. But the _writer_ using it is worse than useless, it's actively harmful.
This indicates commit quality. Why lose that info? If you only have time to put "aoeu" into a commit message, would you have time to correct ChatGPT's output? ;)
Startups tend to be a "do a rush job so the business won't die, worry about fixing it later" kind of a deal. I don't envy those working on original codebase after the startup is no longer racing its own runway.
I've experienced messages barely better than this in products that were under no immediate threat, and let me tell you: having to figure out why some changes were made three years earlier, in a bunch of badly-described commits whose author has long since left for another job, with no old documentation hinting at the purpose of the changes, is one of the few things in this job that make me want to chew someone out.
The point of a summarization model is if you have a thousand line change, it helps to have a one sentence explanation of what it is. The demo videos the author used here really don't do a good job communicating that, because the summary GPT-3 wrote for his one line commit was longer than the commit itself.
Right, and even if GPT-3 could summarize the thousand-line diff in a sensible way, without introducing any falsehoods, it would still be strictly worse than the developer writing a sentence explaining what they think they've accomplished with the commit.
It's just the same thing as with comments and "self-documenting code". The code tells you what (and if written carefully, it may be even somewhat effective at it). It can't tell you why. Neither can a GPT-3 summary of it.
I've had ChatGPT find some pretty esoteric bugs in ways that shocked me.
Like semi-jokingly asking it to "improve" some code, thinking it'd come up with some non-obvious control flow... and instead having it immediately point out a subtle bug along the lines of "the code sets flag A instead of setting flag B on <insert line>". Flag B wasn't even unused, so it's not as if a simple unused-variable heuristic would have caught that.
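For the curious, here's a toy sketch of the shape of that bug; all the names are invented for illustration. Because both flags are read elsewhere, neither is "unused", and only following the logic reveals the mix-up:

```python
class Upload:
    """Toy model of the kind of bug described; names are invented."""

    def __init__(self):
        self.validated = False   # "flag B": should be set after validation
        self.published = False   # "flag A": set when the upload goes live


def finish_validation(upload):
    # The subtle bug: sets flag A instead of flag B. Both flags are
    # consumed by other code paths, so an unused-variable lint stays silent;
    # only reasoning about intent catches it.
    upload.published = True      # should be: upload.validated = True
    return upload
```

Running `finish_validation(Upload())` leaves `validated` as `False` while flipping `published`, which is exactly the kind of intent-vs-code mismatch a lint rule can't see.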
Because what we need is more of the what was done, with no regard to the why. Why provide any context as to why the change was made when you can fill it with an AI description of what one could accurately tell by looking at the code? I kinda can't believe this isn't a joke. Just squash it to the emoji that best captures the sentiment! Why use the tool to enhance you and your peers lives, when you can use AI to make it pointless!
Neat concept, but this opens up a can of worms for corporate security. Pretty sure I won't get approval to submit proprietary code to a third party service just because I was too lazy to write a few lines of text. Might be helpful to open source projects?
FHE is slow as shit. Good luck running models at any reasonable pace. Somewhat Homomorphic Encryption is not useful since you've got way too many multiplies on floating point numbers.
The very last thing you should do is commit a GPT-3-generated commit message, for a fairly simple reason: if GPT-3 can interpret and explain the change as written, there is no reason to commit that message. You will always be able to re-run the generator at any later date, over any range of changes, to get the same or (presumably, in the future) improved results.
As pointed out by other comments, the commit message should be telling you facts about the change that are not evident from the change itself. GPT-3 can't tell readers why the change happened.
Writing commit messages (or comments in general) is like practicing vocabulary, but for your mental understanding of the current problem.
Taking a step back and thinking about what I have actually done often helps me to find misconceptions, the worst bugs of them all.
Automating this away would be like learning a foreign language by pasting book exercises into a translation app... you may get good grades, but does it help your understanding if you didn't put in the effort yourself?
Yep. More than a few times I’ve finished a piece of work, and in writing the commit message explaining the whys and wherefores, realised my solution was actually flawed, or that a better solution was possible, and so thrown the entire thing away and started again. I love writing commit messages.
In the early days of Covid, the web was awash with all sorts of stupid fucking designs that reimagined public space under the new normal or whatever. It was chaff that creators and readers alike knew would never be put to practical use, or even be produced in the first place. There's a good writeup about it here.
I don't want to debate the presence or absence of merits in this tool (these are extensively covered in other comments), but I want to point out that even in the demo examples 2 out of 3 commit messages are plainly incorrect:
- in Demo 1, the tool wrote "Switch to colored output..." while in the diff we can see that colored output was already present;
- in Demo 3, the tool wrote "Add installation options and demo link to README", while in the actual diff we only see a link being added, with no changes to installation options.
Props to the author for being honest and not cherry-picking the examples.
To everyone hating on this...I think a GPT-3 summary of a diff is a great thing to have, because it's a summary of the change and thus can be quicker to grok than picking through a diff. Also this doesn't seem to preclude a developer adding their own text to the commit (the why, etc). Finally, if the summary looks weird/incoherent it could serve as a signal to the developer that the commit needs more work.
It is not about hate. If there is a tool, perhaps GPT-3, that is really good at summarizing code diffs, it should be integrated into your IDE or other tooling to summarize diffs on the fly, when I need that summarization, not when I commit a diff. That way, we could all profit from improvements to that tool over time, and everybody could use it in their own language. That is strictly better than running the tool once and hard-wiring its output into the source history.
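A sketch of that reader-side workflow: only the `git show` plumbing below is standard; the prompt wording is invented, and the actual model call is left out because each reader would wire in whatever model is current for them.

```python
import subprocess


def diff_for(commit: str) -> str:
    """Fetch a commit's patch, read-only, via standard git."""
    return subprocess.run(
        ["git", "show", "--patch", commit],
        capture_output=True, text=True, check=True,
    ).stdout


def build_prompt(diff: str, language: str = "English") -> str:
    """Assemble a one-off summarization request. Each reader can pick
    their own language and re-run against whatever model is current."""
    return f"Summarize this diff in one sentence, in {language}:\n\n{diff}"


# A reader would then send build_prompt(diff_for("abc123")) to their model
# of choice; nothing gets baked into the repository's history.
```

The point of the sketch: because the prompt is rebuilt on demand, improvements to the model (or a different output language) benefit every future reader, unlike a summary frozen into the commit at write time.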
I can kind of understand getting help writing the description of a large PR. But a commit message? Whose commits are so long so often that they need the help of an AI assistant to come up with the contents?
Heh… there are really two types of coders. Those who think commits should have a single, obvious, minimal purpose, and who will split off unrelated changes into separate commits…
… and those who tag you as a reviewer on +8,298, -1,943 commits/PRs with the commit message "JIRA-PROJ-84138".
> … and those who tag you as a reviewer on +8,298, -1,943 commits/PRs with the commit message "JIRA-PROJ-84138".
At my workplaces, we've told people who do this to break up their larger commit into smaller ones before reviewing. If they haven't done that initially, well, their life is going to get harder for a few days.
Say hello to a long list of smallish commits full of random, unrelated changes and with commit messages "fixed various small issues", "continued implementing <feature>", etc
I'm good with that! It's faster, and easier, to review many smaller PRs, than one large one, IME. (Although also IME, in actual time, larger PRs get "reviewed" faster by not getting reviewed at all.)
(I'd want a better commit message than those, though. But they might just be examples for the sake of discussion.)
I tried to get people to follow a rule of "if it's a cosmetic/stylistic change, as long as it passes CI, +1". (Nowadays I work in a language that has a decent auto-formatter, and CI just runs and enforces it…) There's a whole slew of similar changes that fall under that umbrella, if you can write the test for it. (I.e., if I can encode my review into a program that CI runs… then yay! For PRs like that, if CI is happy, I'm happy.)
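A minimal sketch of that "encode the review into CI" idea. The formatter command here is a stand-in; most real formatters (`gofmt -l`, `black --check`, `rustfmt --check`) have a check mode that reports offending files without rewriting them:

```python
import subprocess


def files_needing_format(check_cmd):
    """Run a formatter in check mode and collect the files it would rewrite.

    `check_cmd` is a stand-in, e.g. ["gofmt", "-l", "."] or
    ["black", "--check", "."]; many formatters list offenders on stdout.
    """
    result = subprocess.run(check_cmd, capture_output=True, text=True)
    return [line for line in result.stdout.splitlines() if line.strip()]


def ci_gate(offending):
    """CI goes green only when the formatter has nothing to complain about."""
    return len(offending) == 0
```

Once a stylistic rule is expressed this way, the human reviewer never has to argue about it again: the gate either passes or it doesn't.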
Generally it's based on the feature / ticket in JIRA or whatever software you use. If the ticket is causing a commit with +8,298, -1,943, then we'd go back and break up the ticket itself into smaller tickets and then ask the coder to assign the changes to each ticket. There is no way we'd merge changes with such large file addition/deletions.
There are times new work does result in larger commits, like, a few hundred lines. But I've had some 20k delta PRs dropped on me and it's like, let's be honest, the review will be shallow, at best.
Cars are not like software. The software should run at the initial stage N and at all subsequent stages N+X. If it doesn't and requires some large PR to continue working, you've got a fundamental problem there.
There is a third type of coder: one doing commits with single, obvious, minimal purpose, that still sometimes end up being +8,298, -1,943 - but with a sensible, detailed message explaining what's being done and why.
This happens in environments where it takes hours for CI to let your change pass, making small commits prohibitively expensive in terms of time and infrastructure.
(And yes, I know the answer is: make it so CI that's part of review takes minutes, not hours.)
And then there are the cases where the diff is a single line, but the commit message over a hundred lines because the change (or more likely its justification) is not obvious.
I've never seen a 1:100 code to commit message lines ratio so far, but I've seen a few 2-3 line changes with a paragraph or two long explanations. I cherish those. Same if the explanation is in a comment. In fact, if I spot something like this, I tend to praise it publicly on the team chat.
I had one case where a single weird line added much earlier messed up a seemingly unrelated piece of code I'd been developing. It took me a while to figure out that something was emitting compiler flags that, with surgical precision, prevented the very thing I was attempting, and then to find it nested deep in the build configuration. At that point I wanted to strangle the person who put it there, but a paragraph of commentary attached to that line, plus some extra context in the commit message, changed my reaction to "oh. OH. I see the point now.", and I ended up commending the author instead.
I’ve done 150 lines on what I think was a two-line diff before, and >50 on a one-line diff a few times. But I am known for my verbose commit messages. (After two years working on a twenty-odd-year-old commercial code base that had had a dozen or so people working on it constantly, I had around two thirds of the longest commit messages. My longest was something like 400 lines, but most of that was a list of affected class names or similar, on a diff of tens of thousands of lines from a mostly-automated refactoring.)
I’ve definitely also added multiple paragraphs of in-code comments to an otherwise-single-character change, where it’s an ongoing consideration rather than something that can reasonably be left in a commit message alone. Then my commit messages gets to be brief, directing you to read the added comment instead.
With WhatTheCommit [1], I never have to come up with commit messages again. /s
I even wrote an IntelliJ IDEA plugin 9 years ago [2]. Half as a joke, half to learn about IDEA plugin development. I'm puzzled by seeing so many people actually using it. Last month the HTTP link became invalid, and soon after someone opened a PR with a fix. I really hope no one actually uses those commit messages on shared repositories.
The worst part about GPT-3 is people using it to automate things where the entire value comes from what the human annotates rather than automates. This is an idea which, like many others involving GPT-3, I believe will destroy more value than it creates.
"Add github token to address GH Workflow rate limits"
This is a good commit message, it describes a problem and a solution. I'd be very impressed if the GPTCommit tool wrote this and knew why the github token was being added.
Perhaps this could be more useful if it could be fed information from a bug tracker, so it could use the context to create a meaningful (if inaccurate) commit message.
If you're unable to write your own commit messages, that's a strong signal to me that either your commits are too large or you're unable to explain in simple words what you just did. While the first can be remedied, I would find it hard to work with someone who consistently displayed the second.
Comments here are acting like you can't add/edit the commit. It offers a starting point. Yes it's just-above-diff level, but it is at-least-above-diff level.
But my main thought is that I don't know about using this for anything closed source. It feeds OpenAI's API your codebase, one commit at a time. Even if they promise not to train on your prompt history today, the ToS could change. Seems fine if you run it locally, though.
Would also be cool to generate commit messages while viewing history, it could really do a good job of orienting you. I'm imagining "human commit msg | gpt commit msg" so you can look at both. It's a little simplistic right now, kinda just describes the diff, but GPT-3.2 could rock.
I was wondering if there is a possibility of obtaining an offline version of the service, in order to mitigate the inherent risks associated with transmitting proprietary code to external servers, thus ensuring the optimal security and confidentiality of said code.
Cool, but I hope this is never used as-is; just commit with some keyword and call the latest version of GPT on the diff when looking through the history later. A bad commit message is worse than no message, and it can't be easily fixed.
I like writing commit messages. I find it helps me think through and explain the change that I'm committing. Personal quirk: for major commits I'll add fun ASCII art, just as a treat.
Might as well commit “I don’t remember writing that commit” because that’s going to be your every answer when someone has a question about what you did.
This is horrible. Commit messages should contain the reason why this change has been made and not imprecise prose summaries of what the diff looks like.
The comments here are acting as if the messages can't be changed. As someone else mentioned, this should be used as a starting point to summarize the change, but the reason for the change obviously can, and should, be added by a human.