
The obsession of git users with rewriting history has always puzzled me. I like that the feature exists, because it is very occasionally useful, but it's one of those things you should almost never use.

The whole point of history is to have a record of what happened. If you're going around and changing it, then you no longer have a record of what happened, but a record of what you kind of wish had actually happened.

How are you going to find out when a bug was introduced, or see the context in which a particular bit of code was written, when you may have erased what actually happened and replaced it with a whitewashed version? What is the point of having commits in your repository which represent a state that the code was never actually in?

It always feels to me like people just being image-conscious. Some programmers really want to come across as careful, conscientious, thoughtful programmers, but can't actually accomplish it, so instead they do the usual mess, try to clean it up, then go back and make it look like the code was always clean. It doesn't actually help anything, it just makes them look better. The stuff about nonlinear history being harder to read is just rationalization.




The point of rebasing for clarity, IMHO, is to take what might be a large, unorganized commit or commits (i.e. the result of a few hours' good hacking) and turn it into a coherent story of how that feature is implemented. This means splitting it into commits (which each change one thing), giving them good commit messages (describing the one thing and its effects), and putting them in the right order.
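In practice this is mostly `git rebase -i`. A minimal sketch, with made-up commit names:

    $ git rebase -i master
    # git opens a todo list in your editor; after editing it might read:
    pick   1a2b3c4 Extract config loading into its own module
    reword 5d6e7f8 Validate config values
    edit   9a8b7c6 Wire config into startup

`pick` keeps a commit as-is, `reword` stops to let you fix its message, `edit` pauses so you can split the commit up, and reordering the lines reorders the commits.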

Rather than hiding bugs, I usually wind up finding bugs when doing this, because teasing apart the different concerns that were developed in parallel in the hacking session (while keeping your codebase compiling and your tests passing at every step) tends to expose interdependence issues that you wouldn't find when everything's there at once.

It's basically a one-person code review. And when you're done you have a coherent story (in commits) which is perfectly suited for other people to review, rather than just a big diff (or smaller messy diffs).

It also lets me commit whenever I want to during development, even if the build is broken. This is useful for finding bugs during development, as you'll have more recorded states to work with, e.g. for finding the last working state when you screw something up. And in-development commits can be more like notes to myself about the current state of development than well-reasoned prose about the features contained.

I realize not everyone agrees with it, but I hope I've described some good reasons why I think modifying history (suitably constrained by the don't-do-it-once-you've-given-your-branch-to-the-public rule) is a good thing, not something to be shunned.


I agree with you, but only for local commits that haven't been pushed to a shared repo.

Rewriting local history seems no different than rewriting code in your editor.

Rewriting shared history is (almost) always bad.


I like "Rewriting local history seems no different than rewriting code in your editor", that's a pretty good analogy I hadn't thought of.

There are a (very) few instances where you'd want to rewrite something pushed to a shared repo. One is if there's a shared understanding that that branch will be rewritten. Some examples would include git's own "pu" and "next" branches. "pu" is rebased every time it changes, and "next" is rebased after every release. Everyone knows this and knows not to base work off these branches. There's also the occasional "brown paper bag" cleanup, like when some proprietary information got into the repository by mistake and all the contributors have to cooperate to get it removed. But all of these take out-of-band communication somehow.


We've been fine using rebase on already pushed branches. This comes from the understanding that a feature branch belongs to one developer, ever, and that no one else is supposed to work off of it (or at their own peril).

Everyone knows that it's "my branch" and that they're absolutely not supposed to use it for anything until it's merged back into master or whatever authoritative branch.


If you're having a person own a branch, implementing it in their fork would probably make more sense: https://www.atlassian.com/git/tutorials/comparing-workflows

We use the forking model for bigger projects with more developers, and the branching model for smaller projects. It works out very nicely.


Ok, that makes sense... but then why bother pushing the branch in the first place?


For me, it's because I hop between development machines, and pushing/pulling a branch is much easier than the alternative of synchronizing files manually among said machines.

Also, so that if something goes awry with my dev machine for whatever reason, at least my work is saved.

Also, to make it easier for a colleague to review my code before it gets merged into something.

Also, because it means I can use GitHub's PR system instead of doing it on my machine (thus providing some additional record that my code got merged in, and providing an avenue for the merge itself to be reviewed and commented on).


We have a rule that you never go home at night without pushing your work, even if it's garbage. Put it in a super-short-term feature branch if needed, and push that, but don't leave it imprisoned on your machine.


There are people who follow this rule, and there are people that think disk failures are what happen to other people.

Few things sting as badly as losing hours or days worth of work.


And there are people who have good backup systems.


It allows builds off of that branch, so you can get test feedback etc. It also acts as sort of a backup or a sync if you switch machines.


Code reviews -- you can create a PR on the pushed code, make fixes in response to the comments, rebase, and re-push.


I work on multiple machines. Pushing my branch up even if it's busted code means I can continue work easily on other computers.


Immediate backup

(I hope I'm not alone in saying this...)


I kinda hope you are, because backup and source control really should be separate functions. Obviously your source control repository should be backed up, and pushing stuff into it acts to create a backup, but you really should have a separate backup system at work as well, to cover unpushed code as well as all the other useful info contained on your computer.


I use it the same way too. I do not really see why backup should be separate from source control as there is no valuable information on my (work) computer apart from the source code, and I never spend more than a few hours without pushing.


Backups of your work computer would close that hours-long window between pushes.


You are not


Does anyone advocate rewriting shared history? Oddly, I see this "exception" a lot in replies to this person, but I'm not sure I've ever read anyone anywhere saying rewriting shared history is a good idea.


I think it's less people saying you should rebase shared history, and more people saying you should rebase without realizing that shared history matters. Then some poor confused soul starts always rebasing before pushing/merging, messes up their local history, and doesn't know how to fix it.

A lot of git is "magic" to many developers, and the way rebase works is certainly one of the most poorly understood features.


Only in extreme circumstances, where something sensitive (such as credentials) or otherwise problematic (such as other people's copyrighted assets, or .svn directories in the case of some repos that were moved from SVN to git in a hamfisted manner) was checked into the repository and needs to be removed. Those are the only reasons for rewriting shared history.


My rule of thumb is that rewriting shared history is always, always bad. There may be situations where the proper precautions can mitigate the risk, but I've never seen a good example where it's actually a completely good idea without downsides.


> I agree with you, but only for local commits that haven't been pushed to a shared repo.

Yes, that's why Git doesn't allow you to push rewrites, at least not without '--force'.
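A sketch of what that looks like (branch name hypothetical): after a rewrite, a plain push is rejected as a non-fast-forward with output along these lines, and you have to override explicitly:

    $ git push origin my-feature
     ! [rejected]  my-feature -> my-feature (non-fast-forward)
    $ git push --force-with-lease origin my-feature

`--force-with-lease` is the safer variant of `--force`: it refuses to overwrite the remote branch if someone else has pushed to it since you last fetched.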


> Rewriting shared history is (almost) always bad.

Agreed. The one counterexample that I have is Github pull requests. Those are actually branches in your fork, and you do want to rewrite those when you get feedback on a pull request. That makes it easier for the owner of the repo to do the merge later.


Why do you need to rewrite? If a pull request is not completed, you can continue to push it and the PR is updated to pull the latest commit.


I will get pull requests where later commits fix bugs introduced in former commits.

I generally ask people to rewrite such PRs, as I’m not going to pull known buggy commits into master, even if they are followed by fixes. That is just noise.

It might also be that some commits in the PR have changed tabs to spaces or vice versa.


I think the point was: if you have a PR with two commits, you can squash it to a single commit and force push. This will update the PR to just have the single commit. (Similarly with a rebase.)
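Concretely, the dance might look like this (branch names hypothetical):

    $ git checkout my-feature      # the branch behind the PR
    $ git rebase -i master         # mark the second commit as 'squash' or 'fixup'
    $ git push --force origin my-feature

GitHub then updates the open pull request in place to show the single rewritten commit.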


sorbits' point was in response to:

clinta > Why do you need to rewrite? If a pull request is not completed, you can continue to push it and the PR is updated to pull the latest commit.

sorbits is saying that no, you really should rewrite your PR.

You, hayd, seem to be merely reiterating sorbits' point.


Making 'temporary' commits and rewriting local history before pushing to a shared repo has analogs in other revision control systems:

* In Subversion, people track patches using tools like quilt to manage them before actually putting them together into a commit.

* In Mercurial, people use `hg mq`, which is like a more featureful version of `git-stash`.

These are basically all ways to track a series of patches prior to 'committing' them into the code base shared with others.


Speaking of `git-stash`, I've always thought of `git-stash` as a less featureful version of `git-branch stash`.
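Roughly, ignoring untracked files and the stash's stack behavior, that analogy looks like this (branch name hypothetical):

    $ git checkout -b stash-work     # roughly 'git stash':
    $ git commit -am "WIP"           #   park the changes on a branch
    $ git checkout -                 # back where you were, tree now clean
    # ...later, on the original branch:
    $ git cherry-pick -n stash-work  # roughly 'git stash pop': reapply
    $ git branch -D stash-work       #   without committing, then clean up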


I don't think I've ever seen anyone advocate rewriting shared history.


I've come across reasons, but they've always been pretty marginal, such as somebody checking in sensitive credentials without realising what they were doing.


I think I would like the ability to edit commit messages for typos without having to force everyone to reset --hard.


The thing is, the commit message is part of the commit, not something separate from it. Irritating as it might be, this is good for traceability.

What I do to avoid that is work on a separate branch, rebase against master, then review the commits on my branch after getting rid of any WIP commits and shuffling them around to make more sense. Finally, I make sure the commit messages are (a) accurate and (b) have no typos. Once I'm satisfied with that, I merge.
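In command form, that workflow is roughly (branch names hypothetical):

    $ git checkout -b my-feature       # work and commit freely, WIP and all
    $ git fetch origin
    $ git rebase -i origin/master      # drop WIP commits, reorder, fix messages
    $ git checkout master
    $ git merge my-feature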

I treat merging as a big deal, but not committing.


Agreed, most people hear "rewrite history" and immediately assume "public history".

Rebase is a part of code review. If someone spots a typo and a "fix typo" commit follows it up, as happens in a good proportion of GitHub-model projects, I cringe. This information is utterly useless to the project's history, and should be rebased in as a fixup. Only once code review is done should a commit be considered for merge. It's at that point that rewriting becomes a problem.
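Git has direct support for exactly this; a sketch (the commit hash is hypothetical):

    $ git commit --fixup a1b2c3d         # records "fixup! <subject of a1b2c3d>"
    $ git rebase -i --autosquash master  # the todo list arrives with the fixup
                                         #   already moved under a1b2c3d

After the rebase, the typo fix is folded into the commit that introduced it, and no "fix typo" commit survives.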

I think most people forget where Git came from; git was designed from the ground up for this! When someone emails a series of patches to the kernel mailing list for review, they iterate on that series of commits over and over until it's ready. They don't keep adding new patches on top the way the Pull Request model of GitHub/GitLab etc. does.


In my Github experience, rebasing/tidying your commits is expected before a Pull Request is merged, just like your description of Linux development. Eg, the numpy/scipy/matplotlib projects.


Unfortunately, this is not true for many repositories. GitHub's interface (i.e., the "Merge" button), encourages users to merge from the web interface, where this tidying can't happen.


Then someone else rebases over that commit, there's a conflict and lo! the tests fail. Why? typo. It's fixed in the subsequent commit (which you can't see). Lovely.

There's something to be said for having every commit pass tests/work (or, if it doesn't, saying so explicitly in the commit message), if anyone is ever going to step over this commit.


That's a hard one; trying to make a single commit in a pull request helps me but sometimes even then a pull request gets ignored and they want me to rebase it.

The problem is they ask /me/ to rebase it; I think they should take a little ownership in the potential rewriting of history.


There's no potential rewriting of history before they merge your pull request, only a series of unaccepted draft commits. :)


Another nice side benefit is that you are able to use git bisect to find bugs more easily. If some of the commits fail the build then it becomes difficult to separate commits that actually introduce a bug from those that are just incomplete.

The team I work with has recently started making sure every commit passes the build, and it's had some fantastic results in our productivity. We know every individual commit passes on its own. If we cherry-pick something in, it's most likely going to pass; so if it fails, the problem is usually in that specific commit, not one made days or weeks ago.
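It's also what makes `git bisect run` practical: when every commit builds, you can let git drive the whole search. A sketch (tag and test command hypothetical):

    $ git bisect start
    $ git bisect bad HEAD
    $ git bisect good v1.2         # last version known to work
    $ git bisect run make test     # a non-zero exit (other than the special
                                   #   125, which means skip) marks a commit bad

If some commits didn't build, `make test` would flag them as bad even though they don't contain the bug, and the search would fall apart.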


You don't have to rewrite history to do this. You just have to run your tests before committing. You know, like people used to in the old days.

Indeed, I think the widespread rewriting of history that goes on in the Git world makes it more likely that there will be failing commits, because every time you rewrite, you create a sheaf of commits which have never been tested.

Now, in your case, it sounds like you have set up processes to check these commits, and that's absolutely great. Everyone should do this! But why not combine this with a non-rewriting, test-before-commit process that produces fewer broken commits in the first place?


Running tests before committing locally adds a lot of friction. It often happens to me that I work on a feature in component A and, in doing so, realize that it would be great to have some additional feature in component B (or perhaps there's a bug that needs to be fixed).

As long as the components are logically separate, it's usually a good idea to make those changes in separate commits. While you can do that using selective git add, I personally often find it more convenient to just have a whole bunch of rather small "WIP" commits that you later group and squash together in a rebase.
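For reference, the selective-add variant looks something like this (paths and messages hypothetical):

    $ git add -p componentB/     # step through hunks, staging only the B ones
    $ git commit -m "Add missing feature to component B"
    $ git add componentA/
    $ git commit -m "Implement feature in component A"

`git add -p` asks about each hunk individually, which is how you carve one logical commit out of a mixed working tree.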

Not the least of the reasons is that I like to make local commits often in general anyway, even when I know that the current state does not even compile. It's a form of backup. In that case, I really don't want to have to run tests before making commits.

And obviously, all of this only applies to my local work that will never be used by anybody else.


When you come up with the idea for a feature in component B, or a bug to fix, rather than implementing it, make a note of it, and carry on with what you were doing. Once that's done and committed, you can go back to the other thing. That way, you end up with coherent separate commits, that you can test individually as you make them, without having to rewrite history. Not only that, but you can give each commit your full attention as you work on it, rather than spreading your attention over however many things.

Again, this is the traditional way of doing things (as an aside, in pair programming, one of the roles of the navigator is to maintain these notes of what to do next, so the pair can focus on one thing at a time). Seen from this perspective, history rewriting is again a way to cover up poor, undisciplined programming practice.


It's possible that we just have different styles of working.

Still, to clarify: Not all, but some of the situations I have in mind are situation where the changes in component A cannot possibly work without the changes in component B.

So an alternative workflow could rather be: stash all your changes made so far, then make the changes in component B, commit, and then reapply the stashed changes in component A. That's something I've tried in the past, and it can work. However, it has downsides as well. In particular, having the in-progress changes in component A around actually helps by providing the context to guide the changes in component B. So you avoid situations where, after you've continued working on component A, you realize that there's still something missing from component B after all (which may be something as silly as an incorrect const-qualifier).

It's also possible that our preferences depend on the kind of projects we're working on. What I've described is something that has turned out to work well for me on a large C++ code base, where being able to compile the work-in-progress state for both components simultaneously is very useful to catch the kind of problems like incorrect const-qualifiers I've mentioned before.

I could imagine that on a different type of project your way works just as well. For example, in a project where unit testing is applicable and part of the development policy, so that you'd write separate tests for your changes to component B anyway, being able to co-test the work-in-progress state across components is not as important, because you're already testing via unit tests.


I agree that the situation where you need the changes in B to make the changes in A is both genuine and annoying!

I have often taken the stash A - change B - commit B - pop A - finish A route. If you know what changes to B you need, it's fine, but you're right, the changes to A can be useful context.

In that case, you can make the changes to B with the changes to A still around, then stash A, run the tests, commit, pop A, and continue. Then you can have the best of both worlds, and you still don't need to edit history.
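In commands, that middle path might look like this (paths and messages hypothetical):

    # with in-progress changes to both components in the working tree:
    $ git add componentB/        # stage only the finished B changes
    $ git stash --keep-index     # set the unstaged A work aside
    $ make test                  # check that B stands on its own
    $ git commit -m "Add missing feature to component B"
    $ git stash pop              # bring the A work back and carry on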

If you just can't make the changes to B without the changes to A, then they probably belong in a single commit, and you've just identified a possible coupling that needs refactoring as a bonus.


Yeah, obviously we do that (well maybe not so obvious to some, but I never push unless the tests pass). We sometimes perform lots of other things like static analysis that get in the way of a rapid feedback loop. We also run mutation testing, which can sometimes take several hours for the whole codebase -- although we don't have this run on every commit, just ones that we merge into a specific branch.

The problem I have with non-linear commit history is that I find it impossible to keep all the paths straight in my head when I am trying to understand a series of changes. Maybe you can do that, and I think that's awesome, but I like to see a master branch and then smaller feature branches that break off and then combine back with master.


A tool that does not naively sort the commits by date, but groups linear parts of history together, should allow for a better overview.
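Plain `git log` can already do some of this; for example:

    $ git log --graph --oneline --topo-order

`--topo-order` avoids intermixing commits from parallel lines of history by date, and `--graph` draws the branch structure, which together make the linear runs much easier to follow.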


Maybe, but testing does not prevent all bugs and what happens once bisecting is needed still needs to be considered.


> The point of rebasing for clarity, IMHO, is to take what might be a large, unorganized commit or commits (i.e. the result of a few hours' good hacking) and turn it into a coherent story of how that feature is implemented. This means splitting it into commits (which each change one thing), giving them good commit messages (describing the one thing and its effects), and putting them in the right order.

To my understanding, Gerrit does grouped commits as part of the flow. Even better, it groups all review-triggered commits under the same master commit, with the nice, extensive description that one crafted for the PR. It's regrettable that GitHub popularized the fork/pull-request model instead.

https://www.gerritcodereview.com/


> The point of rebasing for clarity, IMHO, is to take what might be a large, unorganized commit or commits (i.e. the result of a few hours' good hacking) and turn it into a coherent story of how that feature is implemented.

Isn't this the same rationalization that drives Git Flow's feature branches and merging via --no-ff ? You can see the messy real work in the feature branch, but it gets merged to the main branch as one clean commit.


Once the merge commit occurs, the 'messy real work' is now part of the main branch's history just as much as the rest of the commits, as they are ancestors of that merge commit.


Same here. It is much clearer to me to reapply my commits, as long as I constrain myself to clear, coherent and atomic commits.

Replaying changes is much more comfortable to me, especially when I still have them in short-term memory; surely easier than merging other people's stuff within your files.

My average feature is around 7-10 commits, all replayed on the latest commit on the branch. It forces me to catch up with other people's work on shared areas and gives me quite a bit more confidence that the merge isn't messing up problematic files.


Precisely.


Disclosure up front, I don't really use git myself. I have tried it and found it to be too confusing. I liked svn and these days use hg. I also tend to work on mostly solo and small projects.

However in my observation I have found that more than any other revision control system I have used, the person ultimately responsible for the code spends far more time cleaning up history and recovering from developer mistakes on projects using git than any I can recall, and that goes back to CVS and Visual Source Safe, also including svn and hg.

I know a lot of people use git and love it so I'm prepared to accept that they're all smarter than I am. But IMHO, the version control system should be incidental to my work. It should not demand any significant fraction of my brainpower: that should be devoted to the code I'm working on. If I have to stop and THINK about the VCS every time I use it, or if it gives me some obscure "PC LOAD LETTER" type of response (which seems to happen to me when I use git) then it is a net negative. If I need to have a flowchart on my wall or keep some concept of a digraph in the front of my thinking or use a cheat sheet to work with the VCS, then it's just one more thing that gets in my way.

I think git probably has a place on very large codebases, with very distributed developers. For the typical case of a few developers who all work in the same office, I think in most cases it's overkill and people would be more productive using something simpler.


> If I have to stop and THINK about the VCS every time I use it, or if it gives me some obscure "PC LOAD LETTER" type of response

I'm sorry, there is no kind way to say this without spending too much time I don't have.

You're making the same kind of argument I am hearing from older people in my family about newer hardware (TVs, phones, etc.). You see an initial learning curve and falsely assume that this curve will never flatten out and give way to easy and intuitive access to power.


I've been using git for years and consider myself a fairly sophisticated git user, with a reasonably solid conceptual understanding of what goes on under the covers. I've even performed significant surgery on the jgit codebase (converting it to SHA256 for a security-related project - what a mess).

And yet I don't feel that the learning curve flattens out. I still end up getting wedged into states that send me scrambling for stackoverflow.

Git is incredibly powerful, which is why I use it. But the PC LOAD LETTER comment resonates strongly with me. We can embrace a tool while also acknowledging its faults.


I've spent more time learning git than I have spent learning all other VCS combined, of which there have been at least a few in my history. My mastery of git is significantly less than that of any other VCS I've used. Less powerful VCS are easier to use, and that can be a feature.


> spending too much time I don't have

And yet here you are commenting on HN.

Well, I am older, actually. Maybe it's just part of what happens. I still miss my flip phone too. So much simpler....


> And yet here you are commenting on HN.

In other words: I was close to leaving my upvote and walking away, but decided to not leave you wondering, since you DID spend some effort and thought in your post.

> Well, I am older, actually. Maybe it's just part of what happens. I still miss my flip phone too. So much simpler....

It's actually what happens. I'm feeling the same way about various things as I'm getting older. I can't be arsed to figure out what Docker is, for example. Ain't nobody got time for that. Otoh, I do realize that's just me, and that it's probably a great thing that I hope the sysadmins I work with will pick up and make the most of.


FWIW, I've spent far more time thinking about svn than I ever spent thinking about git.

Specifically, porting changes between multiple branches in svn was a nightmare. E.g. if you have three different branches (two releases and a develop), and you need to make the same bugfix on all of them - extremely unpleasant. I ended up writing my own diff/patch management system to keep track of bug fix patches, so that I could reapply them at will on other branches.

Git instantly made sense to me. It incorporated how I already thought about repositories and diffs. The DAG structure made sense. Merging made sense; rebasing made sense; everything made sense, almost instantly.


My two cents is that I haven't found it difficult to merge between branches myself. I'll open up a diff view of the commit(s) I want to merge, and then merge them branch-to-branch and file-to-file using a two-way merge tool, using the diff as a guide.


> Disclosure up front, I don't really use git myself. I have tried it and found it to be too confusing. I liked svn

Oh dear, that's bad.

> and these days use hg.

That's better.

> I also tend to work on mostly solo and small projects.

Ah. You probably don't need git then.

I've used a number of systems (RCS, CVS, SCCS, svn, monotone, hg, git). Of them all, hg was the simplest to use. Git was the most powerful.

Everything else listed above is terrible for multiple developers.

But solo developers? "tar" is a reasonable system for small projects. By that standard, SVN is fine, too.


There is a difference between rewriting "published" history and rewriting your local repo. I rely heavily on the ability to rewrite history before pushing. I hate seeing people push a series of commits (in a single push, I mean) where the first two introduce a big mess and the subsequent ones are attempts to fix up the mess.


This. So much this. I hate looking through history and seeing crap like "lol forgot semicolon". Rebasing while you're still in your feature branch, before the code hits master, to make your commits succinct, readable and above all free of known broken code, is a must.


Why are you committing code you haven't even tried to build? In the scenario you're presenting, the problem is that somebody even needed a "lol forgot semicolon" commit in the first place. Stop doing that. We all make mistakes and this will come up sometimes, but if it's happening so often that you need to rewrite your VCS history to stop from annoying other people, something is wrong.


>Why are you committing code you haven't even tried to build?

Because a DVCS tool like Git makes commits much less costly than older tools such as CVS or SVN. The dynamics (both social & personal) for commits are different.

My guess is that you understand Git commands but you're using the SVN/CVS mental model of treating commits as "sacred" markers. If someone commits in those older centralized systems, they could potentially break the build and stop the team's productivity. This leads to strange social dynamics such as programmers "hoarding" their accumulated code changes for days/weeks and then later ending up in "merge hell".

Because Git "commits" have a private-then-public phase, the programmer does not have to be burdened with affecting others' productivity with their (sometimes spurious) commits. They can have twitchy trigger fingers with repeated "git commit". The git commits can be treated as a personal redundant backup of Ctrl+S (or vi ":w"). (Or as others stated, the git commits and private history become an extension of their text editors.) They don't have to hoard their code changes. Because of the different dynamics, they don't necessarily have an automated Continuous-Integration complete rebuild of the entire project triggered with every commit. To outsiders however, many of these commits are just "noise" and don't rise to the same semantic importance that we associated with CVS/SVN type of commits.

In this sense, "rebase rewriting private history" does not mean faking accounting numbers like "Enron fraud" and throwing auditors off track, but instead, it's more like "hit backspace key or Ctrl+Z and type the intended characters."

In CVS/SVN, the "commits" are a Really Big Deal.

However, in Git, the "commits" are Not a big deal and closer in concept to a redundant "Ctrl+S". It shifts the Really Big Deal action to the act of applying changes or merges (e.g. "patches" is how Linus Torvalds often describes it).


I wouldn't go so far as to say that they're sacred, but I do think you're right that a disagreement over their relative importance is probably at the core of this.

However, I think the stuff about breaking the build is way off. If one were really fearful of any commit breaking the build, wouldn't one embrace rewriting history? You'd try to avoid making a breaking commit in the first place, but if you're fearful of breaking builds, then once you did make such a mistake, the ability to go back and rewrite it would surely look pretty good.

One of the big advantages of git as I see it is that you don't have to be fearful about bad commits. You made a commit that broke the build? Well, try not to do that, but as long as you don't push it, it's not a big deal. Fix it (in a new commit!) and you'll push both of them together. History is preserved, nobody's build actually broke, everybody's happy.


>, but if you're fearful of breaking builds, then once you did make such a mistake, the ability to go back and rewrite it would surely look pretty good.

But I was trying to emphasize that Git's "mental model" eases the burden of breaking the build. If everyone buys into the concept that "git commits" are just another lightweight form of "Ctrl+S", we would expect programmers' private branches to sometimes have broken builds. That's the nature of real-world work such as refactoring or experimental changes. There's no social penalty or stigma for broken builds in private repos. Therefore, if a programmer rewrites history to hide broken builds, it's not because of ego or image-consciousness but out of consideration for others, who get to read a comprehensible story of the changes.

> You made a commit that broke the build? Well, try not to do that, but as long as you don't push it, it's not a big deal. Fix it (in a new commit!) and you'll push both of them together. History is preserved, nobody's build actually broke, everybody's happy.

Not everybody's happy. If we conceptually treat git commits as a 2nd form of "ctrl+s", we don't want to see both commits. Instead, clean up your private history, then craft/squash/edit your commits into a logical story, then make sure your public history has a clean build, and then apply those commits to the public branch. That's the way Linus Torvalds likes it for Linux patches and many agree with him. We do want some history to be preserved but not all of it.


When you say it's another form of ^S, how often are we talking here? I reflexively ^S every couple of words, are you literally talking about committing every couple of words? Every few lines? Less? What's the purpose committing more often than logical chunks of code which can be considered in some sense "done"?


This is somewhat different from the parent's view, but personally, I try to turn the list of commits in a given PR into a readable, reviewable "story" of the general steps that need to be taken to implement a feature or fix a bug. (This starts when first writing it, because splitting up changes after the fact is a nightmare.) However, I do not want to limit myself to finishing and polishing one step before proceeding to the next. For one thing, my intuition might turn out to be wrong and the overall approach I'm aiming for might not be a good idea at all, something which I might only figure out when trying to implement the final feature/fix on top of the foundations. Or it might be a good idea overall, but I might end up realizing later that, say, out of the code I was fixing up in a previous commit, a lot of it is going to be removed by a later step in the refactoring anyway, so I should probably merge those steps or otherwise shuffle up the order. For another, I will probably just end up making mistakes, some of which I'll notice myself and some of which may be noticed in code review; while the "story" is primarily for code review, it is also useful for bisecting, so even changes found in review are good to integrate into the story.

As a result, when working on the project I'm thinking of, I use git rebase -i constantly, as if each commit were a separate file and I were switching between them in my editor. However, I don't actually like that old versions of my code are being thrown away (aside from reflog); I'd prefer if Git had two axes of history, one 'logical' and one 'real' (even if that gives people who already don't like branchy histories nightmares). I hear that Mercurial has something like this called "changeset evolution", but I haven't tried it; wish someone would make a Git port.


>I reflexively ^S every couple of words...

Why not just decrease the autosave interval in your editor :)

>What's the purpose committing more often than logical chunks of code which can be considered in some sense "done"?

There are different degrees of "doneness". For example, (1) code that isn't finished but you don't want to lose it if the power goes out, (2) code that you're not sure if you're going to keep, but you'd like to be able to refer back to it even if you later decide to change it, (3) code that completely accomplishes its purpose in a logical and coherent manner.

I use "Ctrl-S" for (1), "git commit" to a local branch for (2), and "git rebase/git push" for (3). Maybe I'm just a sloppy programmer, but my workflow often involves writing some code, making certain changes, then realizing that what I really need is the previous version but with different changes. So for me, frequent commits on a local branch have replaced frequent saves under different filenames (foo.c, foo_old.c, foo_tried_it_this_way.c)


My ^S reflex is almost 30 years old. It costs nothing, and occasionally saves me, so I have no reason to fight it. Autosave is great, but every so often you'll hit a situation where it turns out that it's not firing (misconfiguration or something) and then you're doomed. Belt and suspenders is best.

As for the rest, that's interesting stuff to ponder.


Can't speak for the GP, but I often commit my changes every 3-4 minutes with messages like "Tweaked the padding." Then when my work is in a reasonable state to be viewed by someone else, I'll turn those 5-6 local commits into one coherent "feature commit" like "Redesigned the page header according to new brand guidelines."


Because people make mistakes? Not all test suites are 100% perfect, so bugs get missed. That's the point of a pull request and code review. I get that mistakes happen, but we don't need a record of your mistake and subsequent fix in master (unless it's already in master; then that history is sacred). Just squash the appropriate commits (--fixup is my favoritist thing ever) before merging to master and send the right code to public history the first time.


Can't reply to mikeash below, but I also have a comment. I've burnt myself a few times where I committed something and pushed to my remote repo, only to realise that I shouldn't have.

What I've taken from my errors is that I no longer push single commits until I'm at least done with what I'm doing (I use GitFlow btw).

It's easier for such things to happen in languages where you don't need to build your project (looking at JavaScript). Sometimes it's pushing a fix, only to realise that it introduces a regression somewhere. I know that testing takes care of most of this, but not everything can have tests written for it. I'm a single developer on my 'hobby' start-up, working on over 4 separate components; I unfortunately can't write tests for all of them at this stage.


Even in a language like JavaScript, you're at least running your new code before you commit, surely.

As for a fix which introduces a regression somewhere else, that seems like exactly the sort of history you'd want to capture in source control. "Fixed X." "The fix for X broke Y, which is now fixed." This is much more informative than a single "Fixed X." which only includes the final state. The fact that a fix for X could break Y is valuable information!


Yes, I run it, but if one of the possible 5'000 combinations that I go through when searching (https://rwt.to, essentially an A-B public transit planner; https://movinggauteng.co.za - the data behind the planner) breaks, it can at times be difficult to spot the error until a few days later.

I could write something that does searches for as many combinations as possible, but I'm at the point where the cost of the occasional error is lower than that of spending a day where I can't work on my code because tests are running (the data changes regularly). That day's often a weekend where I've got a small window of time to work on my hobby.

On your last point, I often end up being detailed in my commits, where I can fiddle with the history before pushing to remote, so I still end up capturing what happened in SC.

I'd really love a suggestion on how I could get around this, it would help me improve (I'm an accountant by profession, but do some SAS, R, Python, JS etc. as part of my ever-changing job).


I don't see the problem with making a change, breaking something that's not practical to immediately test, committing that change, noticing the breakage a few days later, and committing a fix. No need to rewrite history, just have "fixed Y which broke when I did X" later on.


In JavaScript you have to contend with a dozen different run environments. Maybe you realized your fix actually broke the feature in IE8 because you left a comma at the end of an array. It's quite common to have your fix break something in a very specific environment.


That's fine, but then I don't understand what the problem is with having that IE8 fix which you didn't make until sometime later being a separate commit.


Say I create a feature branch, this is what a day's work might look like.

    839a882 Fix bad code formatting [James Kyle]
    6583660 Updated plugin paths for publish env [James Kyle]
    847b8f3 First stab at a mobile friendly style. [James Kyle]
    a70d3f7 Added new articles, updated a couple. [James Kyle]
    b743ec3 format changes on article [James Kyle]
    68231e7 Some udpates, added an article [James Kyle]
    2a92c5e Added plugins to publish conf. [James Kyle]
    6dec1e1 Added share_post plugin support. [James Kyle]
    070bbd0 Added pep8, pylint, and nose w/ xunit article [James Kyle]
    eb8dbcc Corrected spelling mistake [James Kyle]
    0b89761 Minor article update [James Kyle]
    677f635 Added TLS Docker Remote API article [James Kyle]
    d8e94fd Fixed more bad code formatting in nose [James Kyle]
    f06dc2d Syntax error for code in nose. [James Kyle]
    606ac2b Removed stupid refactor for testing code. [James Kyle] 
This might be a very short one. If the work goes on for a couple of days, it could be dozens of commits like this.

In the end, it'd be a veritable puzzle to work out what I was trying to send upstream. Also, the merger has to trawl through multiple commits and history. It's plain annoying.

So you rebase and send them something like this:

    947d3e7 Implemented mobile friendly style. [James Kyle]
And if they want more, they can see the full log with a bullet list:

    947d3e7 Implemented mobile friendly style.

    - Added plugins x, y, 
    - Implemented nose tests to account for new feature


Rebasing is about taking a discombobulated, stream-of-thought workflow and condensing it into a single commit with an accurate, descriptive log entry.

Makes everyone's life easier.

edit

It's also very nice to take out frustration-generated commits like "fuck fuck fuck fuck fuck!!!" before pushing upstream to your company's public repository. ;)


Doesn't the merge commit for a branch like that serve the same purpose as your rebased commit, but without destroying the underlying history?


I agree with mikeash's comment. Isn't this also the whole point of the staging area? "Stage multiple fixes into a single coherent commit" is the underlying model. Instead, in the above example there are many granular commits, with rebasing used to then clump them into a logical grouping.

This would indicate to me that you are committing too often or not using the staging area properly.


Committing often allows you to remember the small changes you made throughout the feature. If you let a file sit in the staging area for hours, days, weeks, you will most likely have a hard time remembering why you made all the changes.

Is there a way to do this with the staging area? o.O


Merge commits, particularly those that merged master multiple times, effectively destroy history (by preserving it). For that matter, many projects maintain that all commits to master should work! Unless you advocate only committing entirely working states (unlikely for large features), you'd have to rebase.


Can you explain what you mean by effectively destroying history by preserving it? That doesn't make any sense to me. And I also don't understand the link between merge commits and a failure to ensure that all commits to master should work. If you make changes in a branch, get everything up and running there, then merge to master, does that not ensure that everything on master works?


I agree with getting rid of noise and adding signal (see my sibling post) but banging a whole feature into a single commit is going way too far IMHO. "Added share_post plugin support" sounds like something that should be in a permanent commit to me. "format changes on article" probably not so much (assuming you created that article in the same feature branch).


I share an svn codebase with a few dozen developers, and while we don't have rebase, the history remains readable. There's a few guidelines that enable this: (1) all work must happen in the context of a jira issue or story, (2) the commit message starts with the jira issue id and its description, and only after that any stream of consciousness remarks, and (3) syntax errors will cause the commit to bounce and failing tests from the CI build will get you frowned upon. The history will usually reveal a few commits for a feature, spread half a day or a day apart. We rely on the local history feature of phpstorm to be able to backtrack to an earlier version (that and good old copy-pasting of the working copy before you start an experiment)


Is there a handy macro/script type thing that simplifies squashing a release branch and using the commit messages as bullet points (with the ability to edit out crap)?


The built-in interactive rebase does just that.
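For instance (branch and base names hypothetical), marking everything after the first commit as `squash` gives you an editor pre-filled with every commit message, ready to be pruned into bullets:

    $ git checkout release-1.2
    $ git rebase -i master
    # in the todo list, keep the first 'pick' and change the rest to 'squash';
    # git then opens the combined commit message ("This is a combination of
    # N commits...") for you to edit down

Alternatively, `git merge --squash release-1.2` followed by a plain `git commit` pre-fills a "Squashed commit of the following:" list to the same effect.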


I tend to agree. One exception I think is rebase on a feature branch. If you rebase a feature branch onto master before merging it into master, I think you can get a cleaner history while achieving the linear history the OP wants -- and in this isolated case, I think you aren't losing any useful context by making it seem the feature commits were all done right before merge into master.

Maybe. I'm not actually sure, to be honest what's a good idea with git history, this included. Feedback welcome.


People who love rebasing and linear history tend to see feature branches, even if pushed to a public repository, as private to their creator and maintainer and fair game for any sort of rebase. In fact, we do consider rebasing of feature branches mandatory.


[deleted]


I've always thought there had to be a way to solve the shared feature branch + rebasing problem! I'll have to try this out!


The only thing is, while it is easy from the downstream side, it's a little more tricky to prepare the new branch.

One thing you can do is actually do the regular rewriting rebase, install the result under the new name, and then throw the rewrite away.

Rebase our-topic.0 to its default upstream, thereby locally rewriting it:

   $ git rebase
(Precondition: no local commits in our-topic.0: it is synchronized with origin/our-topic.0, so they point to the same commit.)

Now, assign the resulting commit to the newly defined variable our-topic.1:

   $ git branch -t our-topic.1
Now, throw away the local modification to our-topic.0. We don't have to appeal to the reflog, because our-topic.0 is following a remote branch which we can reference:

   $ git reset --hard origin/our-topic.0
(Remember the precondition before all this: origin/our-topic.0 and our-topic.0 pointed to the same commit. Now they do again!)

Finally, push our-topic.1 for others to see:

   $ git push origin our-topic.1


Seems like you could simplify this quite a bit by just creating our-topic.1 before rebasing. Given our-topic.0 == origin/our-topic.0, and that we are currently at our-topic.0:

    $ git checkout -b our-topic.1

    $ git rebase (-i) master

    $ git push origin our-topic.1
No need to modify and reset our-topic.0.


Yes, thank you for explaining this so well. I commented on that elsewhere, but didn't do it nearly as well as you did.


The counter argument though is when your feature branch doesn't only have _one_ creator/maintainer. Mine often don't, especially on open source projects, two or three people can be working collaboratively, or others that aren't the lead on the feature can come in to make a helpful commit here or there.

And when one person rebases the feature branch it wreaks havoc for collaborators on the feature branch.

Which is why I limit my "rebasing is okay" on a feature branch to only _right before_ it's merged into master and then deleted. It still doesn't get rid of all the problems, but it gets rid of most of them.


If you have a handful of people, you simply communicate with them, check that there's a good reason to rebase and that you're not creating unnecessary burden and do it when everyone is happy.

When you have more than a handful of people, then your feature branch is not a feature branch, but a project, which should have feature branches of its own.

Scale, dynamic adaption to it and situational awareness are a requirement in team work. :)


Bingo! Recently I've been working on resolving a bug with a small group of coworkers. We created a repo in which we have been rewriting public branches all the time. You just send an e-mail. Everyone just has to know how to migrate their local changes.

   $ git fetch # the rewritten world
   $ git checkout
   Your branch and 'origin/foobar' have diverged,
   and have 13 and 17 different commits each, respectively.
   (use "git pull" to merge the remote branch into yours)
Now I happen to know that only 3 out of the 13 divergent commits on foobar are my local commits. I rebase my local foobar branch to the upstream one, migrating just those 3, and ditching the remaining ten:

   $ git rebase --onto origin/foobar HEAD~3
Easy.

This is all just test code people are trying while investigating the bug. Any permanent fixes arising are properly cleaned up, commented, and submitted via Gerrit to a whole other repo, where they are cleanly cherry-picked to a linear trunk which is never rewritten.


> If you have a handful of people, you simply communicate with them, check that there's a good reason to rebase and that you're not creating unnecessary burden and do it when everyone is happy.

Why is that communications overhead worth it, vs simply not rebasing? (or at least not until right before you merge into master and delete the feature branch)

I think the communications overhead can be significant, especially if the handful (even just 2 or 3) collaborators are at different locations, organizations, timezones, etc.

I'd rather just not have to think about it, not have to deal with that communications overhead, and not rebase. What do you get from interim rebasing anyway, especially if you are still willing to do a final rebase before merge?


> Why is that communications overhead worth it

Because it pays dividends in the long run.

If a project looks like this: https://dl.dropboxusercontent.com/u/10190786/Screenshot%2020...

Then hunting bugs becomes very difficult, because you can't simply see a behavior change from one commit; it might change in any merge, because the behavior of one commit in one branch disagrees with the behavior of another commit in a different branch. Even worse, the behavior might be created by a badly-done conflict resolution in the merge commit, which is REALLY hard to see.

I envy you if you have not yet experienced this pain, but I assure you it is a real problem, and that you're merely exchanging effort now for effort later.


If it is pushed, other people can cherry-pick, CI generates results, and other people might push commits on the same branch (when using the same repo). We think pushing indicates you want to work in public. We even made a Work In Progress (WIP) function for it in GitLab: http://doc.gitlab.com/ce/workflow/wip_merge_requests.html


It's a social question. To be honest, for the vast majority of people I work with (and I've worked with a few: https://github.com/wchristian?tab=repositories ), it would be considered very strange to see feature branches as public and stone-written history. There's a common understanding that feature branches are feature branches because the creator wishes to avoid writing things in stone.

That said, if a team decides on that kind of convention, marking things as WIP is a neat feature. I've seen other people do that simply by creating feature branches as "<username>/feature_branch".


> The obsession of git users...

That seems overly broad. It seems to me that most people who use git agree that public history shouldn't be rewritten, especially on master.

> The whole point of history is to have a record of what happened.

On the other hand, a bunch of "Derp" or "Whoops" type commits aren't very useful. It's definitely beneficial to clean that sort of stuff up by rewriting local history before pushing.


I'm talking about both public and private history.

It's far more beneficial to just not make commits like "Derp" or "Whoops" in the first place. Think about your commits and your commit messages as you make them. No, you won't get it right all the time. And that's OK; nobody is perfect, and your history can reflect that you're not perfect. But if you're editing your commit history to fix idiotic commit messages, you're doing it all wrong.


One of the things I like about git is that I can make bad commits and fix them later. If I'm working on one feature and I'm interrupted by another task, I can commit "wip - blah", then check out a different branch and work on that. When I go back, I pick up exactly where I left off, and amend the half-finished commit into something that actually makes sense before pushing it out to the rest of the team.

In the past, I never made those sorts of commits, because I used VCSs in which you couldn't. Instead, I avoided committing by checking out a separate workspace for the new work. That's a lot slower, though, and it's easier to lose uncommitted changes. Committing incomplete, broken work allows you to leverage your VCS to manage even your unfinished code.
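As a sketch, that interruption pattern looks like this (branch names and messages hypothetical):

    $ git commit -am "wip - blah"   # capture the half-done state
    $ git checkout other-task       # go deal with the interruption
    $ git checkout my-feature       # later: resume exactly where you left off
    # ...finish the half-done work, then fold it into the WIP commit:
    $ git commit -a --amend -m "Refactor widget construction into a factory"

`--amend` replaces the WIP commit with the finished one, so nothing half-baked is ever pushed.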


"git stash" was created for this purpose


git stash is handy. I tend to forget about stashed code, though, so I only stash stuff if I know I will pop it soon.


The "Derp" commits are the ones made immediately after a mistake. No one is editing their commit history just to fix these idiotic messages, they're doing it to fix the idiotic mistake that precedes the idiotic message. Yes, no one is perfect, but leaving the previous (wrong) commit alone has real consequences: it prevents people from easily cherry-picking the commit, using bisect to debug later, doing an accurate code review, etc.

edit: I realize that there are probably a few exceptions to my "no one" IRL. But I don't think anyone here would defend that practice.


Well then, try to structure things so you're catching those idiotic mistakes before you commit them.

The consequences you describe seem extremely mild. Cherry-picking now requires two cherry-picks instead of one, big deal. Git bisect has a "skip" command that solves this nicely. And I don't see how code review is at all impacted by this, unless you're using terrible tools that won't show a full summary of changes all at once.


Or... rewrite history. :) All of these problems are trivial when the commits are right next to each other; they're less trivial when they're separated by other unrelated commits.

"Don't make idiotic mistakes" isn't really advice that anyone can follow.


I think it depends on what kind of idiotic mistakes we're talking about. Stuff like forgetting a semicolon is completely avoidable with a disciplined approach of, "Always build before you commit." Other kinds aren't so nicely avoidable, but then I think the record should show them anyway.


Fair enough. I agree that rewriting history shouldn't replace other good practices. But I don't really see the benefit of having an exact historical record of all the mistakes made when coding. What does it get you?


It's not so much what the true historical record gets you, but what you potentially lose with the fake one. Do the commits in your edited history actually work? If you go back to see why a given change was made, will you get an accurate picture of the state of the rest of the code at that time?


But one of the points of editing the history is to make sure that both of those are more likely to be yes -- it's to make the history easier to review historically than it would be unedited. (IMO, obviously).

And this is probably partly why there's what you originally called an "obsession" with rewriting history: retroactively rewriting something 5 months old is probably going to be a disaster. But rewriting 1 day's worth of commits to better express why a given change was made, to give a more accurate picture of the state of the rest of the code, and to make things in general easier for people to read in the future is pretty trivial. So why not do it?


Are you building and testing your edited commits as you make them? If so, that seems fair, but a lot of work. If not, I don't see how it increases the chances of good results.


All of the extra building and testing can be automated, so the extra work just becomes a matter of reorganizing the work to make logically cohesive commits and it's more work in the same sense that writing good comments is more work. Whether building and testing each commit is done as often as it should be....

I would bet that many people eyeball it, build and test the end result, and claim that that's good enough. Since many of these people probably edited out a commit with a message like "lol typo broke the build" that might be an overly optimistic attitude ;)

In any case, I don't see how it decreases the chances of good results. You already dismissed my suggestion that it's nice to have each commit build and pass tests, so it's a bit strange to start worrying about it now.


When did I dismiss the idea that all commits should build and pass tests? That's certainly what I aim for.


To be fair, if it's truly a "derp" commit you're better off using "git commit --amend --no-edit", which is technically rewriting history, but in one stroke :)


I usually use rewriting history to package the code I've worked on into logical commits that should be stable at each point when applied in order. That way, it's very easy to reverse a commit or cherry pick specific functionality into the production branch early (ie the main branch isn't the stable one).

Would I like to get away from that and do it from the get-go? Oh yes, it'd be great. But I'm not there yet and so re-writing history is nice. And doing so forces me to think about the code I've written and where the boundaries of the changes I've made are. Granted, I haven't done it on very long lived feature branches (or big ones) - that may be where most of the penalties are manifest.


> It always feels to me like people just being image-conscious.

Every author is "image-conscious" because they want to present their thoughts clearly to the world. That's where your rather substantial misconceptions about the application and utility of rebasing come from. This isn't about rewriting published history, which is rightly and nearly universally considered A Bad Idea(tm) in the git world. The recommendations around rebasing are essentially identical to authors editing their text before publication. Note "before". Before {an article, some code} is published, edit, rewrite, cleanup all you want. After it's published, an explicit annotation is the best practice. For an author, perhaps an "Updated" note in an article or a printing number in a book. For a developer, add a new commit recording the change.

For my part, I use rebasing extensively and lightly before I publish code. By "extensively" I mean, I just don't hesitate to edit for clarity. This is the same as I'd do in authoring a post or email. By "lightly", I mean that I don't waste time doing radical history surgery but I regularly do things like squash a commit into an earlier logical parent commit. E.g. I started a refactor, then a little while later found some more instances of the same change. Often, this is just amending the HEAD commit, but occasionally I need to go back a short ways on my working branch.
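
git supports exactly that squash-into-a-logical-parent pattern directly (a sketch, assuming 1a2b3c4 is the original refactor commit):

  git commit --fixup 1a2b3c4            # commit the stragglers, marked as a fixup
  git rebase -i --autosquash 1a2b3c4^   # the todo list comes pre-arranged to squash them in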

This also fluidly extends to use of git's index and the stash for separating out logical commits from what's in the working copy. A typical example:

1. git add <files for a logical change>

2. git stash -k # set aside everything that isn't staged, leaving the index intact

3. # run tests

4. git commit

5. git stash pop

Once you're used to the above workflow, an understanding of git's commit amending and rebasing tools extends this authoring capability into recent history. This is wonderful because it takes pressure off of committing, meaning that git history becomes a powerful first-class, editable history/undo stack.


Remember, Git was born in Linux, and in Linux a commit is a political statement. Your need to be succinct (your commit must stand alongside 2,000 commits per day) and to emphasize the "obvious" brilliance of what you're doing over the noise overrides the need to record every thought process along the way.

In most organizations, we have nowhere near that number of participants, and we don't want charismatic developers; we want something that works right now, in the confidence that changing it later is not merely a possible outcome but very, very likely.


I totally agree with you; I don't get it either. My only explanation is that as a programmer you are trained to write clean and understandable code. I also try to apply this to my commit messages (with varying results). But rewriting history to make everything look clean and simple is the wrong thing to do. The messier your history is, the more likely you are to need to retrace your steps (and CI results) at some point. It is mostly people coming from SVN, who only ran CI on the trunk branch, that favor the rewriting approach. It might be hard to let go.


"But rewriting history to make everything look clean and simple is the wrong this to do."

Absolute statements are always wrong.

There are plenty of great reasons to rewrite your local history, many of which have been explored in other comments on this thread. Moreover, Linus disagrees with you -- the rebase flow is the one used by the author of git himself.

My guideline is that commits that don't have semantic meaning for your team should be avoided in shared history. It's therefore perfectly okay (desirable, even) to make a "wip" commit in your local repository, but that commit is semantically meaningless and shouldn't make it into shared history. Rebase!

Merge commits, likewise, should be avoided unless they carry semantic meaning. It's semantically meaningless to note that you've merged master back into your working branch N times since the initial branch point -- that's a development detail that is irrelevant to your peers. It's semantically meaningful to note that you've merged a feature branch into master. Your peers care about that.
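
In practice that means keeping the working branch up to date with rebase, and recording the one meaningful merge with --no-ff (branch names made up):

  git checkout feature-x
  git rebase master             # no noise merges on the branch
  git checkout master
  git merge --no-ff feature-x   # a single merge commit saying "feature-x landed"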

You couldn't rewrite history at all with SVN, so it's kind of goofy to suggest that this is legacy behavior. If anything, SVN had the problem that every pointless brain-fart commit by anyone, ever, had to be preserved in the history. This made the history useless.


> It is mostly people coming from SVN and only running CI on the trunk branch that favor the rewriting approach. It might be hard to let go.

You're being ridiculously prejudiced and jumping to conclusions AND throwing out judgements on things that by your own admission ("I don't get it either.") you do not understand.

Please realize that the correct response in such a case is not to double down, but to engage in a dialogue, so you can come to understand which factors, unknown to you or to them, create the difference in stance. (And no, you can't expect the other side to initiate the dialogue. The way you are talking, you present the image of someone singularly uninterested in dialogue, even if that may be unintentional.)


Thanks for the comment. I agree I sounded prejudiced, and I'm sorry for that. I'm open to dialogue about this, but I agree that my tone isn't helping. Anyway, I love thinking about this topic and I'm open to new insights.


Cheers. I had gotten that impression from your other comments. I'm glad to see it affirmed. :)


> The obsession of git users with rewriting history has always puzzled me.

Editing draft commits is fine. Editing public commits is less fine. The problem is that git has no way to distinguish draft and public commits except by social convention.

Mercurial Evolve actually enforces the separation between draft and public commits, and can also allow setting up servers where people can collaboratively edit draft commits.
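
A rough sketch of what the phase machinery looks like (the first two commands are stock Mercurial; "hg amend" comes from Evolve):

  hg log -r 'draft()'   # commits that are still safe to rewrite
  hg phase -p -r .      # mark the current commit public, i.e. immutable
  hg amend              # rewrite a draft commit; refuses to touch public ones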

My talk about it:

https://www.youtube.com/watch?v=4OlDm3akbqg


Everything about git is about managing a useful history. Otherwise it would be a history of every keystroke, or at least of every file write. Instead, you write some code until you feel you have enough to make a useful commit (you will have to come up with your own idea of what a useful commit is), commit all those changes together as a single commit (thereby losing history), and come up with a useful description of all those changes. Managing an already-created commit is just a further extension of this idea. You can use what you learned from your experience of coding, testing, and committing to change your commit history to be even more useful. Of course, things can go wrong if you are changing the history of a branch that others have cloned or branched off of.


I can make 20 or 30 commits during some code changes in a morning's worth of coding. This allows me to easily trace back to any point, or cross-reference changes across many local branches, etc.

At the end, it might all be squashed down into a single bug-fix commit for the devel branch.
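
That final squash doesn't even need history rewriting (a sketch; branch names made up):

  git checkout devel
  git merge --squash morning-hacking   # stage the combined diff, with no commit yet
  git commit -m 'Fix frob overflow'    # one clean bug-fix commit on devel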

The commit granularity that's desirable and effective for an individual is very different to the history you want in the main feature branches.


> The obsession of git users with rewriting history has always puzzled me. I like that the feature exists, because it is very occasionally useful, but it's one of those things you should almost never use.

I disagree, and it's actually impossible not to use it. Rebase rewrites history. If you have a long-running feature branch you need to merge back into master, you have to rebase it against the current master. There's really no other choice.

> The whole point of history is to have a record of what happened.

Define "what happened" in this context...are we talking about what the feature's changes end up looking like, or the entire linear history of the work on this feature starting from the point at which the programmer experimented with a bunch of dead-ends before finding the right path?

Personally, I feel like an extremely detailed history of my personal problem-solving adventure on every complex ticket is irrelevant. At the end of the day, the code reviewer just wants to know what changed. When I review code, I prefer to look at a massive diff of everything that's been done, not read commit-by-commit. I'd rather see exactly what I'm going to pull in when I merge it into master.

I would also disagree that the whole point of source control is to maintain a history of what happened, and argue instead that the point of source control is communicating changes between the developers on a team. The fact that it backs up your code and keeps a history of what changed are merely secondary features next to that central value. I think Git is the best version control system for this, precisely because it allows you to rewrite history. That said, rewriting history is dangerous, and if you use it incorrectly (say, by rewriting history on a branch other people have to pull from), you're going to cause your team real pain.

> If you're going around and changing it, then you no longer have a record of what happened, but a record of what you kind of wish had actually happened.

If you're using Git, this is a complete falsehood if you are the person who made the commits. The reflog provides a reference to every single change made to your repository, so you can just reset back to the point before you rebased and voila, like magic everything is back to the way it was. This isn't a "hack", that's what reflog is for. It's a giant undo list for your local clone of the repo.

So in essence, history is never destroyed. It's just hidden from view. You can always go back in Git unless you actually `rm -rf .git/`.
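
For example (the @{N} offset depends on your own history):

  git reflog                 # lists every state HEAD has pointed to
  git reset --hard HEAD@{5}  # jump the branch back to its pre-rebase state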

> Some programmers really want to come across as careful, conscientious, thoughtful programmers, but can't actually accomplish it, so instead they do the usual mess, try to clean it up, then go back and make it look like the code was always clean.

You might be correct in some cases, but I think most of the time you are confusing explicitness with vanity. Programmers want other people on their team to know what they did, or at least the intention of their code, and having commit messages that "tell a story" and make sense is vital for doing that.


> If you have a long-running feature branch you need to merge back into master, you have to rebase it against the current master. There's really no other choice.

Yes, there is: merge master into your branch. Rebasing long-running branches is a nightmare, because every diff you replay will probably result in a conflict, and if you have hundreds of commits, you could be there for days rebasing. Merges, even massive merges, generally don't take more than a few hours. It's all project-size dependent, of course, but the ratio of work is about right: 5-10x more work for a rebase than for a merge.
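
In other words (branch name made up):

  git checkout long-running-feature
  git merge master   # resolve all the accumulated conflicts once, in one merge commit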


Whilst this is occasionally useful, it's best avoided in my opinion as it's incredibly difficult to review a merge commit (especially a large merge commit).

(Most of the time, if you have to do larger project branches, I'd advocate either merging work in piecemeal ASAP, e.g. hidden behind a feature flag, or keeping the work in new files so that merge conflicts are kept to a minimum.)


I don't understand this bit about having no other choice but to rebase a branch against master. When I have a long-running branch that no longer cleanly merges into master, I merge master into the branch first. I've never seen a case where rebase is required.

I find your conclusion to be most confusing. You say that programmers want other people on their team to know what they did, and then you say that they accomplish this by constructing a fake story about stuff that never happened. Sounds like you mean that programmers want other people on their team to know what they wish they had done. Which is understandable, but not at all the same.



