It's not necessary, it's a waste of development time and easily the biggest big-picture collective failure of our software engineering profession of the last decade. It easily beats out anything from the $foo.js world and the ongoing low-level security nightmare of web application development, because git, and more importantly, unnecessarily complex and error-prone git workflows have seen adoption across all kinds of software. There are probably a dozen projects in the world (the kernel admittedly being one of them) that are justifiably a good fit for the complex git-native workflows that have become standard practice across the industry today.
But if you're doing it the old-fashioned way, you might as well use Subversion. Or mercurial, if you want all the local history, with the added bonus that unlike git it sensibly keeps the least-surprise semantics of 'commit', 'revert', and other commands that merely have a 30-year history of expectations that held true prior to git. But Mercurial was not authored by Linus, nor does it have the impenetrable, otherworldly data model that a first-time version-control-system author would unavoidably end up concocting in scratching their itch without consulting the existing, completely satisfactory, solutions that served us well for decades, which greatly reduces the number of interesting topics you can blog about for Mercurial.
And so, git sees the adoption, github gets the $2bil valuation, and even bitbucket ends up switching to git as its default. It won.
Software engineering, collectively, has a lot of maturing to do.
I have been using git for years and I still find its interface completely inscrutable. I have given up trying to learn it in a way that makes it make sense, and simply use a handful of everyday commands I've memorized by rote and look everything else up when I need it. I can't think of any other piece of software I actually use which has such a messy, non-predictable interface. Even 'make' eventually succumbed to rational analysis when I finally managed to suppress my nausea long enough to dig in and learn it as though it were a real language.
> I have given up trying to learn it in a way that makes it make sense, and simply use a handful of everyday commands I've memorized by rote and look everything else up when I need it.
? Here are the commands that I have used and make sense to me:
branch, tag, log, diff, push, pull, fetch, commit, rebase (with or without -i), reset, add, rm, mv, stash, status, remote, bisect, reflog, blame, and fsck.
Is this set of commands more or less the same as the set of commands that you've learned by rote memorization?
Sometimes: commit --amend, rebase -i, add, rm, mv, stash [pop|apply].
Rarely: branch, revert.
Git's documentation uses such a wide and flagrantly inconsistent variety of terminology and maintains such a poor distinction between its interface and its implementation that trying to read it actually worsens my understanding and reduces my confidence. I get everything useful from stackoverflow and ignore the docs at this point, and have thus resigned myself to using git as a form of voodoo.
Subversion was so much clearer; I wish it had done a better job with merges and hadn't been so server-dependent. Mercurial seemed to actually care about its interface design, and the DVCS experience might suck less if it had won, but I've never had a chance to actually use it.
I agree that some of the commands have a very large array of options. I see many of these options as very specialized tools (and certainly don't claim to know all (or -in some cases- most) of them). I, too, find the "git config" command to be largely useless. Its value is in scripts or in Git frontends. Also, the git-config manpage has a complete listing of all valid git config options. So, there's that. :)
> Commit, checkout, and reset seem rather more complicated than necessary and don't do anything useful in their default forms...
When called with no args, commit records changes staged with add/rm/mv in the local repo's history. Checkout changes the tracked contents of the working copy to that of another point in the repo's history (so it is meaningless to call it without an argument), and git reset is destructive (and has no --force option), so it makes sense to require an argument. [0]
In regards to commit and checkout:
I came to git by way of Subversion. These two confused me for quite a while. What helped me to understand the logic behind them was to realize that -unlike SVN- git has
* The working copy, which is manipulated with a bunch of commands
* The area where changes that will be included in the next commit live, which is cleared out after every successful commit, and is manipulated by add, rm, and others
* Your local repo, which is manipulated with commit and checkout
* One or more remote repos, which are manipulated with push, pull, and merge
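
If it helps, here is a rough sketch of a change moving through the first three of those places. It assumes a throwaway repo in a temp directory; the file name and the demo identity are made up. "git diff" compares the working copy to the staging area, and "git diff --cached" compares the staging area to the last commit.

```shell
# Throwaway repo; file name and identity are made up for the demo.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "one" > notes.txt
git add notes.txt                          # working copy -> staging area
git commit -q -m "first"                   # staging area -> local repo

echo "two" >> notes.txt                    # change lives only in the working copy
unstaged=$(git diff --name-only)           # working copy vs. staging area
git add notes.txt                          # stage it
staged=$(git diff --cached --name-only)    # staging area vs. last commit
git commit -q -m "second"
after=$(git diff --cached --name-only)     # nothing left staged after the commit
echo "unstaged=$unstaged staged=$staged after=[$after]"
```

The last check is the "cleared out" part: after a successful commit there is nothing staged that differs from the new commit.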
But maybe you already had this solidly in mind, and this explanation was a waste of your time. :(
> As always, trying to read the documentation leaves me with less understanding than I had before I started.
Have you familiarized yourself with a significant fraction of git's vocabulary? The man pages became much clearer once I did so. [1]
> ...for reasons I cannot comprehend they also get involved in merge resolution, where they perform tasks with no visible relationship to their names or their normal jobs.
Oh. That's because a merge operation adds a series of commits from one or more branches into another branch and effectively makes a new commit with the result of the operation. If conflicts can be automatically resolved, then they are. If they cannot, then it's up to you to stage the changes you want to see in the merge commit (using add and friends) just like you would do when preparing any other commit. Does that make sense?
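
A minimal sketch of that, assuming a scratch repo where two branches change the same line (the branch names, file name, and contents are all invented for the example). The point is only that resolving the conflict is "edit, add, commit", same as any other commit.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git symbolic-ref HEAD refs/heads/main      # pin the branch name regardless of git defaults

echo "base" > f.txt
git add f.txt && git commit -q -m "base"

git checkout -q -b topic
echo "topic version" > f.txt
git commit -q -am "topic change"

git checkout -q main
echo "main version" > f.txt
git commit -q -am "main change"

# Both branches changed the same line, so git cannot resolve this itself.
git merge topic >/dev/null 2>&1 || echo "conflict, as expected"

# f.txt now holds conflict markers; decide what the merged file should be...
echo "merged version" > f.txt
git add f.txt                              # ...stage the resolution like any other change
git commit -q -m "merge topic"             # concludes the merge

# A merge commit has more than one parent.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents=$parents"
```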
> I have no idea what reflog would do; is there a flog command too?
Nah. It's a command for examining and manipulating the reflog, which is -effectively- where git makes a record of every change that happened to your repo. You pretty much never need to use the command, but I have used it to see just how git handled a set of complicated squash and commit reorder operations. From the man page:
Reference logs, or "reflogs", record when the tips of branches and
other references were updated in the local repository. Reflogs are
useful in various Git commands, to specify the old value of a
reference. For example, HEAD@{2} means "where HEAD used to be two moves
ago", master@{one.week.ago} means "where master used to point to one
week ago in this local repository", and so on. See gitrevisions(7) for
more details.
If you've gone on a mad history rewriting spree and have confused yourself (or simply accidentally moved a branch a while back and can't remember where it used to point), you can use reflog to trawl through the change history to save yourself.
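
Something like the following is what I have in mind, as a sketch in a scratch repo (file name and commit messages are made up). It throws away a commit with "reset --hard" and then uses the HEAD@{1} syntax from the man page excerpt to get it back.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo a > f.txt && git add f.txt && git commit -q -m "A"
echo b >> f.txt && git commit -q -am "B"
lost=$(git rev-parse HEAD)                 # remember commit B for comparison only

git reset -q --hard HEAD~1                 # oops: the branch no longer points at B
git --no-pager reflog                      # lists where HEAD has been, newest first
git reset -q --hard "HEAD@{1}"             # back to where HEAD was one move ago
recovered=$(git rev-parse HEAD)
echo "recovered B: $recovered"
```

In real life you would read the reflog output and pick the entry you want rather than assuming it is HEAD@{1}.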
> I am generally more inclined to use stash or a second working directory than to deal with branches, since it's less busy-work.
I'm curious. What busy-work do you have to do? I typically just have to do: "git branch whatever; git checkout another-branch; git branch -D some-other-branch".
[0] Though -conceptually- a substantial portion of reset's functionality overlaps with checkout's functionality. So, that's silly and nonsensical.
[1] Not that I'm implying that such a thing is required to use git, mind.
Thanks for your thoughtful dig into these commands.
Perhaps my biggest point of confusion with git comes from that nebulous intermediate structure which sits between the real working directory and the real repository, which sort of acts like a repository and sort of acts like a working directory. It has many names and no clear purpose, and it doesn't fit into my mental model of the work to be done when working with a VCS.
Your explanation of merges makes more sense from that context. I don't think of add/mv/rm as operations on the nebulous repository, because I don't have any idea why one operates on the nebulous repository; what I'm trying to do is tell git to track a file, or stop tracking a file, or notice that I've moved a file from one place to another. The fact that these operations also kind of half-commit changes to this semi-repository is just... confusing, because I don't know why one would care.
If the pseudo-semi-repository thingy actually made sense, perhaps it would seem more natural that add/mv/rm do things to it during merges. I suspect the behavior of checkout, reset, and commit might also make more sense; as is, they seem to be needlessly complex, because I am never manipulating the semi-repository on purpose: I'm either trying to move my changes from the working directory into the local repository, or I'm trying to update my working directory to match the local repository, but in no case am I ever trying to half-update the intermediate state I can't actually see.
Given this somewhat confused explanation and the fact that you've done a great job of explaining what git is doing so far, can you point me at something not written by the git authors that explains what the hell is going on here and why? I would like to understand the tools I'm using instead of just blindly typing arcane rituals cribbed from the internet, but as I said before trying to read the git documentation just leaves me more confused than before I started.
Is the semi-repository thing you are talking about the index?
If so, I would describe the index as a sort of staging area for preparing your commit.
You might not necessarily want to include every single change in your workspace in your next commit.
The index allows you to pick which things you want to go into the commit then git-commit creates the commit from what's in the index.
If you don't care for such behaviour, and you just want to commit all changes in tracked files in your workspace, git-commit -a does that.
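
To make that concrete, here is a small sketch, assuming a scratch repo with two made-up files. One change is staged and committed on its own, and the leftover change is then swept up with "commit -a".

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo a > a.txt; echo b > b.txt
git add a.txt b.txt && git commit -q -m "start"

echo "fix" >> a.txt                        # two unrelated edits in the workspace
echo "experiment" >> b.txt

git add a.txt                              # pick only the fix for the next commit
git commit -q -m "fix a"
first=$(git diff-tree --no-commit-id --name-only -r HEAD)
left=$(git diff --name-only)               # b.txt is still modified, uncommitted

git commit -q -a -m "experiment in b"      # -a: commit every change to tracked files
second=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "first=$first left=$left second=$second"
```

Note that "commit -a" only picks up files git already tracks; brand-new files still need an explicit add.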
This[0] is one of the best tools I've seen for understanding git commands and even a bit of how git works.
It's interactive, divides things into the different 'places' that content can be in git and then shows you how each command moves content between those places. Click on the workspace and it will show each command which does things to content in your workspace, the bar the command is written on shows what the other area it interacts with is and the direction that the command moves content. I hope it helps.
[0] http://ndpsoftware.com/git-cheatsheet.html
paddyoloughlin's explanation of the index is a good explanation.
I might add that if you didn't have the index, then you could not make a commit that contained an add, a rename, and a deletion. If you think far too deeply about how Subversion handles these operations, it becomes clear that -conceptually- Subversion had an index/"staging area", too. Ferinstance, the output of 'svn help add' says
add: Put files and directories under version control, scheduling
them for addition to repository. They will be added in next commit.
usage: add PATH...
What's git's index but a list of changes that have been enqueued to be performed with the next commit?
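
As a rough illustration (scratch repo, invented file names), this queues up an add, a rename, and a deletion, and then records all three in a single commit:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo keep > old-name.txt; echo gone > obsolete.txt
git add old-name.txt obsolete.txt && git commit -q -m "start"

echo new > added.txt
git add added.txt                          # enqueue an addition
git mv old-name.txt new-name.txt           # enqueue a rename
git rm -q obsolete.txt                     # enqueue a deletion
git status --short                         # shows what is queued for the next commit

git commit -q -m "add, rename, delete together"
# -M asks diff-tree to report the rename as a rename rather than delete + add.
changes=$(git diff-tree --no-commit-id --name-status -r -M HEAD)
echo "$changes"
```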
The big conceptual distinction between SVN and git is that with SVN you tie exactly one repository to a given working copy. In git you can tie multiple repos to a given working copy and (by default) one of those repos is stored in the same place as the working copy.
Does that make sense, or is the index still somewhat-to-rather unclear and/or mystifying? (I mean, other than its kinda crappy name.)
> ...can you point me at something not written by the git authors that explains what the hell is going on here and why?
If "here" is "with git in general", I read the Git Book [0] ages ago, and combined what I learned from it with a fair amount of fucking around with my repo, and also with the contents of the gittutorial(7), gittutorial-2(7) and (parts of) gitglossary(7) man pages. [1]
From looking at the ToC of the Git Book, it looks like chapters 1, 2, 3, and 7 would be relevant to your interest. Chapters 5 and 10 might be relevant. I can't offer any guarantees, as I last read the Git Book ages ago, and this looks like it's a new version... the one I read didn't make any mention of Github.
Though, if you were asking about something more specific, I'm happy to take a stab at answering that question once I know what it is. :)
[1] Even though you asked for things not written by the git guys, I got a fair bit of value from the official git tutorials. It's also possible that you've overlooked them, so I bring them up.
Very sorry to have rewritten my comment out from under you - I took a look at it, decided it was needlessly verbose, rewrote it, and promptly dropped into a subway tunnel... So my edit actually went through somewhat later. Now I wish I hadn't bothered!
I'm often overly verbose, but am typically too lazy to write shorter comments. I considered re-working my comment to address your edited comment, but I think that it covers both comments.