
I've run into problems with large-ish files in git repos (binaries accidentally committed, etc.). I'm genuinely curious: is there a good way to use git for repositories of the size you mention?


Git is great for smaller binaries. Ideal, in fact, given that it stores differences between revisions as binary deltas in its packfiles. For large (>1GB) files I believe the diffing algorithm becomes the limiting factor (I would be interested in getting confirmation of that, though). For those files something like git-annex is useful (http://git-annex.branchable.com/)
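
If it helps, a minimal git-annex workflow looks roughly like this (the file name and the "origin" remote are just placeholders): annexed files are committed as small pointers, while the actual content lives under .git/annex and is copied between repositories on demand.

    git annex init "work laptop"                     # enable git-annex in an existing repo
    git annex add big-dataset.tar.gz                 # store content in .git/annex, commit a pointer
    git commit -m "add dataset via git-annex"
    git annex copy big-dataset.tar.gz --to origin    # send the content to an annex-aware remote
    git annex get big-dataset.tar.gz                 # fetch the content in another clone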

I've used git to push around a lot of binary application packages and it's very nice. Previously I was copying around 250-300MB of binaries for every deployment--after switching to a git workflow (via Elita) the binary changesets were typically around 12MB or so.
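
If you're curious how much delta compression is helping on a repo like that, git can show you directly (a rough sketch; the pack file name will differ per repo):

    git repack -a -d              # repack everything so git recomputes deltas
    git count-objects -v          # "size-pack" is the packed size in KiB
    git verify-pack -v .git/objects/pack/pack-*.idx | head -20   # per-object packed size and delta depth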


Hmm, I'll have to take a look at git-annex, I've run across it but never investigated. Thanks!


I've had no trouble with non-GitHub git repos handling text files well into the hundreds of megabytes. I haven't pushed the limit on this yet, but my datasets are usually many hundreds to thousands of medium-sized individual files, so it tends to work fairly well. Single large files may not scale as well with git.


I don't know of a great solution for Git, but I've heard that Perforce is better suited to handling large binaries - I believe it's used in many game development studios, where binary assets can run to many gigabytes.


Game developers need both support for large files and exclusive locking, since assets like images, audio, and 3D models are almost never mergeable.
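
For what it's worth, Perforce covers the locking side with the exclusive-open (+l) filetype, usually set through the typemap so unmergeable assets are locked automatically when someone runs p4 edit. A sketch (the depot paths and extensions are just examples):

    # p4 typemap -- excerpt of the TypeMap field
    TypeMap:
        binary+l //depot/....psd
        binary+l //depot/....fbx
        binary+l //depot/....wav

With +l, the first p4 edit on a file takes an exclusive lock, so two people can't end up with conflicting edits to an asset that can't be merged.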



