
Makes me wonder how many repositories exist in general, counting all the local Forgejo and GitLab servers. Heck, include Subversion and Mercurial and git's other friends (and foes!).

Did anyone make a search engine for these yet, so we'd be able to get an estimate by searching for the word "a" or so?

(This always seemed like the big upside of centralised GitHub to me: people can actually find your code. I've been thinking of building a search engine since MS bought GH, but I didn't think I could handle the marketing side, so it seemed like a waste of effort and I never did it. Recently I was considering whether it would be worth revisiting, given the various projects I'm putting on Codeberg, but maybe someone beat me to the punch.)




Well, based on the API enumeration mentioned in sibling comments, surely one doesn't have to estimate:

https://docs.gitlab.com/api/projects/#list-all-projects (for dumb reasons it seems GL calls them Projects, not Repositories)

https://codeberg.org/api/swagger#/repository/repoGetByID (that was linked to by the Forgejo.org site, so presumably it's the same for it and Codeberg) and its friend https://gitea.com/api/swagger#/repository/repoGetByID

Heptapod is a "friendly fork" of GitLab CE so its API works the same: https://heptapod.net/pages/faq#api-hgrc

and then I'd guess one would need to index the per-project GitLab instances: GNOME, GNU (if they ever open theirs back up), whatever's going on with Savannah, probably SourceForge, maybe sourcehut (assuming he doesn't have some political reason to block you), etc.
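For what it's worth, here is a minimal sketch of what that enumeration could look like against a single GitLab-style instance (which, per the above, should also cover Heptapod). It assumes the instance exposes the `/api/v4/projects` listing with keyset pagination and hands back the next page in a `Link` header, as the GitLab docs describe. The function names, the `max_pages` cap and the injectable `fetch` are made up for illustration, and an unauthenticated call would only ever see public projects. Forgejo and Gitea use different paths, so they would need their own variant.

```python
import json
import re
import urllib.request
from typing import Callable, List, Optional, Tuple

# A fetcher returns (decoded JSON list, raw Link header or None) for a URL.
# It is injectable so the paging loop can be exercised without a network.
Fetch = Callable[[str], Tuple[List[dict], Optional[str]]]


def http_fetch(url: str) -> Tuple[List[dict], Optional[str]]:
    """Fetch one page of results over HTTP (no auth, so public projects only)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp), resp.headers.get("Link")


def next_link(link_header: Optional[str]) -> Optional[str]:
    """Return the rel="next" URL from a Link header, or None if there is none."""
    if not link_header:
        return None
    for url, rel in re.findall(r'<([^>]+)>\s*;\s*rel="([^"]+)"', link_header):
        if rel == "next":
            return url
    return None


def count_projects(base_url: str, fetch: Fetch = http_fetch,
                   max_pages: Optional[int] = None) -> int:
    """Count projects on one instance by walking the paginated listing."""
    # Assumption: the instance accepts GitLab's keyset pagination parameters.
    url: Optional[str] = (
        f"{base_url}/api/v4/projects"
        "?pagination=keyset&order_by=id&sort=asc&per_page=100"
    )
    total = 0
    pages = 0
    while url and (max_pages is None or pages < max_pages):
        projects, link = fetch(url)
        total += len(projects)
        pages += 1
        url = next_link(link)  # None once the last page is reached
    return total
```

Something like `count_projects("https://gitlab.gnome.org", max_pages=5)` would be the polite way to try it; a full crawl of a large instance means a lot of requests and will likely run into rate limits.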

If I won the lottery, I'd probably bankroll a Sourcegraph instance (from back when they were Apache-licensed) across everything I could get my hands on, and donate snapshots of it to the Internet Archive.


At Software Heritage, we listed 380M public repositories, 280M of which are on GitHub: https://archive.softwareheritage.org/

Repository search is pretty limited so far: only full-text search on URLs or in a small list of metadata files like package.json.



