"Ask HN: Is there any way to detect websites that are SEO-optimized on Google?"
Unfortunately, those seem to make it onto HN as well. This page is a perfect example. This is how the page starts:
Everything you need to know about monorepos,
and the tools to build them.
Understanding Monorepos
Monorepos are hot right now, especially among Web
developers. We created this resource to help developers
understand what monorepos are, what benefits they can
bring, and the tools available to make monorepo
development delightful.
There are many great monorepo tools, built by great
teams, with different philosophies...
I got so tired at this point that I stopped reading.
In my mind, I see the job description on Fiverr "Fast SEO writer wanted! Please write a 3000 word page about monorepos. Make sure to mention 'monorepos' and related terms like 'web developers', 'tools', 'development' etc frequently."
What a terrible take. As someone with an interest in monorepos, who's currently working with a company adopting Nx, I found the page interesting and compelling. I spent a good 30-60 minutes following links and investigating deeper.
You didn't even read past the fourth paragraph. You should be ashamed for derailing what could have been a productive discussion about an interesting topic with your shallow dismissal.
You must be primed to detect this; I did not find it bothersome at all. I actually like the page. I always wondered if a monorepo would be best for us, and now I have some extra arguments to say yes if the discussion arises.
I think... it has been. OTOH, if you scroll to the end, it's a collaboration by the developers or "community outreach" of those toolchains:
"The tools we'll focus on are: Bazel (by Google), Gradle Build Tool (by Gradle, Inc), Lage (by Microsoft), Lerna, Nx (by Nrwl), Rush (by Microsoft), and Turborepo (by Vercel). We chose these tools because of their usage or recognition in the Web development community."
You can scroll directly to the bottom and see the final comparison table.
Here's a screenshot for your convenience [1]
I agree with you that I prefer to get straight to the point, but this pet-peeve tangent doesn't seem to be a productive discussion of the actual merits of the tooling.
A monorepo is a way to morph a dependency management problem into a source control problem within your organization. Currently, FOSS tools solve neither of them.
Agree - the site focuses a lot on build, but ignores SCM tooling. At a certain size, git no longer works well as a pure monorepo without submodules, and these mega-companies have teams of people optimizing code-time vs. build-time checkouts of these monorepos to handle subsections.
FWIW I like that there isn't much gatekeeping in the industry and think it is a genuinely good thing, not only because my "hobby" (application security) relies on it :)
Also, with an unsolved problem you get paid to make whatever crazy attempt you want, which is absolutely a perk.
What does one thing have to do with the other? Monorepos are famously used at giant tech companies. Clearly they are being introduced by tech management in those cases, not a couple of people in their garage that don’t know what they are doing.
Could you elaborate? I use a monorepo at work and, if anything, dealing with 3rd party dependencies is easier because you don't have to coordinate upgrading versions across teams. For 1st party stuff in the repo we don't need to version libraries at all; if it all builds and passes all the tests, everything is good. The whole point is to use the whole tree from a consistent snapshot as a release, so you never worry about using a new first party library with an old first party binary.
If you're able to release everything as a consistent snapshot, it probably is not a monorepo. Instead it is just a normal repo containing a single big project.
Here's an example: let's say you've got a single product with a backend server, database, web frontend, and an iOS app. How would you release all those projects as an atomic unit?
If there's a new field in this release, the database schema needs to be changed on the servers before the backend is released. The backend needs to change before the frontends do. You have no control over the deployment speed of some of those components, so releasing them all at the same time is impossible.
Similar issues would happen if you update a 3rd party library and software using the new vs. old versions of the library is incompatible.
So the value of this monorepo isn't that you could cut a release for all of the components at once. It is that everyone doing development has a shared view of the current state of the system.
Yeah, that's what I meant by "morphing a dependency management problem into a source control problem". With a monorepo, dependency management is way easier! But, sometimes:
- git on a large repo is 100% pain, 0% fun. hg is slightly better, but not much.
- No versions means no prebuilt libraries, which translates to "you need a great build cache to keep build times reasonable".
- "Passes all the tests, everything is good": if only we could run all the tests on such changes.
- People hate coordinating on imported/pinned third-party dependency versions; sometimes you need tools for large-scale automated changes to make progress, but :(
- Similarly, not all places make all their code accessible to all engineers.
i.e. the source control problem is, sometimes, the harder one.
When I talk about monorepos in our company, I always try to make the distinction between JS monorepo tooling (say nx, turbo, or more low-level pnpm/npm/yarn workspaces) and real (?) monorepo tooling (say Bazel). Whereas the latter focuses more on dealing with a wide variety of source code types and artifacts, the former deals exclusively with NPM packages (which may sometimes include other stuff like Go/Rust). Does this distinction even make sense?
I don't know what it is, but it feels like the JS tooling community so often pretends that the rest of the world does not exist in their marketing. I find myself having to dig into docs or the GitHub repo before I figure out what language or ecosystem I'm even reading about.
somehow, the authors of this website neglect to even mention Nix. maybe that has something to do with the fact that this is a marketing page for the tool they named Nx (seriously?).
Yes, I think the JS ecosystem (which I am certainly part of) does sometimes ignore established terminology and solutions from other ecosystems. Although I must say that the JS ecosystem really has amazing tooling in certain areas (say prettier and eslint, which I'm missing in the Java world, for instance).
I was actually about to mention Nix in my post as well. Being a casual NixOS user myself I wonder if there is any kind of monorepo tooling based on Nix? Without ever having used Bazel myself I always thought of it as Nix-like.
Yes, there is! We (https://tvl.fyi) have been building Nix monorepo tooling for a while. You can see the current state of our repo at cs.tvl.fyi (+ reviews at cl.tvl.fyi and dynamic CI on tvl.fyi/builds).
We use josh[0] to let people clone "just in time" repos with the tooling needed for our setup[1]. We've also started a consultancy (tvl.su) that helps companies move onto this setup, and have customers going for it already.
The reasons we've not been making a lot of noise about this are that we have other large projects (like Tvix[2]) taking up time, and also that the integration with customers moving to this setup lets us more confidently figure out what parts we need to smooth out for "non-TVL" use-cases.
As for using Nix in a Bazel-like way: the common approach with Nix is to wrap language-specific build systems. This makes it possible for projects written in any language to be wrapped in Nix and integrated into a Nix-based monorepo (something that makes it distinctly more powerful than other solutions).
However, there's nothing in principle preventing Nix from dropping down a layer to the project level itself, and we've implemented (and use) this for Go[3] and Common Lisp[4].
The title should reflect what the website itself uses: "Monorepo explained". It certainly doesn't cover "everything you need to know about monorepos", and it glosses over the disadvantages and the things you need to watch out for.
The most important one being that you need to have an org/team structure that is set up to support it. You cannot say that it will make the org more efficient as organizations are not all the same. In order to push monorepos, the decision makers ought to know what those caveats and tradeoffs are, or they're going to be in for a sad time.
The site does do a good job of going over the tooling around it. This might be a matter of perception, but it seems the tooling is getting better, though it's not yet very mature. I see a few instances of "write your own" where the tooling is lacking, which is not a great way to go about things and, once again, makes assumptions about the nature of the orgs.
Something very important not covered by the article:
Is the tool going to help me detect when I accidentally bypass the declared dependencies?
For example, in a basic monorepo it's very easy to accidentally rely on the file layout on disk: require'ing a dependency that isn't in your package.json accidentally succeeds because it was hoisted as a dependency of a different package, and cp'ing files from `../some-other-project` should not be allowed but is possible. All of these invalidate some optimizations that monorepo tools want to make.
At scale with many contributors, it's HARD to teach and remember and apply all these rules, and so the monorepo tool really should help you detect and fix them (basically: fail the build if you mess up).
The article doesn't really make it clear which tools will do that for you. Pretty sure that Bazel does, Nx probably does, and lerna and turborepo don't.
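For the package.json half of this, here's a rough way to approximate the check in a JS workspace (just a sketch of mine; the packages/*/ layout and the choice of depcheck are assumptions, not something the article or these tools prescribe):
# depcheck's "Missing dependencies" output lists packages that are
# imported but not declared in that package's own package.json, i.e.
# ones that only resolve because hoisting put them in a parent
# node_modules. The packages/*/ layout is illustrative.
$ for pkg in packages/*/; do echo "== $pkg"; (cd "$pkg" && npx depcheck); done
Running that in CI and treating any "Missing dependencies" output as an error at least gets you the "fail the build if you mess up" part.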
In our monorepo at work, we have a few hundred devs working in there daily just fine. Linters check that no relative paths are allowed (so you can't rely on directory structure) and no absolute paths either. If you want to load a file in the tree, you must use a “blessed” constant or function to get the base path of your current code or some other code.
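If you want to approximate the import side of that rule with off-the-shelf tooling, here's a sketch (the specific ESLint rule and path below are illustrative, not our actual setup): forbid "../" imports so modules can't reach across package boundaries via the directory layout.
# Ban relative parent imports with ESLint's built-in no-restricted-imports;
# rule choice and the packages/ path are placeholders.
$ npx eslint --rule '{"no-restricted-imports": ["error", {"patterns": ["../*"]}]}' packages/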
TBF, if you have centralized dependencies or your dependency on another module affects your dependencies, you are probably doing it wrong. APIs between parts should be well defined and not require the entire dependency runtime to be loaded to interact with it.
I'm building a pretty big service that has four user-facing websites and even more backends (HTTP servers, highly bespoke job queues to run ML workloads, etc.)
This was an absolute nightmare to try managing in separate repos. I've finally settled on two monorepos: a Yarn/TypeScript/React frontend monorepo, and a Rust/Docker backend monorepo.
Does anyone have any advice on these? I sort of stumbled into this pattern on my own and haven't optimized any of it yet.
For Rust, I'm curious if folks have used Bazel for true monorepo build optimization. I don't want to rebuild the world on every push to master.
Likewise for the frontend, is there any way to not trigger Netlify builds for all projects if only one project (or its dependencies) change?
If the (web) API surface between your BE and FE is based on a schema (a.k.a. typed API, like with OpenAPIv3 or GraphQL) then I'd put them in a mono repo. This way you can recompile the FE automatically if the schema changed (usually an FE client lib is generated from the API schema). This helps discovering errors at compile time.
If your API is not schema-based, you have no way of knowing something broke without FE/UI testing.
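As a concrete sketch of the regeneration step (the generator, spec path, and output package below are placeholders of mine, not something the schema approach mandates):
# Regenerate the FE client from the backend's OpenAPI spec, then let the
# FE typecheck catch breaking API changes at compile time. Paths and the
# typescript-fetch generator are illustrative choices.
$ npx @openapitools/openapi-generator-cli generate -i services/backend/openapi.yaml -g typescript-fetch -o packages/api-client
In a monorepo the regenerated client lives right next to the FE code, so a breaking schema change fails the FE build in the same change that introduced it.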
Bazel should be smart enough to build only what changed. Is it possible that your CI doesn't cache previous runs? With Bazel I successfully used Google Cloud Build to achieve that, by storing the bazel-* folders to Google Cloud Storage as the last step of every build and downloading them as the first step.
The target bucket I use has a very short object lifecycle setting so I don't even have to clean up old artifacts manually.
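Roughly, the two steps can look like this. This is a sketch using Bazel's --disk_cache pointed at a plain directory (bucket name and paths are placeholders), rather than the exact bazel-* folder copy described above:
# Restore the cache saved by the previous build (first step).
$ gsutil -m rsync -r gs://my-bazel-cache/disk-cache /tmp/bazel-disk-cache || true
# Build with a local disk cache; unchanged targets become cache hits.
$ bazel build --disk_cache=/tmp/bazel-disk-cache //...
# Persist the cache for the next run (last step).
$ gsutil -m rsync -r /tmp/bazel-disk-cache gs://my-bazel-cache/disk-cache
Bazel also has a --remote_cache flag that points at an HTTP/gRPC cache service directly, which avoids the copy steps entirely.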
I'm using GitHub to run builds. I'll have to investigate your setup, because that sounds perfect. I don't know if GitHub can do that.
What do you do if you need an artifact that gets garbage collected? Manually force a rebuild of that SHA? Have things on continuous deploy and update regularly? I may need better CI/CD practices.
To be honest I never optimized my setup down to the single-artifact level. The way I set this up was that in a Google Cloud Storage bucket I have a subfolder for each build whose name is monotonically increasing (i.e. by including the time in the folder name, like 20220225_23_49_50/bazel-* folders). That way I can copy the latest build to the Cloud Build VM and still retain history. The object lifecycle settings I use keep artifacts around for 1 month, and I've never had the need to find something outside that time window.
There could be smarter ways to do this tbh, like having time&date_<Sha of commit>, but I haven't had the need for any of that yet.
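For completeness, here's a sketch of the retention piece (bucket name, cache path, and the 30-day policy are placeholders): a timestamped prefix plus a bucket lifecycle rule is what makes the cleanup automatic.
# Upload this build's cache under a timestamped prefix (same naming idea
# as above).
$ STAMP=$(date +%Y%m%d_%H_%M_%S)
$ gsutil -m cp -r /tmp/bazel-disk-cache "gs://my-bazel-cache/${STAMP}/"
# One-time setup: delete objects older than 30 days so old builds expire.
$ echo '{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 30}}]}' > lifecycle.json
$ gsutil lifecycle set lifecycle.json gs://my-bazel-cache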
Here's a way you can do this with git. This trick relies on `git merge --allow-unrelated-histories`.
Assuming you have repos `foo` and `bar` and want to move them to the new repo `mono`.
$ ls
foo
bar
# Prepare for import: we want to move all files into a new subdir `foo` so
# we don't get conflicts later. This uses Zsh's extended globs. See
# https://stackoverflow.com/questions/670460/move-all-files-except-one for
# bash syntax.
$ cd foo
$ setopt extended_glob
$ mkdir foo
$ mv ^foo foo
$ git add .
$ git commit -m "Prepare foo for import"
# Follow those "move to subdir" steps for `bar` as well.
# Now make the final monorepo
$ cd ..
$ mkdir mono
$ cd mono
$ git init
$ touch README.md
$ git add README.md
$ git commit -m "Initial commit in mono"
$ git remote add foo ../foo
$ git fetch foo
$ git remote add bar ../bar
$ git fetch bar
# Substitute `main` for `master` or whatever branch you want to import.
$ git merge --allow-unrelated-histories -m "Import foo" foo/main
$ git merge --allow-unrelated-histories -m "Import bar" bar/main
# Inspect the final history:
$ git log --oneline --graph
* 8aa67e5 (HEAD -> main) Import bar
|\
| * eec0abd (bar/main) Prepare bar for import
| * 9741d6d More stuff in bar
| * 634ba3d Initial commit bar
* 43be6e9 Import foo
|\
| * d4805a0 (foo/main) Prepare foo for import
| * 4d2ca10 More stuff in foo
| * 72072a1 Initial commit foo
* bfcb339 Initial commit in mono
Do you think this will speed things up? I tried the above suggestion and it's already been running for four hours to merge two repos into one (3 years' worth of git history).
There are several ways to do this. Having extensively experimented with all of them I can say that the best are josh[0] (if you need external history continuity) and git subtree[1] (if you just need the commits to remain valid within your repository).
Thank you, josh looks interesting; I will need to look into this. At first read it looks like the end result is not a brand-new Git repo that combines/merges a bunch of repos. I am not sure if a proxy is going to work well with GitLab CI.
If you can define exactly what you mean by "keeping history" (i.e. which operations do you want to support, and in what context?) I might be able to tell you how to do it :)
I'm curious about that as well. Maybe it'd be possible to start a repo with a single empty commit, rebase everything on that in a separate branch for each of the git repos, and then merge them all into the master branch? Although some file renaming may be in order, otherwise everything ends up in the same folder.
I will have to look into this. I always understood that this won't generate a new repo but somehow combines the other repos. The idea is to merge the existing repos into a monorepo and then archive the old repos. I don't think that's possible when using subtrees.
Subtree merges a whole repo into the subdirectory of another repo. You can git blame yourself back to the original repo. Unlike submodules, there's nothing in the file tree which signifies there is something special about this directory (it searches commit messages to get that metadata). From the monorepo POV, archiving is just never doing another pull. Using submodules is a nightmare.
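For reference, a minimal sketch of the subtree route using the foo/bar example from upthread (branch names assumed to be main):
# Start the monorepo with one commit so subtree has something to graft onto.
$ mkdir mono && cd mono
$ git init
$ git commit --allow-empty -m "Initial commit in mono"
# Pull each repo into its own subdirectory; its commits stay part of
# mono's history. No "move everything into a subdir" prep commit is
# needed, since --prefix handles that. Add --squash if you'd rather
# import a single commit.
$ git subtree add --prefix=foo ../foo main
$ git subtree add --prefix=bar ../bar main
From the monorepo's point of view, archiving the old repos is then just never running `git subtree pull` against them again.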
Great information.
I've been building a monorepo of my own for a system that consists of RESTful and event-driven microservices. They are defined via OpenAPI and AsyncAPI respectively.
Does anyone know a tool to generate documentation for each type of service and put it together into one cohesive set of docs?
This website is amazing. It's doing a much-needed job on the internet. I feel there should also be a trunkbaseddevelopment.tools and a pairprogramming.tools :)
"Ask HN: Is there any way to detect websites that are SEO-optimized on Google?"
Unfortunately, those seem to even get into HN. This page is a perfect example. This is how the page starts:
I got so tired at this point that I stopped reading.In my mind, I see the job description on Fiverr "Fast SEO writer wanted! Please write a 3000 word page about monorepos. Make sure to mention 'monorepos' and related terms like 'web developers', 'tools', 'development' etc frequently."