The upsides of storing node_modules in the repo are outweighed by the downsides. Unless, of course, you're Google-scale and can afford to contribute file-size fixes upstream, write fancy tooling to enforce commit-time workarounds, etc. Nobody working feature to feature has time for this.
For your average npm shop, which doesn't have infinite internet oil money, here is why the article's recommendations won't work for you.
Your CI will pay the time penalty during git clone instead of npm ci. In fact, the node_modules folder will be bigger than your source folder almost immediately. And over time you won't be cloning just the head files; you'll also be cloning every npm package binary ever committed. You can't undo this without investing in smarter git tooling, which is time spent not writing features.
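To be concrete about what that "smarter git tooling" means, this is roughly the shallow, blobless clone your CI would have to standardise on (the repo URL is a placeholder, and it assumes your git host supports partial clone):

```
# Shallow, single-branch, blobless clone: fetches only the current tree,
# not the full history of every node_modules binary ever committed.
git clone --depth 1 --single-branch --filter=blob:none \
  https://git.example.com/your-org/your-repo.git
```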
npm packages which install arch-specific binaries will constantly flip-flop between commits from devs on different OSes.
Nobody is safe from left-pad, not even Google, and committing your node_modules folder doesn't change that. Eventually someone is going to have to run npm i.
Running npm ci on everyone's machine is reproducible; I don't know what OP is warning about. The package lock pins all the versions.
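For anyone unfamiliar, the reproducibility comes from what `npm ci` refuses to do:

```
# npm ci installs exactly what package-lock.json resolved (pinned versions
# plus integrity hashes), deletes any existing node_modules first, and errors
# out if the lockfile and package.json disagree -- same inputs, same tree.
npm ci
```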
If you have a large enough team to invest in dev experience, there are far better ways to get the advantages of the article without the downsides. You can cache the npm ci result in a container layer for your CI/CD, or use middleware like Artifactory.
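A minimal sketch of that second option, pointing npm at a read-through caching proxy; the registry URL here is a made-up internal endpoint, not a real one:

```
# Hypothetical internal npm proxy (Artifactory, Nexus, Verdaccio, etc.) that
# mirrors registry.npmjs.org, so CI installs only leave your network on a
# cache miss.
npm config set registry https://artifactory.example.internal/api/npm/npm-remote/
npm ci
```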
Maybe, but everyone's CI situation is so variable that it may not be that easy. For instance, if you are using a monorepo then even a shallow clone can be overkill. And if you rely on git history for conditional CI, then a shallow clone will ruin the output of many git commands. So you could end up in an either/or optimization situation depending on the order in which your CI/CD organically grew and the other architectural decisions you made.
Counterpoint, and ignoring download size since your typical CI probably doesn't download the entire history: when are we supposed to pretend we've reviewed our dependencies?
I'll admit I don't believe everyone always needs to check every dep, but we're skating close to nobody ever checking them.
My team puts the guardrails up front, when introducing a new dependency. You fill out a little template with a security assessment as well as some other stuff, just to do a dirt-simple build-vs-buy analysis. left-pad, for example, would fail because the savings from not writing it yourself aren't worth the ongoing maintenance cost. (In fact, doing this assessment at all rarely makes sense for microlibs, by design.)
Once something's in package.json, I don't believe anyone who says they can vouch for its security over time. We're all doing security theater with npm audit, Dependabot, etc. Don't use npm at all if anyone's life depends on your code.
I think formalising an assessment like that makes some sense, but the question was more about what the assurances are. It probably works like this:
#1 You look at the dependency and do an assessment on whether it's worth including. Check.
#2 You probably require some automated checks: SAST, dependency scanning / SCA, maybe some DAST, etc. Check.
The outstanding questions, though...
#1 Did anyone actually read the code of the dependency?
#2 Did anyone actually look at what the dependency itself pulls in?
#3 Are these checks re-done when you update the lock files?
#4 If nobody is doing it, who's updating the lists and rules we scan against?
#5 Where possible, do you have the monitoring to check when an app is doing something weird? e.g. network ACLs that, when they fail, cause an event that alerts a person to investigate?
I think we're mostly agreeing here, but the wider question is: why don't the folks writing the app and including the dependency feel responsible for these things?
I think you mean they're automatically doing SCA and maybe SAST. I don't think there's a human working at Microsoft reading the code for you, though, is there?
> Your CI will pay the time penalty during git clone instead of npm ci.
Things like GitLab's CI runners will do a single clone and then use `fetch`, `checkout`, and `clean` to check out your repo. Git repo size isn't a huge bottleneck in CI performance.
Only if you have long-lived runners: if you use a dynamic fleet to save money, then you clone the repo almost every time. However, this is why you can do sparse checkouts and limit the git depth.
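Roughly what that can look like if you script the checkout yourself in a monorepo (depth, paths, and branch name are illustrative; GitLab also exposes knobs like GIT_DEPTH for its built-in checkout):

```
# Blobless, shallow clone with no checkout, then materialise only the
# directories this job actually needs.
git clone --depth 50 --filter=blob:none --no-checkout \
  https://git.example.com/your-org/monorepo.git repo
cd repo
git sparse-checkout set apps/web packages/shared   # illustrative paths
git checkout main                                  # or whatever ref the job builds
```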