As a naive bystander, the thing that stands out most to me:
> “Many of the files have been created by hand with a hex editor, thus there is no better "source code" than the files themselves.” This is a fact of life for parsing libraries like liblzma. The attacker looked like they were just adding a few new test files.
Yes, these files are scary, but I can see the reason. But at least can we keep them away from the build?
> Usually, the configure script and its support libraries are only added to the tarball distributions, not the source repository. The xz distribution works this way too.
Obligatory autotools WTF aside, why on earth should the tarballs contain the test files at all? I mean, a malicious test could infect a developer machine, but if the tarballs are for building final artifacts for everyone else, then shouldn’t the policy be to include only what’s necessary? Especially if the test files are unauditable blobs.
It's pretty common to run tests on CI after building to verify your particular setup doesn't break stuff.
The last time we did that, we preferred git upstream, though, and generated the autocrap as needed. I never liked the idea of release tarballs containing stuff that isn't in git.
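For what it's worth, checking for "stuff not in git" is easy to sketch. The snippet below is only an illustration under stated assumptions: `tarball_only_paths` and `make_demo_tarball` are hypothetical helper names, the tarball is assumed to wrap everything in one top-level directory, and the set of tracked paths is passed in by hand (in practice it could come from something like `git ls-files`).

```python
import io
import tarfile


def tarball_only_paths(tar_bytes: bytes, tracked: set[str],
                       strip_components: int = 1) -> list[str]:
    """Return regular files that are in the tarball but not in `tracked`."""
    extra = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Assumption: release tarballs prefix every path with one
            # top-level directory (e.g. "proj-1.0/"); drop it before comparing.
            path = "/".join(member.name.split("/")[strip_components:])
            if path and path not in tracked:
                extra.append(path)
    return sorted(extra)


def make_demo_tarball(files: dict[str, bytes]) -> bytes:
    """Build a small in-memory .tar.gz so the example needs no real release."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    demo = make_demo_tarball({
        "proj-1.0/src/main.c": b"int main(void) { return 0; }\n",
        "proj-1.0/configure": b"#!/bin/sh\n",
        "proj-1.0/m4/extra.m4": b"dnl generated\n",
    })
    # Only src/main.c is "tracked" in this made-up repository.
    for path in tarball_only_paths(demo, {"src/main.c"}):
        print("only in tarball:", path)
```

This only compares file names, so it would flag a generated `configure` but says nothing about a tracked file whose contents differ between the repository and the tarball; catching that would need a content hash comparison as well.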
This strengthens the argument I’m making, no? You bring in the source repo when doing development and debugging. In either case - tarball or not - it doesn’t seem that difficult to nuke the test dir before building a release for distribution. Again, only really necessary if you have opaque blobs where fishy things can hide.
Distributions often run the same tests after the package is built, to make sure it works correctly as built in the distribution environment. This can and does find real problems.
Because, despite containing some amount of generated autoconf code, they are still source tarballs. You want to be able to run the tests after compiling the code on the destination machine.
Besides verifying that the compiled program works on the target, tests are also commonly needed when compiling with PGO, because you need a runtime example to optimize for.