What stands out to me is this particular justification:
> 2024-02-23: Jia Tan merges hidden backdoor binary code well hidden inside some binary test input files. The associated README claims “This directory contains bunch of files to test handling of .xz, .lzma (LZMA_Alone), and .lz (lzip) files in decoder implementations. Many of the files have been created by hand with a hex editor, thus there is no better "source code" than the files themselves.”
This is, perhaps, the real thing we should think about fixing here because the justification is on the surface reasonable and the need is quite reasonable - corrupted test files to test corruption handling.
But there has got to be some way to express this which doesn't depend on, in essence, "trust me bro", since binary files don't appear in diffs (which is to say: I can think of a number of means of doing it, but there are definitely no conventions in the community that I'm aware of).
Well, test files shouldn't be affecting the actual production binary.
But in practice that's not something that can be enforced for arbitrary projects without those projects having set something up specifically.
For example, the project could track the effect on binary size of the production binary after every PR. But then it still requires a human (or I guess an AI bot?) to notice that the increase would be unexpected.
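To make the size-tracking idea concrete, here is a minimal sketch of what such a CI check could look like. The function names, the paths and the 1% budget are all made up for illustration, and as noted above it only flags the change; someone still has to decide whether the growth is expected.

```python
import os


def growth_fraction(baseline_size: int, candidate_size: int) -> float:
    """Relative size change of the candidate build versus the baseline."""
    if baseline_size <= 0:
        raise ValueError("baseline size must be positive")
    return (candidate_size - baseline_size) / baseline_size


def exceeds_budget(baseline_size: int, candidate_size: int,
                   budget: float = 0.01) -> bool:
    """True if the binary grew by more than `budget` (hypothetical 1% default)."""
    return growth_fraction(baseline_size, candidate_size) > budget


def check_files(baseline_path: str, candidate_path: str,
                budget: float = 0.01) -> bool:
    """Same check, reading the sizes of two built artifacts from disk."""
    return exceeds_budget(os.path.getsize(baseline_path),
                          os.path.getsize(candidate_path),
                          budget)
```

A PR job would call `check_files()` on the artifact built from the main branch and the one built from the PR, and request a manual review when it returns `True`.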
Debian often removes these kinds of binaries by patching the upstream tarball. When they are not used, that should be quite easy anyway.
That's why the attacker put the statement in the first place. It increases the chance that distributions will accept these.
Also that when dynamically linking A against B, A apparently gets free rein to overwrite B.
It sort of makes sense, since at the end of the day it could just be statically linked or implement B's behaviour itself and do whatever it wants, but it's not really what you expect, is it?
Yeah, that part struck me as something we should be able to block - the number of times where you actually want that must be small enough to make it practical to do something like write-protect pages with a small exception list.
He put this comment because he knows that FOSS enthusiasts and especially Debian always prefer source over binary. This is not only true for program code, but also includes docs, images etc.
The correct way to do that would be a source that generates a test file and then a script which reproducibly produces the desired corruption.
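As a rough sketch of what that could look like, assuming Python's standard `lzma` module as the generator: the payload, the corruption list and the helper names below are invented for illustration and are not how xz's test suite actually works. The point is that every damaged byte is listed in reviewable text, so a change to the corruption shows up in a normal diff.

```python
import lzma

# Plain-text input that the valid test file is generated from.
PAYLOAD = b"hello, xz test corpus\n" * 4

# Each entry: (byte offset, XOR mask, human-readable reason).
CORRUPTIONS = [
    (0, 0xFF, "break the first byte of the .xz stream header magic"),
]


def build_valid() -> bytes:
    """Generate a well-formed .xz stream from the source payload."""
    return lzma.compress(PAYLOAD, format=lzma.FORMAT_XZ)


def apply_corruptions(data: bytes, corruptions) -> bytes:
    """Apply each listed single-byte XOR patch to a copy of the data."""
    buf = bytearray(data)
    for offset, mask, _reason in corruptions:
        buf[offset] ^= mask
    return bytes(buf)


def is_rejected(data: bytes) -> bool:
    """True if the decoder refuses the input as an .xz stream."""
    try:
        lzma.decompress(data, format=lzma.FORMAT_XZ)
    except lzma.LZMAError:
        return True
    return False


valid = build_valid()
bad = apply_corruptions(valid, CORRUPTIONS)
```

One caveat: the compressed bytes may differ between liblzma versions, so the byte-for-byte output is only reproducible with a pinned encoder. Patching at a fixed header offset, as here, does not depend on how the payload was compressed, but corruptions deeper in the stream would.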