Furthermore, the attacker covered their tracks on the initial payload with an innocuous paragraph in the README. ("Nothing to see here!")
bad-3-corrupt_lzma2.xz has three Streams in it. The first and third
streams are valid xz Streams. The middle Stream has a correct Stream
Header, Block Header, Index and Stream Footer. Only the LZMA2 data
is corrupt. This file should decompress if --single-stream is used.
The strings of `####Hello####` and `####World####` are there so that if you actually follow the instructions in the README, you get a seemingly valid result.
They're shell comments so it won't interfere with payload execution.
And lastly, they act as a marker that can be used by a later regex to locate the file _without_ referencing it by name directly nor using the actual Hello and World strings.
For security critical projects, it seems like it would make sense to try to set up the build infrastructure to error (or at least warn!) when binary files are being included in the build. This should be done transitively, so when linux distros attempted to update to this new version of liblzma, the build would fail (or warn) about this new binary dependency.
I don't know how common this practice is in the linux distro builds. Obviously if it's common, it would take a lot of work to clean up to make this possible, even if it's even possible in the first place. It seems like something that would be doable with bazel, but I'm not sure about other build systems.