Is a compilation-test a legitimate/common/typical method to go about this?
Independently of the breaking code, it seems to me that accidental failure, or even accidentally not failing, is in the nature of such an assessment... So this commit seems to raise the question of "why?", even if you missed the dot, doesn't it? If a feature is formally available but effectively broken somehow, wouldn't you want the compiler to complain instead of the feature being dropped silently? Is the reasoning in the code comment sound? Can you test whether syscalls are defined in another way?
> Is a compilation-test a legitimate/common/typical method to go about this?
Yes—in fact, compilation tests are often the only way you can tell if a feature actually works. It's extremely common for C build systems to detect and work around weird systems.
It’s by design. The job of autotools is to find “ground truth” about whatever environment you’re compiling against. It’s meant to discover if you can use a feature by actually seeing whether it works, not just by allow-listing a known set of compiler or library versions. This is because the whole point is to allow porting code to any environment where it’ll work, even on compilers you don’t know about. Think back to a time when there were several dozen Unix vendors, and just as many compilers. You don’t want your build script to report it can’t compile something just because it isn’t aware of your particular Unix vendor… you want it to only fail if the thing it’s trying to do actually doesn’t work. The only way to do this is by just testing if certain code compiles and produces the expected result.
Some of the checks are there to tell whether or not the compiler even supports your code. You may not be able to compile your code at all, and the job of the build system is sometimes to just emit useful errors to help the person building the code to understand that they need a compiler which supports [language feature X].
Again, this is intended to be portable software. It is designed to work on lots of OS’s, with lots of compilers, in a lot of future environments that don’t even exist yet.
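To make that concrete, the probe itself is usually just a tiny throwaway program. Here's a minimal sketch of the kind of test an autoconf-style script generates and tries to build (the probed function, strlcpy, is my illustrative choice; it exists on the BSDs but historically not in glibc):

```c
/* conftest.c: a minimal sketch of an autoconf-style probe.
 * The build system compiles and links this in isolation; success means
 * the environment really provides strlcpy, failure means it doesn't. */
#include <string.h>

int main(void)
{
    char buf[8];
    /* If the toolchain has no strlcpy declaration or symbol, this fails
     * to compile or link, and the probe reports "unavailable". */
    return strlcpy(buf, "test", sizeof buf) == 4 ? 0 : 1;
}
```

If it builds (and, for run tests, exits with status 0), the build system defines something like HAVE_STRLCPY and the rest of the code can rely on the feature.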
If you have a security feature for example, which uses the pledge() syscall on OpenBSD, but you can only use that feature on OpenBSD systems, you have two choices:
- Conditionally compile it based on whether you’ve detected that this is an OpenBSD target at build time, or,
- Conditionally compile it based on whether some sample code that uses pledge() builds successfully.
You can’t defer this decision until runtime, because it would require linking against the pledge() symbol even though it may not exist on this system, which would cause the executable to fail to load at runtime when the dynamic linker can’t resolve it, unless you completely rearchitected to use a plugin model, which is overkill.
So given the above are your main two options, the latter is preferred mainly because it allows new systems to come in and be compatible with old ones (maybe someone adds pledge() support to Linux one day) without having to fudge the uname command or something. This was super important in the early Unix days… perhaps less so now, but it’s still a good way to write portable software that still can take advantage of platform-specific features.
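A minimal sketch of what the second option looks like from the application's side, assuming the build system already ran a compile test on some pledge() sample code and defined HAVE_PLEDGE on success (the macro name is illustrative):

```c
#include <stdio.h>
#include <unistd.h>   /* declares pledge() on OpenBSD */

static void drop_privileges(void)
{
#ifdef HAVE_PLEDGE
    /* Restrict this process to stdio and read-only filesystem access. */
    if (pledge("stdio rpath", NULL) == -1)
        perror("pledge");
#endif
    /* Everywhere else, the call is compiled out entirely, so nothing
     * ever references a pledge symbol that doesn't exist. */
}

int main(void)
{
    drop_privileges();
    puts("running, with reduced privileges where supported");
    return 0;
}
```

The #ifdef tests what the probe found, not which OS this is, so a hypothetical future Linux pledge() would be picked up automatically.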
> Again, this is intended to be portable software.
A scathing criticism of the OpenSSL library by the BSD team was that it was too portable in a (very real) sense that it wasn't even written in "C" any more, or targeting "libc" as the standard library. It would be more accurate to say that it was "Autotools/C" instead. By rewriting OpenSSL to target an actual full-featured libc, they found dozens of other bugs, including a bunch of memory issues other than the famous Heartbleed bug.
Platform standards like the C++ std library, libc, etc... are supposed to be the interface against which we write software. Giving that up and programming against megabytes of macros and Autotools scripts is basically saying that C isn't a standard at all, but Autotools is.
Then just admit it, and say that you're programming in the Autotools standard framework. Be honest about it, because you'll then see the world in a different way. For example, you'll suddenly understand why it's so hard to get away from Autotools. It's not because "stuff is broken", but because it's the programming language framework you and everyone else is using. It's like a C++ guy lamenting that he needs gcc everywhere and can't go back to a pure C compiler.
Job ads should be saying: "Autotools programmer with 5 years experience" instead of "C programmer". It would be more accurate.
PS: I judge languages by the weight of their build overhead in relation to useful code. I've seen C libraries with two functions that had on the order of 50kb macros to enable them to build and interface with other things.
C basically has no standard library. It's no surprise to anyone who has used it more than in passing that you depend on the chosen build system to fill that gap. Because of this, building portable C libraries is very different from doing so in any other commonly used programming language - even C++.
It's ironic to me to say that C has no standard library while at the same time libc is one of the most important libraries that most programs on an installed system can't go without.
So it has one, but it's small. It has a few useful functions, for example system(const char*) which was used by the exploit.
Actually the interaction with libc is not like what you expect from a standard library in other languages. Even apart from being very small, it's not really a single library - you have glibc, musl libc, the BSDs each have their own libc, macOS has its own, Windows has its own. And if you want a very portable C program, you can't assume your program will run with the libc on your system; you need to take into account the differences between these. Also, writing C programs that don't use the standard library at all is not unheard of - even apart from C in-kernel or on bare metal. For example, on Windows, libc is just a wrapper over Win32, and you can use that directly and gain much more functionality if you're not going to be portable anyway.
Additionally, libc is typically more of a system component than a part of your program. You can't choose to distribute a libc you prefer with your program and use that, you have to link to the system libc on many OSs. Even on Linux where it's not strictly required, if you use a different libc than the distribution provided one, you can end up in all sorts of problems when you interact with other programs.
AzulJDK and Android JDK would be better examples, as Oracle JDK is just an Oracle-blessed build of OpenJDK.
The difference from libc, though, is that there is no problem in distributing a program with your preferred JDK, and multiple Java programs can live on the same system while each uses its own JDK, and they can even communicate with each other risk-free.
Also, different JDKs are significantly more similar in the API they offer to Java programs than different libc are - at least for a common core of functionality.
Which validates my point: OpenSSL is not “C”, it’s not even “Perl”, it’s targeting a bespoke framework and build platform that just so happens to be written in Perl.
> If you have a security feature for example, which uses the pledge() syscall on OpenBSD, but you can only use that feature on OpenBSD systems, you have two choices:
Just in case, I want to note that pledge(2) and unveil(2) are also supported by SerenityOS, so checking only for an OpenBSD target is insufficient.
You'll never make it to runtime if you try to include headers that don't exist. You'll never make it to runtime if you try to link in libraries that don't exist.
Runtime syscall number detection is very common in practice, since the kernel returns ENOSYS to enable that exact ability for glibc and other shim libraries.
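A minimal sketch of that kind of probe on Linux, assuming headers new enough to define SYS_getrandom (glibc really did fall back to /dev/urandom when getrandom(2) returned ENOSYS on pre-3.17 kernels):

```c
#include <errno.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    unsigned char buf[16];
    /* Invoke the syscall directly; an unimplemented syscall number
     * makes the kernel return -1 with errno set to ENOSYS. */
    long ret = syscall(SYS_getrandom, buf, sizeof buf, 0);
    if (ret == -1 && errno == ENOSYS)
        puts("getrandom(2): not implemented by this kernel");
    else
        puts("getrandom(2): available");
    return 0;
}
```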
That's syscall _availability_ detection, not number detection. There's no way to ask the kernel "what's the number of the syscall commonly known as read()".
It's not by design, unlike what siblings say. It's by accident (so, "legacy", as you put it).
The problem is that already in the 80s there was tons of variability from one Unix system (or version of it) to the next, but there was no standard way of representing what features/standards/APIs/libraries a system supported or had installed. When faced with such a mess, people wrote code to detect which features are present on the target host.
This then got made into tools with libraries of detection code. Think autoconf/autotools.
Now we also have pkgconfig, but it's too late and it was not enough anyways.
Some things you might only detect at run-time, provided that their ABIs are stable enough that you can copy their headers into your application.
It’s by “design”, in the sense that C and C++ provide no better way to really know for sure that the functions you want to call really exist. In more modern languages we rely on metadata and semver, but none of that exists for C and C++.
Very true, but don’t forget that Autoconf checks for “interesting” compiler choices as well as library and OS features. And then there is libtool, which abstracts out the differences between how compilers generate shared libraries so that you only have to understand one way of doing it and it will work on all of them.
Think of everything that varies between systems:
- instruction set architecture
- OS versions
- ABIs
- libraries (whether they are installed)
- and what functionality they provide
- commands/executables
- anything you can write a macro to check
All stuff too disparate to reliably have the OS be able to answer every question you might have about it and the stuff installed on it. You can't wait for any such system to learn how to answer the questions you might have about it, so some things you can only detect, either at build configuration time, build time, or run time.
Yea, it’s not impossible to do, just not standardized. If you use a library and it provides useful version information, then definitely use it. It’s just that the language or the tooling doesn’t force libraries to have that kind of metadata. Compare that with Rust, where every library you use must come with a standardized manifest that includes a version number, and where they tell library authors up front that they are expected to follow the semver convention.
The fact is that things have gotten easier over the last 10 or 20 years. It used to be the case that any program targeting Unix had to either spend a lot of time and energy tracking the precise differences between dozens of different commercial Unices, or use autoconf. Autoconf was the project that combined all of that lore into a single place, so that most people didn’t have to know every single detail. But these days most of the Unices are dead and buried, and 99% of all new projects just target Linux. Kinda sucks if you prefer one of the BSDs, or OpenSolaris/Illumos/SmartOS, but it does mean that new Linux developers never have to jump through those hoops and simply never learn about autoconf. And while on the one hand that represents a loss of knowledge and competency for the community (making this type of supply-chain attack much easier), on the other hand autoconf is in practice an abomination. It is (or at least was) extremely useful, but it was implemented in M4 and reading the source code will literally damage your brain.
> It used to be the case that any program targeting Unix had to either spend a lot of time and energy tracking the precise differences between dozens of different commercial Unices, or use autoconf. Autoconf was the project that combined all of that lore into a single place, so that most people didn’t have to know every single detail.
As a data point, the place I worked for in the mid-90s had a single codebase with something over 600 permutations of supported OS and compiler when you included the different versions. One thing we take for granted now is how easy it is to get and install updates – back then you might find that, say, a common API was simply broken on one particular operating system version but your customers would have to wait for a patch to be released, sometimes purchased, put onto floppy disks or CD-ROM, and then manually installed in a way which had enough risk involved that people often put it off as long as they could. Some vendors also did individual patches which could be combined by the sysadmin so you had to test for that specific feature rather than just saying “is it greater than 1.2.3?”, and it wasn’t uncommon to find cases where they’d compiled a common library with some weird patches so you had to test whether the specific features you needed functioned.
Part of why Linux annihilated them was cost but much of it was having package managers designed by grownups - I remember as late as the mid-2000s bricking brand new Sun servers by running the new Solaris updater, which left them in an unbootable state.
Back in the day when people compiled source from tarballs on their personal machines, the autoconf script would query your system to see what functionality it supported. It did this by trying to compile a small program for each feature. If the compilation failed, it assumed the feature was unavailable on your system, and a flag was set or unset for the rest of the build.
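The result of all those probes typically lands in a generated header, something like this (a sketch; the macro names are illustrative, not from any real project):

```c
/* config.h: generated by configure after the probes have run. */
#define HAVE_STRLCPY 1     /* probe program compiled and linked */
/* #undef HAVE_PLEDGE */   /* probe failed: feature gets compiled out */
```

Application code then wraps each optional feature in #ifdef HAVE_..., which is exactly why a probe that always fails (say, because a stray character makes the test snippet invalid C) silently disables the feature instead of producing any error.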
Have you ever tried to manually build something from a release tarball by starting with ./configure? If so, have you observed how many times the compiler is invoked in this configure phase before you even run make?
That "check_c_source_compiles" function should first test if the provided code snipped is "valid C code in general" and only then check if it compiles in given system.