I am not saying or thinking that there is a problem with GCC. I do think that GCC and Clang would be more useful with an option to make them not assume that every pointer is aligned if the target architecture does not impose this, but that's not the same thing as saying there is something wrong with GCC.
The message of the post, rather than “something is wrong with GCC”, is, “Beware. You might think that this is okay to do in your C programs, but it is not and here is why.”
Also, before I post something like this, I need confirmation that the behavior is intended and not accidental. It has happened to me before: I was about to document that GCC had an aggressive behavior with respect to a kind of optimization (while remaining arguably in line with the intent of the standard, even if the wording of the standard was in this case ambiguous enough to be interpreted any which way), and my co-author and I had to file a “missed optimization” ticket on GCC's bugzilla in order to have them confirm that GCC was doing the thing in question on purpose. GCC's developers, seeing the bug report, changed the behavior to remove the optimization entirely instead: https://gcc.gnu.org/ml/gcc/2016-11/msg00111.html
Coming back to the example at hand, if I had phrased the ticket as “GCC shouldn't optimize this”, it would have been closed instantly as “well it's UB”. By phrasing it the way I did, I hoped for a more interesting search for a trade-off that would satisfy everyone, from people who just want legacy C code to keep working with new compilers to people who want programs to run as fast as possible.
(And yes, you have to ask in the bugzilla if you need some sort of official answer for this kind of thing. If you ask on a mailing list, you'll get a “no that was UB from the start” answer from someone you have never heard of who is in fact a power user who subscribed to the mailing list, and whose opinion, while useful, should not be assumed to be that of the compiler developers.)
Do you know if there is a compiler switch that can insert run-time checks any time it makes assumptions which could be invalid (such as “this random pointer is aligned”) and abort with an error message (or something) when it is not true? I think this would be invaluable for tracking down odd bugs caused by things like this.
I have seen it said in another thread that UBSan detects this.
If you aren't already using all the sanitizers that come with your {Clang, GCC} compiler, you should! They are great!
UBSan is meant to detect what can be checked locally, without metadata. Finding this would be its job, since it only takes a simple mask to apply and test at each pointer access.
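To make the "simple mask" concrete, here is a hand-written sketch of that kind of check in C. This is not UBSan's actual instrumentation, only the same idea spelled out; the names is_aligned and load_u32 are made up for illustration. The second function shows the usual way to read from an address without promising the compiler anything about its alignment.

```c
#include <stdint.h>
#include <string.h>

/* Returns nonzero if p is a multiple of align.
   Assumption: align is a power of two, so (align - 1) is a bit mask. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Reads a uint32_t from any address. Going through memcpy means no
   uint32_t pointer is ever dereferenced, so no alignment is assumed. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both GCC and Clang accept -fsanitize=alignment (it is also part of -fsanitize=undefined) if you would rather have the compiler insert the check for you at each access.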
UBSan cannot detect if memory is initialized or if a pointer is valid, because these questions cannot be answered locally, looking only at the instruction doing the access. You need metadata for this. The sanitizers that maintain the metadata to answer these questions are respectively MSan and ASan. Their heavy instrumentations are incompatible, so you can only use one at a time.