Undefined behavior in C is a reading error (yodaiken.com)
203 points by zdw on May 20, 2021 | 491 comments


Because there seems to be some confusion in this thread:

- "Implementation-defined behavior" means that the C standard specifies the allowable behaviors that a C implementation must choose from, and the implementation must document its particular choice.

- "Unspecified behavior" means that the C standard places no particular restrictions on the behavior, but a C implementation must pick a behavior and document its choice.

- "Undefined behavior" means that C implementations are allowed to assume that the respective runtime condition does not ever occur, and for example can generate optimized code based on that assumption. In particular, it is free to not decide any behavior for the condition (let alone document it). As a consequence, if the runtime condition actually does occur, this can affect the behavior of any part of the program, even the behavior of code executed before the condition would occur. This is because from a false assumption the truth of any statement can be logically derived (principle of explosion [0]). And that is why the C standard does not restrict the behavior of the whole program if it contains undefined behavior.

[0] https://en.m.wikipedia.org/wiki/Principle_of_explosion
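
A minimal sketch of how that plays out in practice (log_warning and use are hypothetical helpers; whether a given compiler performs this exact transformation varies, but it is the kind of reasoning the definition permits):

  void process(int *p) {
      if (p == NULL)            /* executes "before" the UB...                 */
          log_warning("null!");
      int v = *p;               /* ...but this unconditional dereference lets  */
      use(v);                   /* the compiler assume p != NULL and delete    */
  }                             /* the warning branch above                    */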


I would note that the article is explicitly contesting the definition of UB that you are giving here (though you are absolutely right that this is the de facto definition used by all major compilers, and the committee).

Basically the article is arguing that UB should be similar to Unspecified behavior - behavior that the implementation leaves up to the hardware and/or OS.

I'm not sure where I fall on this issue, though I would note that the definition of UB in the standard needs quite a bit of interpretation to arrive at the commonly used definition you are quoting. That is, while I think the definition you give is compatible with the one in the standard, I don't think it is the definition of the standard, which is much softer about how UB should impact the semantics of a C program. In particular, nothing in the wording of the standard explicitly says that an implementation is expected to assume UB doesn't happen, or that a standard-conforming program can't have UB.


For me, the canonical UB example is a buffer overflow. No matter how you define UB, in practice a buffer overflow can result in, for example, a system crash or - given appropriate very specific input data - encrypting all the files on your hard drive for a ransom.

Requiring compilers to restrict UB to something similar to unspecified behavior (where the behavior is not specified by the standard, but a C implementation must pick a behavior and document its choice) would require C compilers to prevent arbitrary code execution given a buffer overflow in C code, i.e. ensure that a buffer overflow in C code does not ever result in an actual buffer overflow on the physical machine. This seems implausible to achieve - given the tradeoffs chosen for C, once UB hits, it really is undefined, as it can (not in all, but in some scenarios) result in executing absolutely arbitrary machine code depending on the data provided to the program.

On the other hand, if you'd just want to say that the compiler can't assume that UB won't happen for optimizations, then many optimizations become impossible, because theoretical UB is literally everywhere basic arithmetic happens, due to the possibility of integer overflow. If UB were required to do "whatever the hardware does", then you couldn't even reorder basic chains of arithmetic instructions - every operation would have to execute in order, as written, to get the appropriate behavior in case of overflow.
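
A commonly cited concrete case (a sketch, not taken from the thread; assumes a 64-bit target where int is 32 bits): because signed overflow is UB, the compiler may treat the induction variable as never wrapping, widen it to 64 bits, and turn the indexed access into a moving pointer. Under "whatever the hardware does" semantics it would have to preserve 32-bit wraparound and re-sign-extend i on every iteration.

  long sum_stride(const int *a, int n) {
      long s = 0;
      for (int i = 0; i < n; i += 2)  /* i could wrap only via signed overflow, */
          s += a[i];                  /* which the compiler assumes away        */
      return s;
  }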


It goes further than that. The debates around UB are more about whether the compiler can assume that there are no buffer overflows and perform optimizations based on that. For example, if you have a local variable `char buffer[16]`, and there is an access `buffer[i]`, should the compiler be allowed to derive that 0 <= i < 16, and perform optimizations on other uses of `i` based on that "knowledge"? If all bets are off for `buffer[i]` when i < 0 or i >= 16, why shouldn't it? But some argue that the compiler shouldn't, exactly because such automated formal reasoning can lead to arbitrarily wide-reaching effects. The problem is that it's hard to see where a middle ground could be.
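
A minimal sketch of the inference being debated (use_buffer is a hypothetical helper):

  int use_buffer(const char *buf, int i);   /* hypothetical */

  int classify(int i) {
      char buffer[16] = {0};
      buffer[i] = 1;           /* UB unless 0 <= i < 16, so the compiler may    */
                               /* record "i is in [0, 15]" from this point on   */
      if (i < 0 || i >= 16)    /* ...which makes this later check provably dead */
          return -1;
      return use_buffer(buffer, i);
  }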


If the array index is unconditional, then sure that seems like a reasonable undefined behavior? If the access is conditional and the compiler can't infer whether the branch is taken, though, then it shouldn't make that assumption.

This reminds me of a bug in the msp430 variant of gcc. The compiler would replace "x <= literal" with "x < literal + 1" because it generated more efficient code. If the literal was UINT_MAX, though, the compiler would trudge ahead and roll over the literal to 0. It would then subsequently reason that the comparison was unconditionally false, and then completely optimize away the nominal path of my code. That was a frustrating day of debugging!
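
A reconstruction of that failure mode (not the original source; LIMIT and do_work are made up, with LIMIT happening to equal UINT_MAX on the affected build):

  #include <limits.h>

  void do_work(unsigned int x);   /* hypothetical */
  #define LIMIT UINT_MAX

  void poll(unsigned int x) {
      if (x <= LIMIT)      /* rewritten by the buggy pass to "x < LIMIT + 1";  */
          do_work(x);      /* LIMIT + 1 wraps to 0, "x < 0" is always false    */
  }                        /* for an unsigned, so the call is optimized away   */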


The result is a perpetually unstable situation fostering inevitable errors when the programmer is surprised by the compiler. There's no remedy except to move away from C to more tightly specified languages.


The problem is ecosystems like UNIX-based platforms, which will never move to something else.

The only solution is to throw them away, e.g. Android style.


Android threw away a unix based platform? News to me.


It did; ask the Termux guys.

The Linux kernel is an implementation detail.

There is nothing UNIX-related exposed to userspace.

Userspace consists of Java and Kotlin applications, while the NDK APIs are clearly defined:

https://developer.android.com/ndk/guides/stable_apis

Or Treble Project for that matter,

https://source.android.com/devices/architecture/hidl

Try to use private APIs (which is what the Linux underpinnings are) and the application might be blessed with being killed by the OS.


What I never see mentioned in this type of discussion is what this compiler trick buys us. Like how much faster does my program run because of deductions like these? And are there any patterns that particularly benefit from it? Could they hint the constraint manually?


There are plenty of examples in this thread alone.

In my opinion hinting won't help. Everyone who cares about the performance gain will enable the hint (which I expect will be almost everyone) so you have won nothing but added noise with the hint.


There is not a single compelling example. Just contrived examples with dubious speedups and obvious errors.


The trouble here is porting the maximalist disaster of one scenario to another scenario simply because the same phrase is used to describe them.

Because a buffer overflow may lead to arbitrary control flow escape via ROP or other hijacking of the instruction pointer, it does not follow that e.g. signed overflow is similarly dangerous.

Signed overflow, if not guarded against, may enable buffer overflow, but the deeper irony is that detecting when signed overflow has already occurred is close to impossible if the compiler is also aware that the check could only succeed when the overflow (i.e. UB) has already occurred - and so removes it.

Thus, code designed for safety, guarding against buffer overflows by detecting signed overflow, ends up not protecting against buffer overflow, simply because signed overflow is described using the same name as buffer overflow.
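
A sketch of that irony (assuming a typical compiler that exploits signed-overflow UB): the after-the-fact guard is exactly the thing that gets reasoned away.

  int next_index(int len) {
      int total = len + 1;
      if (total < len)     /* "len + 1 < len" is provably false if overflow    */
          return -1;       /* is assumed never to happen, so the guard may be  */
      return total;        /* deleted and the wrapped value flows onward       */
  }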


Yes. Now, if the proposal were not to redefine the semantics of UB (as the post to which I replied suggested) but rather to specify that buffer overflows are UB1 and integer overflows are UB2, and UB2 means something different from UB1 (perhaps identical to implementation-specified behavior, perhaps something different), then that could be reasonable.


> given appropriate very specific input data

Those are the operative words there. Compilers are not required to ensure that, in sufficiently perverse circumstances, undefined behaviour never results in demons flying out of your nose. They are, however, required to not actively put said demons there themselves, because that's part of what distinguishes a programming language implementation from a piece of malware masquerading as a programming language implementation.


Compilers never actively try to put demons in your program. However, they do occasionally produce constructs, resulting from a reasonable (if not the most well-thought-out) chain of decisions, that end up looking like demons to the untrained eye.


I didn't say "try"; if it quacks like demon, it's a demon, and the implementation is (whether the standards commitee likes it or not) required not to actively put it into programs that didn't already have it[0], deliberately or otherwise.

0: like this one:

  int* zero = 0;
  int bad = *zero; // maybe crash lol
  if(!zero) abort(); // definitely crash lol
  printf("Uln nasaloth geb hai!\n");


It just isn't that simple in practice. Modern compilers automatically apply a plethora of different rules in sequence when transforming code. This results in a long chain of transformations where each operation is perfectly reasonable when considered on its own. Unfortunately, for certain inputs, the sum total occasionally appears demonic.

The current state of the art is such that there is no way to reliably prevent this that doesn't also severely reduce the optimization abilities of modern compilers. Realistically, adding such language to the standard would (at best) result in worse optimizations by default and additional compiler flags to reenable the more aggressive "nonstandard" ones. (Such flags already exist for various arithmetic operations.)


> where each operation is perfectly reasonable when considered on its own.

No, it is not. For example, the transformation:

  int read_and_discard = *p;
  // vvvv
  int read_and_discard = *p;
  __unsafe_assume_always(p != 0);
is not reasonable, since p is not, in fact, always nonnull.


Your examples are far too simplistic. Real world code is going to be far more complex and will tend to resist trivial analysis.

For example, please explain how to prevent this without also (inadvertently) preventing the removal of unnecessary null checks when functions are inlined. What about an unnecessary null check that's hidden inside a macro? What about whole program LTO?

A macro could be used in a variety of situations. In some of them, you need the null check for safety. In others, the null check is entirely redundant. Short of AGI, the compiler can't actually comprehend the code it operates on. Yet surely you expect it to eliminate "obviously" redundant work? It accomplishes this by applying a large set of fairly simple rules.

(There are probably much better examples but I can't think of them off the top of my head.)

In practice, when I play with Godbolt I find that the "obvious" cases tend to result in helpful diagnostic messages.


> For example, please explain how to prevent this without also (inadvertently) preventing the removal of unnecessary null checks when functions are inlined. What about an unnecessary null check that's hidden inside a macro?

If a null check is unnecessary, that's because it is reachable only from the not-null side of some previous null check. The compiler can track that information just fine, using the same tools it uses to track the (false) information it derives from pointer dereferences.

If:

  if(!p) abort();
  // we know p is non-null here, regardless of the dereference
  use(*p);
  MACRO(p);
expands to:

  if(!p) abort(); // first null check (this is relevant to optimizing out unnecessary null checks)
  use(*p); // pointer dereference (this isn't)
  if(!p) abort(); // second null check (unnecessary *because of the if*)
  utilize(*p); // more pointer dereference
the compiler can optimize out the second if (the "unnecessary null check" you refer to) based on the first if, regardless of whether the pointer dereference is even there.


This is still too simplistic. Think along the lines of:

    void foo(T *p) {
      if (!p) abort();
      bar(p);
    }

    void bar(T *p) {
      use(*p);
      MACRO(p);
    }
In other words, the first null-check and the latter check are in different functions that may not even be in the same compilation unit, or where the call sequence is hard to reason about due to function pointers etc.


Assuming bar isn't inlined (e.g. separate compilation unit, as you mentioned), the null check in MACRO is not unnecessary, because bar can be called from places other than foo, and those places might pass a null pointer.

This is one of several situations where it might be useful for the compiler to exercise its prerogative to implement operations without regard for undefined behaviour and rewrite `use(*p);` as `if(!p) abort(); use(*p);`, which would make the null check in MACRO unnecessary - but unless it does so, the check is not unnecessary, just insufficient.


I disagree. The transformation should be valid; if p is null then the result is undefined and may crash, including the rest of the program after that occurs. However, I do think that it should not make such a transformation for a volatile read/write; in that case it should not make such an assumption (who knows if it is meaningful on the target computer or some sort of emulator or operating system or debugger or whatever).


> In particular, nothing in the wording of the standard explicitly says that an implementation is expected to assume UB doesn't happen, or that a standard-conforming program can't have UB.

An implementation isn't expected to assume that UB doesn't occur, but it is allowed to assume that.

With regard to programs, the C standard has two different notions of conformance (cf. chapter 4 Conformance). There are strictly conforming programs, which may not rely on anything that would depend on the specifics of a particular C implementation, which of course includes UB. Strictly conforming programs are thus guaranteed to have the same behavior on all conforming C implementations. Then there is the larger class of conforming programs, which is defined as programs acceptable to a (particular) conforming C implementation.

Strictly conforming C programs are severely limited in what they can do. It has been argued that there are hardly any useful strictly conforming C programs.

Non-strictly conforming programs are conforming with respect to a specific C implementation. It is then the job of the C implementation to define what programs it accepts beyond strictly conforming ones, and how it handles (or doesn't handle) undefined behavior. Everything is possible here, from the infamous DeathStation 9000 (with the most insidious UB behavior) to a fully deterministic C implementation that processes any program in the most unsurprising developer-friendly fashion.

There is arguably a misconception that C is a single language. It is effectively rather a family of languages, for which the C standard only defines a common denominator.


I despair at writing C code where the compiler won't silently surprise me. How are we supposed to learn all these subtle rules and intuit when to apply them without making any mistakes?


By learning only the most important rules, then enabling the compiler's undefined behaviour sanitizer(1) during development, making mistakes, and fixing them.

(1)https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#...


> and the committee

I'd argue that for a document like the C standard, if there's a well-known intended meaning, that is the meaning of the document - any other interpretation is purely academic.


To some extent, though the committee has not moved to make this definition explicit in the standard (UB = behavior the compiler is free to assume can't happen), for one reason or another.


For the same reason that it came into the standard in the first place.

No compiler vendor wanted to give up their own C variant "features".


These are all interpretations. You’re trying to give special privilege to one interpretation by calling it the intended meaning, when really it’s what I’d call an authoritative interpretation: the accepted interpretation by an authority. That doesn’t necessarily make it the actual intended meaning though.


I was under the impression the "committee" here was the people who wrote the C standard? So the accepted interpretation by the people who wrote the document must be the intended meaning, no? If it wasn't, they'd have written something else.

(Sure, it's possible they wrote something, then only later came to understand the implications of this. But the standard has been understood as allowing nasal demons since long before the latest revision of the C standard, so I'd argue that by publishing a new version without rewording that section, they're making clear that they intend its current interpretation to be part of the standard.)


It sounds like you're arguing that the authors' interpretation isn't an authoritative interpretation.

Is this what you meant?


from ISO/IEC 9899:2011 "Programming Languages -- C"

    3.4.3

    1 undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

    2 NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
It doesn't look like a leap to go from this definition to that of ignoring the situation completely with unpredictable results.


It's not a huge leap, but it still doesn't mean that UB by definition means that the compiler is allowed to assume UB doesn't happen. Allowing the compiler to assume this is one reasonable result of this definition (I give an example of how this reasoning works somewhere else), but it is not equivalent to the definition of UB, in principle.


The definition logically implies that compilers are allowed to assume UB doesn't happen. The definition is:

> undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.

With these optimisations, either:

1. The program contains no UB, the transformed code acts as expected, and the compilation is valid.

2. There's UB; the standard imposes no requirements on what behaviour that translates to; and so the transformed code is valid, regardless of what it does.


It implies it, but they are not equivalent. Basically the common interpretation of UB is sufficient, but not necessary, given the standard.


This quote is the topic of the original article and the article goes into detail about how it believes the quote should be interpreted.


...and yet the article completely ignores the "with unpredictable results" part and instead spends a lot of time discussing all the other valid consequences (which are also only mentioned as examples, at least in the common understanding of "from ... to ...").

Downthread commenters go into more detail regarding the "ignoring e.g. the possibility of signed overflow may mean to assume that it never happens" reading, so I won't elaborate on it here.


Leaving overflow to the processor is an example of ignoring it with unpredictable results. Deleting overflow checks because you assume, incorrectly, that overflow is impossible is not an example of ignoring it with unpredictable results - though it does produce unpredictable effects.


How is the compiler supposed to know that a particular operation is intended as an overflow check though? It isn't a human and it doesn't actually comprehend the code it operates on. It just blindly applies rules.

I want the compiler to eliminate redundant operations. That's a large part of the point of doing optimizations in my view! Best effort attempts to avoid eliminating obvious sanity checks are desired of course, but I doubt it's feasible to reliably identify those short of AGI. (And at that point, why are you still writing code?)


You are advocating incorrect code that uses a few less machine operations than correct code.


The compiler should have valid rules, not invalid ones.


The original article's interpretation seemed untenable.

While the difference between "Permissible" and "Possible" could be quite significant, in this case, it was qualifying:

> [Permissible/Possible] undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

The previously-"Permissible" behaviors were so broad that they basically allowed anything, including translating the source-code in any documented manner.. which basically means that, as long as a compiler says how it'll treat undefined-behavior, it can do it that way, because it's free to completely reinterpret the source-code in any (documented) manner.


To me, "ignore the situation completely with unpredictable results" would mean: the compiler generates assembly that need not be correct, and the behaviour of the program is then determined by what the processor does.

Doing something like removing checks is not ignoring the situation: it is acting in a particular way when undefined behaviour is detected.

Nor does it have unpredictable results: it specifies what happens, since those checks are removed systematically.

I don't see anywhere in the standard that compilers are free to change, at their choice, the semantics of the program if undefined behaviour is detected. Rather, undefined means to me that the compiler generates code whose result cannot be known, because it will depend on external factors (basically the hardware implementation).


This is untenable, because different compilers will generate different sequences of instructions which will misbehave in different ways. For example, one compiler may choose to reorder the actual memory access until much later, to a branch of the code that doesn't execute in some cases, so "the hardware implementation" could vary from "nothing happens" to "consistent SIGSEGV".


Compilers don't "detect undefined behaviour": they assume no undefined behaviour is present. It is not possible to change the semantics of the program that contains undefined behaviour because undefined behaviour means there are no valid semantics to begin with.

This is exactly the same situation as dividing by zero in mathematics. If your proof relies on division by zero, you can always prove anything true (the classic 1 == 0 proof for example).


> - "Undefined behavior" means that C implementations are allowed to assume that the respective runtime condition does not ever occur, and for example can generate optimized code based on that assumption.

Please note that the article is making the specific argument that this interpretation of UB is an incorrect interpretation. The author is arguing that you, me, the llvm and gcc teams are wrong to interpret UB that way.

Linux had a bug in it a few years ago; the code would dereference a pointer, then check if it was null, then return an error state if it was null, or continue performing the important part of the function. The compiler deduced that if the pointer had been null when it was dereferenced, that's UB, so the null check was unnecessary, and optimized the null check out. The trouble was that in that context a null pointer dereference didn't trap (because it was kernel code? not sure), so the bug was not detected. It ended up being an exploitable security vulnerability in the kernel, I think a local privilege escalation.

The article is making the argument that the compiler should not be free to optimize out the null check before subsequent dereferences. The compiler is permitted to summon nasal demons where the pointer is dereferenced the first time, but should not be free to summon nasal demons at later lines of code, after the no-nasal-demons-please check.

(The linux kernel now uses -fno-delete-null-pointer-checks to ensure that doesn't happen again. The idea is that even though it was a bug that UB was invoked, the failure behavior should be safe instead of granting root privileges to an unprivileged user.)

Fun with NULL pointers part 1 https://lwn.net/Articles/342330/

Fun with NULL pointers part 2 https://lwn.net/Articles/342420/


> The trouble was that in that context, a null pointer dereference didn't trap, (because it was kernel code? not sure.) so the bug was not detected.

Yes, because it was kernel code. Because that dereference is completely legal in kernel code. The C code was fine, assuming that it was compiled with appropriate kernel flags. This was not a bug in Linux, at least not on the level of the C code itself.

> The linux kernel now uses -fno-delete-null-pointer-checks to ensure that doesn't happen again.

I also seem to remember that it was already using other "please compile this as kernel code" flags that should have implied "no-delete-null-pointer-checks" behavior, and that the lack of this implication was considered a bug in GCC and fixed.


By the way, dereferencing NULL is well defined behaviour on every computer architecture: you are basically reading address 0 of memory. It just causes a crash if you have an operating system, since it will cause a page fault, but in kernel mode or on devices without an OS it is a legitimate thing to do (and even useful in some cases).

Why should C compilers make it undefined? The standard doesn't mandate that undefined behaviour should change the semantics of the program. Just define all the undefined behaviour that you can; to me, keeping it undefined makes no sense (even from the standard's point of view - everyone knows that if you overflow an int it wraps around, so why should it be undefined?)
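
A sketch of the bare-metal case being described (the hardware detail is an assumption, not from the thread): on many Cortex-M parts, for example, the initial stack pointer value sits at physical address 0, so "reading address 0" is perfectly meaningful there - yet as far as ISO C is concerned the dereference below is still UB, because the pointer is formed from a null pointer constant.

  #include <stdint.h>

  uint32_t initial_stack_pointer(void) {
      volatile uint32_t *vector_table = (volatile uint32_t *)0u;
      return vector_table[0];   /* a real, useful read of address 0 */
  }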


NULL is not required to have a bit representation of all zeroes. If you are programming for a low-level hardware device, it might be worth your while to get a C implementation that does not represent the NULL pointer this way.


Pointers are hardly even required to have a bit representation at all! [1]

This is one of the most common impedance mismatches between programmers and the C spec. C does not mention how the machine should handle memory. Variables and pointers are just abstract constructs there. While many programmers think their programs are a giant char* on their DRAM stick that can be fiddled with at any time and in any way they please.

([1] Actually this is not entirely true, but it is better to think of them this way for the sake of not making more assumptions about memory, which would be UB. Pointers are allowed to be converted to integer types - but with the caveat that a lot of the behaviour around it is implementation-defined and of course surrounded with a big dose of UB as well!)


I’m not totally sure, but I think dereferencing a zero pointer is theoretically not undefined behavior in C, just so long as you didn’t obtain the zero pointer by initializing a pointer with a null pointer constant.


I noticed that I mixed up "implementation-defined behavior" and "unspecified behavior" a bit. Here are the actual definitions:

implementation-defined behavior: unspecified behavior where each implementation documents how the choice is made

unspecified behavior: use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance


Yeah, I was about to post about this. Note that implementation-defined behavior and unspecified behavior are not _only_ used in places where a choice is given. Quite a few uses of implementation-defined behavior do not actually select between options. It just says the behavior is implementation-defined. So I suppose it selects between infinite options.

For this kind of behavior there is actually no limit on what the implementation is allowed to do, including assuming it never happens and optimizing accordingly. Implementations just need to document what they do.

I don't think these attempts to change the definition of UB are useful. As a compiler vendor it doesn't help to just say "it has a behavior", because that doesn't stop me from doing exactly what I do today. If people want some specific behavior, or to limit behaviors, then they need to actually say that in the spec.

To take left shift for example. Instead of saying it's implementation-defined behavior, they should say that it produces an implementation-defined non-trap value in the range of the resulting type that is consistent for the same inputs.

This would be pretty short to write in standardese, doesn't allow UB-based optimizations, and allows the required implementation divergence.
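
For illustration (a sketch of what that wording would cover, assuming a 32-bit int): both shifts below are undefined behaviour in C11 as written, even though mainstream two's-complement targets produce an obvious value; under the proposed wording they would instead yield some consistent, implementation-defined, non-trapping int.

  void shift_examples(void) {
      int a = 1 << 31;   /* UB today: result not representable in a 32-bit int */
      int b = -1 << 1;   /* UB today: left shift of a negative value           */
      (void)a; (void)b;
  }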


To my knowledge, in case of unspecified behaviour, it's not required to pick and document a particular choice. Behaviour may even vary on a case by case basis where it makes sense (eg order of evaluation of function arguments, whether an inline or external definition of an inline function gets used, etc).

The need to properly document the choice is the defining characteristic of implementation-defined behaviour.


First: I do dislike how hard it is to avoid some UB / how impractical some of the rules are.

But I also think a lot of discussions of this topic caricaturize compiler writers to a ridiculous degree, almost describing them as writing optimization passes that look for UB so they can over-optimize something, while cackling loudly in glee about all the programs they can break.

The set of people doing so overlaps with the set of people complaining that the compiler doesn't optimize their code sufficiently to a significant degree.

Lots of compiler optimizations need to know the ranges of values, hence logic to infer ranges. One of the sources for that is "can't happen" style logic - which nearly all of the time infers things the code author would agree with if they thought long and hard, not just about the code as written, but also about what the code looks like after inlining (across TUs with LTO).


I agree.

I don't have much sympathy for people who were doing things like writing multithreaded programs in the days before C documented its memory model, and who then became unhappy because new optimisations that legitimately help single-threaded code broke their programs.

In my experience C compiler maintainers have generally been open to the idea of offering guarantees beyond a narrow reading of the standard, but they want to be able to clearly state what it is that they're guaranteeing. "Keep old programs running" isn't enough.

I think the "Prevailing interpretation" that Yodaiken complains about is coming from the same place as suspicion of the "be lenient in what you accept" IETF principle: that sort of thing doesn't lead to robustness in the long run.

The way forward at this point is surely to define more things that are currently undefined (whether in the standard or by widely-implemented extensions).


> the same place as suspicion of the "be lenient in what you accept" IETF principle

No. It's fine (arguably desirable, but reasonable people might disagree) for implementations that encounter undefined behaviour to terminate execution immediately, especially if it's with an error message. The problem is when implementations silently, willfully misinterpret what they accept, particularly in ways that cause (not "expose"; reading from a null pointer and discarding the value, for example, isn't) security vulnerabilities.


> But I also think a lot of discussions of this topic caricaturize compiler writers to a ridiculous degree.

The inflamed backlash should tell you just how damaging it is to impose silent failure on meticulously written, previously fine programs.


> meticulously written, previously fine programs

With relatively few exceptions, if your program hits undefined behavior, then your program was already doing something pretty wrong to begin with. Signed overflow is a poignant example: in how many contexts is INT_MAX + 1 overflowing to INT_MIN actually sane semantics? Unless you're immediately attempting to check the result to see if it overflowed (which is extremely rare in the code I see), this overflow is almost certain to be unexpected, and a program which would have permitted this was not "previously fine" nor "meticulously written."

I feel compelled right now to point out that software development is a field where it is routine to tell users that it's their fault for expecting our products to work (that's what the big all-caps block of every software license and EULA says, translated into simple English).


Triggering signed overflow and testing whether it has occurred was how I protected the Delphi RTL memory allocation routines from signed overflow attacks, attacks which if not prevented, lead fairly directly to buffer overruns.

The code was written in Pascal and assembly, though, so it was safe from a C compiler.


I am quite thankful that I learned systems programming via BASIC, Pascal and Assembly before coming to C.

As such my vision of low-level coding isn't tainted by the ways of C.


> in how many contexts is INT_MAX + 1 overflowing to INT_MIN actually sane semantics? Unless you're immediately attempting to check the result to see if it overflowed (which is extremely rare in the code I see)

It's not that rare - I know that postgres got bit by that particular issue, and several other projects as well. Particularly painful because that obviously can cause security issues.


Isn't checking for overflow depending on UB? i.e. in

    int a;
    // lots of code
    int b = a + 1;
    // check for overflow
    if (b <= a)
        abort();
the compiler is allowed to remove the check because in the absence of UB a + 1 > a, therefore the conditional is always false.


As brilliantly said further in this thread by someone else, maybe check that something is OK before doing it.

In that case, something like:

  [...]
  if (a > INT_MAX - 1) abort();
  int b = a + 1;
  [...]


> As brilliantly said further in this thread by someone else, maybe check that something is OK before doing it.

Much easier said than done:

https://github.com/postgres/postgres/blob/master/src/include...


Good point that checking for overflow of a multiplication beforehand is hard. But I don't think checking after the fact would be much easier, so a C modification to remove the undefined behavior probably wouldn't be very useful anyway.


The thing is that it's necessary, and (on x86, not sure about ARM) the asm tells you via the d register, if not the overflow flag. So why doesn't the C standard provide a misuse-resistant way of doing so? Instead, every project has to reinvent it, and gets it wrong if it does the "obvious" (but wrong) thing.


What would the misuse-resistant way be? Would you want the multiplication operator (*) to return 2 values: the number and a bool telling you whether it overflowed?

Yeah, a standard library function that does this would be good. But many people would just use * instead of this function, and so the problem would partially remain.


mul_would_overflow(), mul_wrap(), mul_saturate(), mul_do_whatever_the_hardware_does_just_please_dont_break_my_code_by_assuming_ub. Can't help people who just use * and don't care about overflow without breaking compatibility, but at least the people who do care won't have to reinvent safe arithmetic.
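
A sketch of what two of those helpers could look like today (the wrapper names are made up; the GCC/Clang builtin they lean on does exist):

  #include <limits.h>
  #include <stdbool.h>

  static bool mul_would_overflow(int a, int b) {
      int unused;
      return __builtin_mul_overflow(a, b, &unused);   /* true iff it overflows */
  }

  static int mul_saturate(int a, int b) {
      int r;
      if (!__builtin_mul_overflow(a, b, &r))
          return r;
      return ((a < 0) != (b < 0)) ? INT_MIN : INT_MAX; /* clamp toward the sign */
  }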


Yeah, I agree with you. But it would be hard due to there being so many integer types. A binary operation needs to consider 3 types: the types of each argument and the type of the result.

In C++ it would be a little simpler due to templates, so the types of the arguments to the function can be derived. But the type of the result can still confuse programmers. Although maybe it's not so bad, because an overflow that happens during multiplication (or other math operations) is undefined behavior, but an overflow that happens during assignment of the multiplication result to a variable that can't hold it is only implementation-defined behavior, not undefined behavior.


> With relatively few exceptions, if your program hits undefined behavior, then your program was already doing something pretty wrong to begin with.

The author claims: “We have the absurd situation that C, specifically constructed to write the UNIX kernel, cannot be used to write operating systems. In fact, Linux and other operating systems are written in an unstable dialect of C that is produced by using a number of special flags that turn off compiler transformations based on undefined behavior (with no guarantees about future “optimizations”). The Postgres database also needs some of these flags as does the libsodium encryption library and even the machine learning tensor-flow package.”

So basically the programmers of the most used C programs consider the C standard so broken that they force the compiler to deviate from it. (Or are they not able to do things right?)


Somewhat similar to the SQL situation (the difference being, I am not sure anybody implements ANSI SQL fully).

The standard is just a common denominator agreed on by actual competitors. (The standard takes into account the chipset where incrementing max_int produces a beep instead of min_int.) If a user wants to take more advantage of the product (DBMS/compiler/hardware), they must sacrifice portability and use nonstandard extensions.


> extremely rare in the code I see

Well, I learned of the change in compiler behavior some years back because I had written loop code with a sanity check which depended on signed integer overflow wrapping, along with a test case to prove that the sanity check worked, and that test case started failing:

   not ok 2 - catch overflow in token position calculation

   Failed test 'catch overflow in token position calculation'
   at t/152-inversion.t line 70.
To the extent I can be, I'm done with C. I leave it to people who think that silently optimizing away previously functional sanity checks is an acceptable engineering tradeoff, and who disparage those of us who have been bitten.


> people who think that silently optimizing away previously functional sanity checks is an acceptable engineering tradeoff

To be fair, as several people and TFA have pointed out, this isn't a problem with C, but with defective/malicious C compilers. Admittedly, that's not much help if you can't find a compiler that isn't defective/malicious, though, so I can only wish you the best of luck.


I don't get what you or the comment you're responding to are wishing for.

Do you want compilers to stop adding optimizations while staying within the bounds defined by the spec? That they somehow guess that a given piece of code that may trigger UB is too important for them to optimize it based on the assumption that the developer knew what she was doing and ensured that it wouldn't?

The case of a compiler update breaking the test sounds desirable to me. It pinpoints a critical piece of code that was relying on a specific implementation's behavior that is not specified. This could have been triggered by a compiler or architecture change. If this is something that is a hassle to fix immediately, you can temporarily downgrade to the version of the compiler used earlier, or disable some optimizations.

I fail to see why one would consider a compiler evolving while conforming to language specification defective or malicious (however, with that definition I fear that finding one that isn't may indeed be difficult).


I want default behavior which does not surprise me. I have come to understand that this probably means I want a different language, because the C spec essentially requires compiler authors to make optimizations which introduce surprising semantics in order to compete on performance with other languages.

It would be fine if I could opt into new semantics — something like Rust's "editions" would resolve my objections about these compiler optimizations. But that doesn't seem to be on offer in the C ecosystem.


Only moving away to other languages will do it.

C culture, and to a certain extent the Objective-C and C++ ones, is tainted by micro-optimizations while typing, of which the compilers are the worst examples.

Unfortunately UNIX and C go together, so those that want to keep UNIX-like platforms around better fix C somehow.


That sucks. I can imagine next generation operating systems improving on Unix technically (standing on the shoulders of giants, yo), but establishing standards a la POSIX will be extremely difficult. It's hard to fight against the interest of platform vendors to lock their users in.


Microsoft fought for several years against C, trying to migrate everyone to C++ as the future of Windows systems programming.

Yet Azure Sphere, despite its security message, uses C only SDK.

Meanwhile Visual Studio now supports C11 and C17.

Market pressure, their customers weren't willing to buy into it, and the new Microsoft also wants all those POSIX FOSS packages written in C running on Windows.


I think that's unfair. The taint is 100% on the C++ side of things.

Hopefully the newer compiled languages will kill C++.


Nope, the taint lies 100% on the copy-paste compatibility with a C subset.


All you've done here is show that with C the concept of the l-user is still around.


> I don't get what you or the comment you're responding to are wishing for.

Quoting from another of my comments:

> > [What are you objecting to?]

> Inferring any propositional statement about the program (eg "this pointer is not null") from the fact that its negation would imply undefined behaviour.

That is what the problem is. Undefined behaviour is a licence to implement operations without regard for unusual corner cases, not to infer the absence of said corner cases from those operations and then apply that 'knowledge' elsewhere.

> I fail to see why one would consider a compiler evolving while conforming to language specification defective or malicious

It's https://en.wikipedia.org/wiki/Malicious_compliance in near-textbook form.


> Inferring any propositional statement about the program (eg "this pointer is not null") from the fact that its negation would imply undefined behaviour.

I'm not completely sure I understand that correctly, but do you mean that statements that are constant unless considering a possible (and "credible"?) implementation of UB shouldn't be fair-game for the compiler to optimize out? EDIT: I think I see a more tricky case that may be one of those you're referring to. Dereferencing a pointer further in the code shouldn't be a valid justification for optimizing out previous tests of it being null. I can relate with that but I suspect that it would prevent many classes of branch pruning.

I get what you suggest while describing malicious compliance but I can imagine that it could be a false impression resulting from trade-offs that favor optimization opportunities to "out-of-spec but canonical/natural" implementations.


> Dereferencing a pointer further in the code shouldn't be a valid justification for optimizing out previous tests of it being null.

Not further. Anywhere. If the implementation wishes to rewrite pointer dereferences from `use(*p)` to:

  if(!p) abort();
  use(*p);
it may do so (undefined behaviour!), but if it chooses not to do so, it may not later pretend that it did, and remove an explicit `if(!p)` that the programmer wrote. Given:

  use(*p);
  if(!p) return NOPE;
this is fine:

  if(!p) abort();
  use(*p);
  //if(!p) return NOPE; // unreachable because of if, not because of use
but not:

  use(*p);
  //if(!p) return NOPE; // CVE-20XX-XXXXX
The implementation can optimize based on information (like "p is not null") that is actually true (even if that's because it made it true), but not based on information it assumed was true on the basis that it counterfactually could have made it true (but didn't).

> it would prevent many classes of branch pruning.

Yes, that's the general idea.


What is the meaning of use(*p) in case p is null? what should the compiler emit?


> What is the meaning of use(*p) in case p is null?

It dereferences a null pointer, invoking undefined behaviour, then calls the function `use` with the resulting value.

> what should the compiler emit?

Probably something to the effect of:

  ld r0 [sp+.p]  # if p is not already in a register
  ld r0 [r0]  # *p
  jsr use
but it would be fine to emit something like:

  ld r0 [sp+.p]  # if p is not already in a register
  jz r0 .panic
  ld r0 [r0]  # *p
  jsr use
because the jz can only be taken when undefined behaviour happens.


If p is statically null, it should emit a compile-time error. If not, it should emit a machine instruction to dereference p.


In all of the examples above it will emit a machine instruction to dereference p. What your grandparent is complaining about is that it will later remove an "if (!p)" test.


> I fail to see why one would consider a compiler evolving while conforming to language specification defective or malicious

Conforming to the spec is not a virtue. We want the compiler to be reasonable, regardless of whether the spec is. When the spec is malicious, conforming to the spec is malicious behavior.

For example, the Java spec says that the << operator (bit shift left) will accept any right operand, but performs a modulo-32 operation on the right operand before doing the shift. So `a << 4` is `a` shifted 4 places left, `a << 20` is `a` shifted 20 places left, and `a << 36` is `a` shifted 4 places left again.

This is absurd, and I'm comfortable calling it a bug in the spec. `a << 40` needs to have 0 in the lowest 40 bits. It does not need to have random values in bits 8-31.

This behavior is documented, but that doesn't make things better, it makes them worse.

But the philosophy that says "if it's documented, then it's OK" doesn't even allow for the concept of a bug in the spec.


I recall a bug-report discussion that I sadly have never been able to find. It contains a pretty bad side-effect of this.

It had code like:

    int *p;
    // lots of code
    
    if (p != NULL)
        return 1;
    // use p
Then a later refactor wrongly added a single line before the if statement:

    int *p;
    // lots of code
    
    int a = *p;
    if (p != NULL)
        return 1;
    // use p
This meant the null check was optimized away: since dereferencing a null pointer is undefined behavior, the if statement can be assumed to be always false. This then led to actual errors (perhaps even an exploit, I do not recall) arising from the removed null check.

I think in general the "sanity check" cases are the worst. It is hard to determine whether an expression causes undefined behavior if you cannot try and evaluate it. Perhaps a compiler intrinsic that checks (at runtime) whether an expression causes undefined behavior could be useful here. Though I can imagine such an intrinsic being essentially impossible to implement.


This was in the Linux kernel, which is compiled with special kernel flags which make dereferencing null pointers legal. In the context of that code, dereferencing a pointer and later checking it for null was absolutely meaningful. Optimizing away the later null check was a compiler bug that was acknowledged and fixed. The compiler here didn't respect the semantics it had promised to kernel code. This was entirely uncontroversial; it was not a case of "unwanted optimization based on undefined behavior", it was a case of "compiler bug breaking well-defined code". Again, all this in a kernel context.

In user code GCC will still happily remove the null check because in user code this is an actual bug in the user's code.


Actually, there's a even worse version:

  struct foo { ...; bar_t bar[NBAR]; };
  struct foo* p = ...;
  bar_t* q = &p->bar[0]; // add rq, rp, #foo_bar_offs
  // other declarations
  
  if(!p) return NOPE; // optimized out
  // use p and q
Not even any dereferencing, just pointer arithmetic.


Could you post a complete example please and tell us which compiler optimizes out this check? Clang and GCC don't: https://gcc.godbolt.org/z/dbTadEcra


Yes, that one is especially nasty, as &p->bar[0] just adds a compile-time-constant offset to p - essentially what the offsetof() macro of stddef.h gives you - so the pointer is never actually dereferenced.



If the program was previously "fine" on version x.y.z of some compiler, then it is most likely still fine on it. That's the target that the program was written for.

There's some disagreement on whether you can call a program "fine" that breaks after switching to a newer version, or a different compiler.

I see a lot of programmers out there that unfortunately use the behavior of their code on whatever compiler they're using at the moment as a proxy for what the language actually guarantees.


Well, I can imagine a program having something where the new C comment syntax ( // ) makes it not divide, like:

  a = 5 //* junk here */ 2 ;

I'm sure there are IOCCC or Underhanded C Contest entries doing this to make code work differently on compilers based on whether // starts a comment or not.

Sure, it is an ugly way of writing stuff and you'd be hard pressed to find lots of real-world traps like this, but when/if you did have code that "suddenly" miscompiles, you might actually think your old code with an old compiler did work, and a new compiler for "the same" language breaks your program. I don't think every code base should need a full rewrite every time a new compiler comes out.


The current situation is not good for compiler writers either. But nobody has ever shown that either C programmers want to sacrifice safety for "optimizations", or that these UB optimizations actually improve performance of anything.


What do you mean by "these UB optimizations"? C is a low-level language; it's basically impossible for a compiler to reason about the code unless it makes certain assumptions. It needs to assume the code is not self-modifying to do pretty much any code-generation more intelligent than a macro assembler. It needs to assume the code isn't messing with the stack frames/return addresses in order to inline functions. It needs to assume the code isn't using an out-of-bounds pointer to access a neighboring local variable, so that it can move local variables into registers. "gcc -O0" is a good approximation for the performance you get if the compiler isn't allowed to optimize based on UB.

Yes, that means C compiled without UB-based optimization is slower than Java. Optimizations need some form of reasoning about what's happening. For Java it's optimizing based on guarantees provided by the language (there are no raw pointers that could mess with the things listed above). But C doesn't provide any hard guarantees, so instead it needs to blindly assume that the code will behave sanely.

Also note that for many of the more manageable sources of UB, most compilers provide a choice (-fwrapv, -fno-strict-aliasing, ...). Yet few projects use these options, even when they use other gcc/clang-specific features. Doesn't that indicate that C programmers indeed want to sacrifice safety for optimizations?


Exactly.

For example there were programs 30-40 years ago that relied on exact stack layouts. These days everybody would agree they are completely broken.

The issue of course is that it is extremely hard to write programs that have no UB. It would be nice for compilers to have an option to automatically introduce assertions whenever they rely on some UB-derived axiom, basically as a sort of lightweight sanitizer.

In fact if we had sanitizers 30-40 years ago probably things would be better today.


> It would be nice for compilers to have an option to automatically introduce assertions whenever they rely on some UB-derived axiom

Modifying a value from a different thread without synchronization is UB. The compiler assumes this does not happen in order to e.g. move things into registers. Could you elaborate how (and how often) you would like to have this kind of UB-derived axiom ("this value remains the same from here to there") checked with assertions?
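
A sketch of the "move things into registers" point (names made up): without the no-data-race assumption, `done` would have to be reloaded from memory on every iteration; with it, the compiler may hoist the load and spin forever on a stale value. Checking that assumption at runtime would mean instrumenting essentially every memory access.

  int done;                 /* hypothetically set to 1 by another thread */

  void wait_for_done(void) {
      while (!done)         /* load of `done` may be hoisted out of the loop  */
          ;                 /* (the usual reason this needs atomics/volatile) */
  }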


Obviously you wouldn't be able to catch many, or even most cases. Use-after-free is another case that would be very expensive to detect.


I was using it in 2000,

https://www.parasoft.com/products/parasoft-insure/

21 years later it is still an uphill battle to adopt such technology in C and C++ projects.


We have had sanitizers for as long as C has existed - since 1979, to be more exact.

"Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions. "

-- https://www.bell-labs.com/usr/dmr/www/chist.html


I was going to add "in open source compilers" to hedge my statement :).

That seems to be a static analysis tool though (which generally have not been great). Did it also inject runtime checks?


No, but still, not even that gets the love it deserves.

And a famous commercial variant of it has been PC-lint from https://www.gimpel.com/.

Being available on open source compilers does little to change the culture; as per the latest surveys, only 11% of developers care to use any kind of tooling for improving their code quality in C and C++.

At CppCon a couple of years ago, only about 1% of the audience answered positively to Herb Sutter's question.


Those numbers are indeed a bit depressing.


Here is a recent survey, with a more positive number of about 37%, see question 10.

https://isocpp.org/files/papers/CppDevSurvey-2021-04-summary...


That's a good example, because nobody would complain if stack layouts changed and those programs failed. But if the compiler chooses to "optimize away" checks on stack layout, that's a different thing altogether. Also note that if you use pthreads or Linux clone, or you are writing an operating system, you may need to rely on exact stack layouts even today.


Stack layouts are only really relevant at ABI boundaries. In these cases the layout is usually specified in extensions to C or in other ways, such as handwritten assembly.


Linux clone, pthreads, and OS code commonly look at stack boundaries.


Not sure what you are referring to with stack boundaries. Of course the ABI imposes some minimal requirements at ABI visible points, but these days you can't even rely on the existence of frame pointers to traverse the stack and you have to use the DWARF unwind machinery. And the content of the stack frame itself is completely unspecified of course.


So I create a thread with a custom stack which is an allocated buffer. At the top, I write a sequence of bytes in some order. Then I periodically read the top of the stack to see if the stack is getting close to overflow. Meanwhile, the thread code is also addressing the same store.


For your last point, the extent of UB driven changes to semantics is still not widely known in the programmer community. Programmers don't read the standard - they read K&R, and K&R is right now describing a different language. We've had 15 years of programmers repeatedly filing bug reports to be told that the expected, tested, relied on, behavior was ephemeral. Only very sophisticated projects figure out about UB.

Of course compilers have to make assumptions. The debate is (a) over what assumptions it is proper to make and (b) what are the permissible behaviors. The false dichotomy: either do without any optimizations at all or accept whatever UB gives you, is not a useful approach.


So what optimizations do you mean with "these UB optimizations" then? And would it change your mind to see a benchmark proving the usefulness of that particular UB optimization?


> So what optimizations do you mean with "these UB optimizations" then?

Inferring any propositional statement about the program (eg "this pointer is not null") from the fact that its negation would imply undefined behaviour.


e.g. assuming UB can't happen: deleting overflow or null pointer checks, deleting comparisons between pointers that are assumed to point at different objects, ...
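For the aliasing one, a small sketch of my own to make it concrete: under strict aliasing the compiler may assume an int * and a float * never refer to the same object, so it can skip the reload.

    int f(int *ip, float *fp) {
        *ip = 1;
        *fp = 2.0f;   /* assumed not to alias *ip, because the types differ */
        return *ip;   /* may be folded to the constant 1 */
    }
If the caller passes the same address for both (which is UB), the optimized code can still return 1 even though the store changed the bytes; -fno-strict-aliasing disables exactly this assumption.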


I'm surprised you don't think deleting a bunch of checks will improve performance.


> most compilers provide a choice (-fwrapv, -fno-strict-aliasing, ...). Yet few projects use these options,

Even if you opt into those (and the projects I've been involved with have), the precedent is established: when new compiler optimizations are introduced with disruptive semantics, they are opt-out, not opt-in — a fail-dangerous failure mode.

> Doesn't that indicate that C programmers indeed want to sacrifice safety for optimizations?

Maybe so. Which is one reason I wouldn't call myself a "C programmer" any more. The demands to program responsibly in C are absurdly high.


> Even if you opt into those (and the projects I've been involved with have), the precedent is established: when new compiler optimizations are introduced with disruptive semantics, they are opt-out, not opt-in — a fail-dangerous failure mode.

Actually, the default for gcc is -O0. You are opting in with -O2 etc.


That is exactly how C should have kept being used: a portable macro assembler, while leaving everything else on the IT stack to saner languages.

BCPL was anyway designed to Bootstrap CPL, nothing else.


> But nobody has ever shown … that these UB optimizations actually improve performance of anything.

That’s just not true. Already examples in this thread: https://news.ycombinator.com/item?id=27223870


To do UB "optimizations", the compiler first needs to figure out that there is an UB it can "optimize" anyway. At this point instead of "optimizing" it could, and in my humble opinion absolutely should, blow up the compilation by generating an UB error, so people can fix their stuff.

What about backwards compatibility in regards to a new compiler version deciding to issue errors on UB now? You don't have any guarantees about what happens with UB right now, so if you upgrade to a new compiler version that generates errors instead of "optimizations", everything would still be as before: no guarantees. And it's frankly a lot better to blow up the compilation with errors than to have the compiler accept the UB code and roll the dice on how the final binary will behave later. You can either fix the code to make it compile again, or use an older "known good" version of the compiler that you previously used as a stopgap measure.

I fail to see any reason whatsoever why compilers are still doing all kinds of stupid stuff with UB instead of doing the right thing and issuing errors when they encounter UB.

I also fail to see why the C language designers still insist on keeping so much of the legacy shit around.


> To do UB "optimizations", the compiler first needs to figure out that there is an UB it can "optimize" anyway.

The compiler assumes UB will never happen and it makes transformations that will be valid if there happens to be no UB. This doesn't require any explicit detection of UB, and in some cases UB or not is simply undecidable at compile time (as in no compiler could detect it without incorrect results).
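A tiny illustration of such a valid-if-no-UB transformation (an example of mine, not something from the thread):

    /* a compiler may simplify this to "return x;", which is only correct
       because x * 2 overflowing is UB; with -fwrapv the wrapped intermediate
       result would have to be honored and the division kept */
    int halve_double(int x) {
        return (x * 2) / 2;
    }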

Without these assumptions the resulting compiled code would be much slower, though some optimizations have different danger vs speed impact and there certainly can be a case that there are some optimizations that should be eschewed because they're a poor trade-off.

There are many cases where current compilers will warn you when you've done something that is UB. It's probably not the case that they warn for every such detectable case and if so it would be reasonable to ask them to warn about more of them.

I think your irritation is just based on a misunderstanding of the situation.

Compiler authors are C(++) programmers too, they also don't like footguns. They're not trying to screw anyone over. They don't waste their time adding optimizations that don't make real performance improvements just to trip up invalid code.


Yes, some UB are not decidable at compile time, but a lot could be easily speced to have a defined behavior at runtime, such as overflows.

The main reason to not spec these things is because people would be arguing "this makes compiled code on my esoteric 9-bit 1-complement chip slower" or "there was this chip in the 70s that did things differently" or "but a short int on Cray was 64-bit". Great, so now the spec has avoidable unnecessary undefined behavior all over the place, and the code other people wrote still does not run correctly on your 9-bit chip. Brought to you by the same people who decided "NULL is not necessarily (void*)0", and who define those integer types everybody uses (instead of stdint) with an "at least this big".

Yes, a lot of that is legacy stuff and was added to accommodate and model things that already existed (the wrong way to go about it, IMO, but hindsight is 20/20), but that's my argument: fix this stuff once and for all and for good in an upcoming spec iteration.

>Without these assumptions the resulting compiled code would be much slower

In some cases, this is true (for different levels of "much slower"), but the trade off here is still "running code that works, but a little slower" vs "running code that does not work and will launch a nuclear strike at Switzerland by accident, but really fast".

In a lot of cases, it will not be slower, or at least not much slower.

>I think your irritation is just based on a misunderstanding of the situation.

Frankly, not really. I started writing my first C (and C++) in the early 90s, and I think I do understand the situation pretty well by now. But I should have been more precise in my initial ranting comment, I give you that.

>They're not trying to screw anyone over.

I didn't say that they are.


Note that (void *)0 is always NULL, as mandated by the standard.

But, to address the content of your comment: defined behavior at runtime is not necessarily good behavior at runtime. Defining signed integer overflow to wrap, for example, is probably a bad idea, because this is rarely the intent of the code. Having all such operations trap might be a good idea, but now you're going to get the same "stop breaking my working programs" people angry at you.


Yes, thankfully at least with NULL they didn't fall into the legacy trap and mess up the standard with the non-zero NULL that some machines back then had been kind of using.

>Defining signed integer overflow to wrap, for example, is probably a bad idea

I wouldn't call it great behavior, but it's at least what most people expect will happen, and most people will be able to understand what's going on, and it's fast on most systems that matter. However, it's still undefined behavior. Just codifying overflows to be wrapping would therefore be an improvement in my opinion, at least over what we have today.


I would say that most people expect it not to happen. If they really had to mandate a behavior, it should be to trap.


> Note that (void *)0 is always NULL, as mandated by the standard.

Also note that this is distinct from the memory representation of (void *)0 being all 0 bits, which is explicitly not mandated.
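A quick sketch of where that distinction bites (only on a platform where a null pointer isn't all-0 bits, which is rare but permitted):

    #include <string.h>

    struct node { struct node *next; };

    struct node a = { NULL };       /* next is a null pointer */

    void clear(struct node *n) {
        memset(n, 0, sizeof *n);    /* next is now all-0 bits, which the standard
                                       does not guarantee to compare equal to NULL */
    }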


> To do UB "optimizations", the compiler first needs to figure out that there is an UB it can "optimize" anyway.

That's not how compilers work. In fact in the general case it is impossible to figure out at compile time that "there is an UB".

The compiler instead assumes as an axiom that no UB can ever happen and uses that axiom to prove properties of the code.

These days if you want to catch UB, compile with -fsanitize=undefined. The program will then trap if UB is actually detected at runtime.


> These days if you want to catch UB, compile with -fsanitize=undefined. The program will then trap if UB is actually detected at runtime.

So, let me get this straight, someone wants to make sure pointer p is not null (in the wrong way), and codes something like the examples in posts above like if (!p) ... and if that doesn't trigger calls use(*p), but compiler decides p can never be null because that would be UB and hence removes the check.

The C coder dumps the code and gets upset because the check is removed and gets the hint to catch UB by adding -fsanitize .. that "catches UB" in the above scenario so that the program will "trap if UB is detected".

I think we just came full circle there.

Sure, the -f will catch ALL detected bugs and so on, but I still found it a bit funny.


It is a bit different.

Ubsan will abort an invalid program if it detects UB. It doesn't let you handle it. So you shouldn't remove the erroneous check, but fix it so it is no longer erroneous, and ubsan will help you identify these errors.

Also ubsan adds significant overhead so it is not really appropriate for production builds unfortunately (hence my wish for a less powerful ubsan-lite but with lower overhead).


I think you are misunderstanding the situation. Given code like:

    if (!p) {
        report_error();
        return;
    }
    use(*p);
(given no previous knowledge about p) no compiler will remove the "if (!p)" part.

What people are complaining about is the opposite case:

    use(*p);

    /* The compiler reasons that if p == NULL, the program would have crashed by now,
       so if we got here, p != NULL must hold. */

    if (!p) {  // the compiler can remove this branch
        report_error();
    }


The caricatures are somewhat accurate though, optimizations that look at UB adversarially are never anywhere close to justified.

> The set of people doing so overlaps with the set of people complaining that the compiler doesn't optimize their code sufficiently to a significant degree.

There's no contradiction here, and the overlap is generally just "people who care". The optimizations that are not safe shouldn't exist, and the optimizations that are safe should be good.

> nearly all of the time are things the code author would agree with if they thought long and hard

I highly doubt this is the case for even one situation.


> optimizations that look at UB adversarially

The whole point is that there isn't any such adversarial thing as "we're going to find the UB right there, won't even print a warning about it, and mess up your crappy code, haha!"

Optimizers aren't reasoning about code like people do (start to finish, with a high-level understanding of the whole function), but rather as a series of mostly dumb, mostly isolated small passes, each pass changing one little thing about the code.

It just happens that one pass marks certain instructions as "can't happen" (like the spec says), then another pass simplifies expressions, and then another pass removes code that doesn't do anything, usually left over from the previous steps. They sometimes combine in an "adversarial" way, but individually each pass is justified and necessary.

Compilers already have lots of different passes. Splitting optimizations into passes is a way to keep complexity closer to O(n) rather than O(n^2), but this architecture makes interactions between passes very delicate and difficult to coordinate, so it's difficult to instrument the data to avoid only cases of annoying UB without pessimizing cases that users want optimized.


Compiler dev here. Do you really think we just come up with code transformations and add them to the compiler just because?

All code transformations have a compile time cost and runtime perf impact. We don't add transformations unless the runtime perf impact greatly outweighs the compile time cost.

These optimizations are added because they measurably improve the performance of real code. This comes up in every review for new or updated optimization passes. This claim that they aren't justified is actually rather insulting to the effort put in to improve perf without taking days to compile.


However Fortran, Ada, Java, .NET Native, Swift, ... are certainly less subject to such surprising optimizations.


Name one such optimization. We'll be happy to refute your points for that one.


The author suggests that the text following the definition of "undefined behavior", listing the permitted or possible range of undefined behavior, should be read to restrict the consequences.

But the first possibility listed is "ignoring the situation completely with unpredictable results". Surely that covers any possible consequences.

The author also says:

> Returning a pointer to indeterminate value data, surely a “use”, is not undefined behavior because the standard mandates that malloc will do that.

Returning a pointer to data is not a use of that data. The fact that its value is indeterminate isn't relevant until you attempt to read it (without first writing it).

It may be worthwhile to reduce the number of constructs whose behavior is undefined, making them implementation-defined or unspecified instead. For example, if signed integer overflow yielded an unspecified result rather than causing undefined behavior, I wonder if any implementations would be adversely affected. (But it would remove the possibility of aborting a program that computes INT_MAX+1.)

I don't think reinterpreting "undefined behavior" as anything other than "the Standard imposes no requirements" is practical. If a program writes through a dangling pointer and, for example, clobbers a function's return address, what constraints could be imposed on what the program might do next?


> For example, if signed integer overflow yielded an unspecified result rather than causing undefined behavior, I wonder if any implementations would be adversely affected.

I suspect so - makes it harder to reason about loop counts because the compiler can't necessarily guarantee that an incremented loop counter won't become negative and thus the loop needs to iterate more.

E.g. something like for (int i=param; i < param + 16; i++) has a guaranteed loop count with the current rules, but not with yours?

That's not an excuse for not having any way to do proper overflowing operations on signed integers though.


That's the exact reason why this rule was introduced into the standard: it was so C compilers could compete with Fortran compilers (Fortran has similar rules and at the time they were beating C compilers on equivalent scientific codes by 2-3x).

Fortran has even more restrictive aliasing rules than C: a function is allowed to assume that any two array arguments passed as arguments do not overlap. If they do, the behavior is undefined.


Exactly - it was done for meaningless benchmarking reasons. C programmers would be happy to use "restrict" as an opt-in for those, but this argument about FORTRAN goes back to the initial days of the standard when Dennis Ritchie had to push "noalias" out of the proposed standard.


> I suspect so - makes it harder to reason about loop counts because the compiler can't necessarily guarantee that an incremented loop counter won't become negative and thus the loop needs to iterate more.

This is a favourite example that gets thrown around, but for all practical loops GCC and clang seem to have no problem even when you compile with -fwrapv


Postgres has been compiled with -fwrapv for many years now, and yes, it does introduce a measurable CPU overhead. Not 10%, but also not just 0.1%.


link?


Locally run benchmarks.


I don't know if there exists a C compiler that leverages this feature but there are ISAs (for instance MIPS) that can trap on signed overflow.

The fact that it's UB in C means that you can tell the compiler to generate these exception-generating instructions, which could make some overflow bugs easier to track down without any performance implications. And your compiler would still be 100% compliant with the standard.

That being said I just tried and at least by default GCC emits the non-trapping "ADDU" even for signed adds, so maybe nobody actually uses that feature in practice.


That doesn't really help with the compiler optimization aspect: a typical use of the range information would be to unroll the loop - in which case there's no addition to trap on anymore.


To be fair, if you want to make sure that loop is unrolled even in the presence of -fwrapv, writing it as for (int i=0; i < 16; i++) {/* use i+param */} is a very simple change for you to make even today. You'll have to make much uglier changes to code if you're at the level of optimization where loop unrolling really matters for your code on a modern processor.


GCC is optimised for performing well on benchmarks at the expense of anything else. Vendor compilers for those architectures traditionally had more programmer-friendly features like trapping instead of creating an exploitable security vulnerability.


> GCC is optimised for performing well on benchmarks at the expense of anything else.

This is very wrong, and I don't know why you would come to this conclusion.

> Vendor compilers for those architectures traditionally had more programmer-friendly features like trapping instead of creating an exploitable security vulnerability.

GCC has this feature too.
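If the feature meant here is trapping on signed overflow, a sketch of how to ask for it (assuming GCC's -ftrapv, which emits checked-arithmetic helpers that abort on overflow; the file name is made up):

    /* compile with: gcc -ftrapv trapv_demo.c */
    #include <limits.h>

    int main(void) {
        volatile int x = INT_MAX;   /* volatile so the addition isn't folded away */
        return x + 1;               /* under -ftrapv this aborts at run time instead of wrapping */
    }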


Assuming you defined signed integer overflow to follow two’s complement rules (the only reasonable interpretation other than UB), it would still be a guaranteed loop count of 16. (EDIT: i’m a dumbass, this is obvs not true. disregard this paragraph)

There’s an interesting thing to note with that example though: even if you did make signed integer overflow defined, that code is still obviously incorrect if param + 16 overflows. Like, the fact that signed integer overflow is UB is totally fine in this example: making it defined behavior doesn’t fix the code, and if making it UB allows the compiler to optimize, then why not?

Arguably, this is the case with the vast majority of signed integer overflow examples: the UB isn’t really the issue, the issue is that the programmer didn’t consider overflow, and if overflow happens the code is incorrect regardless. Why cripple the compilers ability to optimize to protect cases which are almost certainly incorrect anyway?


The real problem is that in a better world 'int' would be replaced by types that actually exhibit the correct behavior.

For a loop counter you want an index type that will seg fault on overflow. If you think skipping that check is worth it, the programmer would need to tag it with unsafe.

It's also problematic because its size is defined as at least 16 bits, which means you should never use it to store a constant larger than 16 bits. But people do that all the time.


I’m not sure I agree. If signed overflow is UB, loops like this can be optimized the hell out of. The most obvious way would be to unroll it and eliminate the loop (and loop variable) entirely, but you can also do things like vectorize it, maybe turn it into just a small number of SIMD instructions. The performance gains are potentially enormous if this is in a hot loop.

With your magic int that traps on overflow, you couldn’t do that if the compiler was forced to rely on that behaviour. This is exactly why signed overflow is UB in C, and I don’t think that’s an unreasonable case for a language like C.

To be clear, my point is that this program is incorrect if overflow happens regardless of whether overflow is UB or not. So you might as well make it UB and optimize the hell out of it.


The broader argument is that signedness of the integer type used for indexing is a non-obvious gotcha affecting vectorizability. It makes sense once you understand C integer semantics, but putting on a language designer hat, I'd go with something more explicit.


Many people write C programs that are not intended to be portable to 16-bit architectures.


for (int i=param; i < param + 16; i++)

does not have a guaranteed loop count with the current rules. The loop body will execute 16 times if param <= INT_MAX-16, but if the expression "param + 16" can overflow, the behavior is undefined. (I'm assuming param is of type int.)


> does not have a guaranteed loop count with the current rules. The loop body will execute 16 times if param <= INT_MAX-16, but if the expression "param + 16" can overflow, the behavior is undefined. (I'm assuming param is of type int.)

And the standard permits us (among other responses) to ignore undefined behaviour, so it does have a guaranteed loop count under a reading of the standard which the standard specifically and explicitly allows.


No, the standard permits the implementation to ignore the behavior "with unpredictable results".

If the value of param is INT_MAX, the behavior of evaluating param + 16 is undefined. It doesn't become defined behavior because a particular implementation makes a particular choice. And the implementation doesn't have to tell you what choice it makes.

What the standard means by "ignoring the situation completely" is that the implementation doesn't have to be aware that the behavior is undefined. In this particular case:

for (int i=param; i < param + 16; i++)

that means the compiler can assume there's no overflow and generate code that always executes the loop body exactly 16 times, or it can generate naive code that computes param + 16 and uses whatever result the hardware gives it. And the implementation is under no obligation to tell you how it decides that.


> that means the compiler can assume there's no overflow and generate code that always executes the loop body exactly 16 times

Right. That's what I said.

And just to be super-precise about the wording, the standard doesn't say "ignore the behavior 'with unpredictable results'" it says "Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results". Nitpicky, but the former wording could be taken to imply that ignoring behavior is only permissible if the behavior is unpredictable, when what the standard actually says is that you can ignore the behavior, even if the results of ignoring it are unpredictable.


And my point is that as far as the language is concerned, there is no guaranteed loop count under any circumstances. (An implementation is allowed, but not required, to define the behavior for that implementation.)


The two of you are not disagreeing except insofar as you're both using the word "guaranteed" to mean completely different things. _kst_, you're using it to mean "the programmer can rely on it". msbarnett, you're using it to mean "the compiler can rely on it".


> If the value of param is INT_MAX, the behavior of evaluating param + 16 is undefined. It doesn't become defined behavior because a particular implementation makes a particular choice. And the implementation doesn't have to tell you what choice it makes.

The compiler writer argument is as follows:

The program either has UB (when param is INT_MAX - 15 or higher) or has exactly 16 iterations. Since we are free to give any semantics to a UB program, it is standard-compliant to always execute 16 times regardless of param's value.


in which case the overflow will cause the loop to change some random memory, but it's ok since removing a single instruction test that is easy to pipeline is worth incorrect results!


Either the limit on param is guaranteed in some way by the rest of the program, or it is not. If it is, then the loop count is guaranteed in both cases. If it is not, the loop count is not guaranteed in either case.


That you wish that the C Standard mandated this interpretation does not change the fact that this is not what the C Standard says.


You are mistaken, the C standard is quite clear that it does not make any guarantees regarding the behavior of programs that exhibit undefined behavior, and that signed integer overflow is undefined behavior.


"for (int i=param; i < param + 16; i++) does not have a guaranteed loop count in the presence of undefined behavior" is true, but it's equally true that the C standard is quite clear that undefined behavior can be ignored, so we can validly treat "for (int i=param; i < param + 16; i++)" as if it were guaranteed to loop 16 times in all cases.


No, the C standard doesn't say that "undefined behavior can be ignored" (which would mean what, making it defined?).

It says, "NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, ...".

It doesn't say that the behavior can be ignored. It says that the undefinedness can be ignored. The implementation doesn't have to take notice of the fact that the behavior is undefined.

Let's take a simpler example:

    printf("%d\n", INT_MAX + 1);
The behavior is undefined. The standard does not guarantee anything about it. A conforming implementation can reject it at compile time, or it can generate code that crashes, or it can generate code that emits an ADD instruction and print whatever the hardware returns, or it can play rogue at compile time. (The traditional joke is that it can make demons fly out of your nose. Of course it can't, but an implementation that did so would be physically impossible, not non-conforming.)

An implementation might define the behavior, but it's still "undefined behavior" as that term is defined by the ISO C standard.


"undefined behavior can be ignored" (meaning: the case where this could overflow need not be considered and can be treated as though it does not exist) vs "The implementation doesn't have to take notice of the fact that the behavior is undefined" strikes me as a distinction without a difference given that we land in exactly the same spot: the standard allows us to treat "for (int i=param; i < param + 16; i++)" as if it were guaranteed to loop 16 times in all cases.

> An implementation might define the behavior, but it's still "undefined behavior" as that term is defined by the ISO C standard.

The point where we seem to disagree (and the pedantry here is getting tiresome, so I don't know that there's any value in continuing to go back and forth on it) is that yes, it's undefined behavior by the ISO C standard. BUT, the ISO C standard also defines the allowable interpretations of and responses to undefined behaviour. Those responses don't exist "outside" the standard – they flow directly from it.

So it's simultaneously true that the standard does not define it and that the standard gives us a framework in which to give its undefinedness some treatment and response, even if that response is "launch angband" or, in this case, "act as if it loops 16 times in all cases".


Of course an implementation can do anything it likes, including defining the behavior. That's one of the infinitely many ways of handling it -- precisely because it's undefined behavior.

I'm not using "undefined behavior" as the English two-word phrase. I'm using the technical term as it's defined by the ISO C standard. "The construct has undefined behavior" and "this implementation defines the behavior of the construct" are not contradictory statements.

And "ignoring the situation completely" does not imply any particular behavior. You seemed to be suggesting that "ignoring the situation completely" would result in the loop iterating exactly 16 tyimes.


> Of course an implementation can do anything it likes, including defining the behavior. That's one of the infinitely many ways of handling it -- precisely because it's undefined behavior.

An implementation can do whatever it likes within the prescribed bounds the standard provides for reacting to "undefined behavior", and conversely whatever the implementation chooses to do within those bounds is consistent with the standard.

Which, again, is the entire point of this: "the loop iterates exactly 16 times" is a standards-conforming interpretation of the code in question. There's nothing outside the standard or non-standard about that. That is, in fact, exactly what the standard says that it is allowed to mean.

> I'm not using "undefined behavior" as the English two-word phrase. I'm using the technical term as it's defined by the ISO C standard.

So am I. Unlike you, I'm merely taking into account the part of the standard that says "NOTE: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results..." and acknowledging that things that do so are standards-conforming.

> You seemed to be suggesting that "ignoring the situation completely" would result in the loop iterating exactly 16 times.

I'm merely reiterating what the standard says: that the case in which the loop guard overflows can be ignored, allowing an implementation to conclude that the loop iterates exactly sixteen times in all scenarios it is required to consider.

All you seem to be doing here is reiterating, over and over again, "the standard says the behavior of the loop is undefined" to argue that the loop has no meaning, while ignoring that a different page of the same standard actually gives an allowable range of meanings to what it means for "behavior to be undefined", and that therefore any one of those meanings is, in fact, precisely within the bounds of the standard.

We can validly say that the standard says "for (int i=param; i < param + 16; i++)" means "iterate 16 times always". We can validly say that the standard says "for (int i=param; i < param + 16; i++)" means "launch angband when param + 16 exceeds MAX_INT". Both are true statements.


> the standard allows us to treat "for (int i=param; i < param + 16; i++)" as if it were guaranteed to loop 16 times in all cases.

The standard allows this, but the standard also allows iterating less than 16 times, or turning it into an infinite loop, or doing things that a programmer can’t actually do intentionally inside the language’s rules. Undefined means “nothing is defined.” It doesn’t mean “nothing is defined, but in an intuitive way.”


They're not mistaken. What compilers will do is assume that UB doesn't happen. If no UB happens, that means `param + 16` never overflowed, therefore there are always exactly 16 iterations.


Or they assume "param + 16" will never overflow, so they emit an ADD instruction and use whatever result it yields.

Saying that a compiler "assumes" anything is anthropomorphic. A compiler may behave (generate code) in a manner that does not take the presence or absence of undefined behavior into account. If you just say it assumes something, that doesn't tell you what it will do based on that assumption.

Generating code that yields exactly 16 iterations is one of infinitely many possible consequences of undefined behavior.

If the mathematical value of `param + 16` exceeds `INT_MAX`, then the code has undefined behavior. The C standard says nothing at all about how the program will behave. A conforming compiler can generate code that iterates 42 times and then whistles Dixie. The non-normative note under the definition of "undefined behavior" does not constrain what a conforming implementation is allowed to do.

"imposes no requirements" means "imposes no requirements".


Perhaps there's an implicit quantifier here: "for all valid implementations of the C standard, the loop count is guaranteed to be 16" versus "there exists a valid implementation of the C standard in which...".

(This line of thought inspired by RankNTypes, "who chooses the type", etc.)


> Perhaps there's an implicit quantifier here: "for all valid implementations of the C standard, the loop count is guaranteed to be 16" versus "there exists a valid implementation of the C standard in which...".


That's precisely my point? Because the overflow case is undefined, the compiler can assume it doesn't happen and optimize based on the fixed loop count.


The overflow case is not UB. param can be unsigned, or fwrapv may be declared. Or the compiler chooses to declare fwrapv by default. In no case is the compiler allowed to declare the overflow away, unless it knows from before that param cannot overflow. The optimization on loop count 16 can still happen with a runtime guard.


The loop counter is signed even if param is not, so i++ could overflow. fwrapv is a compiler flag, it is not part of the standard: it is a flag that mandates a certain behaviour in this case, but in standard C, the loop variable overflowing is definitely UB. No runtime guard needed, C compilers are just allowed to assume a fixed length. This is the whole reason signed overflow is UB in C, for exactly cases like this.


If param is unsigned, then "param + 16" cannot overflow; rather, the value wraps around in a language-defined manner. I've been assuming that param is of type int (and I stated that assumption).


The compiler is allowed to act as if this loop executes exactly 16 times. That means it could unroll and vectorize it for example.


It is completely useless to allow compilers to assume false things about the code they generate.


It’s not useless. The assumption is not false if the program doesn’t have undefined behavior. The assumption allows the code to be a few times faster. To disallow this assumption would inhibit these optimizations.


a) the assumption is not false if it is not false! b) the speedup is not shown anywhere


The speedup is more or less the difference between O1 and O3 optimization levels.


> For example, if signed integer overflow yielded an unspecified result rather than causing undefined behavior, I wonder if any implementations would be adversely affected.

You don't need to wonder. You can use -fwrapv to make signed integer overflow defined behavior.

C++20 introduced the guarantee that signed integers are two's complement. The original version of that proposal also defined the behavior on overflow; but that part was rejected (signed integer overflow remains UB): http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p090... So at least the committee seems to think that the performance advantages are worth it.


> For example, if signed integer overflow yielded an unspecified result rather than causing undefined behavior, I wonder if any implementations would be adversely affected.

Yes.

There are several architectures where signed integer overflow traps, just like division by 0 on x86 (which is why division by 0 is UB). If a C compiler for those architectures were required to yield an unspecified result instead of trapping, then around every signed integer addition/subtraction it would need to install a trap handler beforehand and restore it afterward, so that overflow returns an unspecified value instead of invoking the normal trap handler.


> The author suggests that the text following the definition of "undefined behavior", listing the permitted or possible range of undefined behavior, should be read to restrict the consequences.

> But the first possibility listed is "ignoring the situation completely with unpredictable results". Surely that covers any possible consequences.

Absolutely not. In the C89 standard, undefined behavior becomes undefined *UPON USE OF* the thing that is undefined. In current compilers, the existence of undefined behavior anywhere in your program is an excuse to do anything that the compiler wants to with all of the rest of your program. Even if the undefined behavior is never executed. Even if the undefined behavior would only happen after the code whose behavior gets changed.

So, for example, undefined behavior that can be encountered within a loop makes it allowable to simply remove the loop. Even if the undefined behavior is inside of an if that does not happen to evaluate to true with your inputs.


This is actually desired though, at least by some programs. For example, say you have a function with a very expensive loop that repeatedly performs a null check and then executes some extra code if it's null, but never sets the value. This is called from another function which uses the checked value without a null check (proving it's not null) before and after the loop ends. The first function is inlined. You want to tell the compiler not to optimize out the null check and extra code in the loop? Or that it can't optimize stuff out to reuse the value from the first use of the value? If so, what is the compiler allowed to optimize out or reorder?
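Roughly the shape being described, I think (all identifiers here are made up for illustration, and buf is assumed to point at at least 16 ints):

    struct ctx { int *buf; };

    static long expensive_loop(struct ctx *c) {
        long sum = 0;
        for (long i = 0; i < 1000000; i++) {
            if (c->buf == NULL)          /* checked every iteration; never assigned here */
                return -1;               /* the "extra code" run only if it's null */
            sum += c->buf[i % 16];
        }
        return sum;
    }

    long caller(struct ctx *c) {
        long first = c->buf[0];          /* used without a check before the loop */
        long s = expensive_loop(c);      /* once inlined, the per-iteration null check is redundant */
        return first + s + c->buf[0];    /* and used again after */
    }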

Now, to see why this might actually produce a bug in working code--say some other thread has access to the not-null value and sets it racily (non-atomically) to null. Or (since most compilers are super conservative about checks of values that escape a function because they can't do proper alias analysis), some code accidentally buffer overflows and updates the pointer to null while intending to do something else. Suddenly, this obvious optimization becomes invalid!

Arguments to the effect of "the compiler shouldn't optimize out that loop due to assuming absence of undefined behavior" are basically arguments for compilers to leave tons of performance on the table, due to the fact that sometimes C programs don't follow the standard (e.g. forgetting to use atomics, or indexing out of bounds). While it's a legitimate argument, I don't think people would be too happy to find their C programs losing to Java in benchmarks on -O3, either.


There may be programs that desire such behavior. But I've never intentionally written one. Which is why I personally avoid C, and wish that I didn't have to work in environments coded in C.

I seriously would accept everything running at half speed for the certainty of not being subject to the problems of C level bugs. But as Rust grows in popularity, it looks like I won't need to worry about that.


> I seriously would accept everything running at half speed for the certainty of not being subject to the problems of C level bugs.

I think most people would. But the described code is still buggy even when it's not optimized.


Well, any code that triggers undefined behavior is already buggy by definition. I think it would be a lot more fruitful if, instead of blaming compilers for doing their job (trying to optimize code in a language that allows all sorts of potentially unsafe behavior), people enumerated the specific UB they had issues with. For example, a lot of people don't consider integer overflow, too-large bitshift, nonterminating loops, type punning without union, or "benign" data races automatic bugs in themselves. Some people don't even consider a null pointer dereference an automatic bug (but what about a null pointer field access, or array index that happens to land on a non-null page? Is the compiler allowed to optimize field accesses to pointer arithmetic, or not?).

Anyway this is all fine, but as you can imagine you lose a lot of optimizations that are facilitated by all that UB, so the compiler authors should then counter with some way to signal that you want the original undefined semantics (for instance, references in C++ and restrict pointers in C), or provide compile-time checking to prevent misuse that messes up optimizations (e.g. Rust's Send+Sync for avoiding data races, or UnsafeCell for signaling lack of restrict semantics / raw pointers for lack of non-nullability).


> So, for example, undefined behavior that can be encountered within a loop makes it allowable to simply remove the loop. Even if the undefined behavior is inside of an if that does not happen to evaluate to true with your inputs.

The last sentence is not true. If there is UB inside the if, the compiler may assume that the if condition never evaluates to true (and hence delete that branch of the if), but it may certainly not remove the surrounding loop (unless it can also prove that the condition must be true).


> In current compilers, the existence of undefined behavior anywhere in your program is an excuse to do anything that the compiler wants to with all of the rest of your program. Even if the undefined behavior is never executed.

This is…complicated. Let's say you have an array of ten numbers, and then you take user input and use that to index into the array. This program is well-formed…as long as the user never inputs a number beyond ten. If they do, then the program is invalid. In general, the presence of undefined behavior is an attribute of the running program, not the source code itself. If there exists any execution that encounters only defined behavior, the compiler may not deviate from the standard for that execution. However, what you probably meant is behavior in the face of the existence of runtime undefined behavior, in which case you are correct that a compiler could generate clairvoyant code that refuses to execute the first instruction if it knows that UB will happen at some point in the program.
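Spelled out as code (my rendering of the example, nothing more):

    #include <stdio.h>

    int main(void) {
        int a[10] = {0};
        int i;
        if (scanf("%d", &i) == 1)
            a[i] = 1;    /* well-defined on every run where i is in 0..9; UB only on
                            executions where the user types something out of range */
        return 0;
    }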


> In the C89 standard, undefined behavior becomes undefined UPON USE OF the thing that is undefined.

Is that still the case for current C standards, or did something change in C99/C11?


I don't have the C11 standard. But that part of the passage remained unchanged in C99.

In C89 there was a list of PERMISSIBLE things that compilers could do upon encountering undefined behavior. In C99 that was changed to a list of POSSIBLE things. And compilers have taken full advantage of that.


Ah. That sounds like the argument made in "One Word Broke C" [0, 1]. I can't say I agree with that argument, though.

As pointed out here and in the HN comments on that article, the phrase "ignoring the situation completely with unpredictable results" is present in both those standards, and is arguably what allows aggressive compiler optimizations to be made, since to a first approximation those optimizations rely on ignoring control flow that encounters UB.

[0]: https://news.quelsolaar.com/2020/03/16/how-one-word-broke-c/

[1]: https://news.ycombinator.com/item?id=22589657


e.g. removing a check for overflow is definitely NOT ignoring the behavior. Deleting a write because it would be undefined behavior for a pointer to point at some location is also NOT ignoring the behavior. Ignoring the behavior is exactly what the rationale is describing when it says UB allows compilers to not detect certain kinds of errors.

Returning a pointer is certainly a use. In any event, the prevailing interpretation makes it impossible to write a defined memory allocator in C.

If a program writes through a dangling pointer and clobbers a return address, the programmer made an error and unpredictable results follow. C is inherently memory unsafe. No UB-based labyrinth of optimizations can change that. It is not designed to be memory safe: it has other design goals.


> e.g. removing a check for for overflow is definitely NOT ignoring the behavior. Deleting write because it would be undefined behavior for a pointer to point at some location is also NOT ignoring the behavior.

Depending on how you look at it, this is ignoring the behavior.

For example, say you have this:

    int f(int a) {
        if (a + 1 < a) {
            // Handle error: a + 1 overflowed
            return -1;
        }
        // Do work
        return 0;
    }
You have 2 situations:

    1. a + 1 overflows
    2. a + 1 does not overflow
Situation 1 contains undefined behavior. If the compiler decides to "ignor[e] the situation completely", then Situation 1 can be dropped from consideration, leaving Situation 2. Since this is the only situation left, the compiler can then deduce that the condition is always false, and a later dead code elimination pass would result in the removal of the error handling code.

So the compiler is ignoring the behavior, but makes the decision to do so by not ignoring the behavior. It's slightly convoluted, but not unreasonable.


More than slightly convoluted. The obvious intention is that the compiler ignores overflow and lets the processor architecture make the decision. Assuming that overflow doesn't happen is assuming something false. There's no excuse for that and it doesn't "optimize" anything.


> The obvious intention is that the compiler ignores overflow and lets the processor architecture make the decision.

If that were the case, wouldn't signed overflow be implementation-defined or unspecified behavior, instead of undefined behavior?

> Assuming that overflow doesn't happen is assuming something false.

It's "false" in the same way that assuming two restrict pointers don't alias is "false". It may not be universally true for every single program and/or execution, but the compiler is explicitly allowed to disregard cases where the assumption may not hold (i.e., the compiler is allowed to "ignor[e] the situation completely").

And again, the compiler is allowed to make this assumption because undefined behavior has no defined semantics. If the compiler assumes that no undefined behavior occurs, and undefined behavior does occur, whatever happens at that point is still conforming, since the Standard says that it imposes no requirements on said program.

> it doesn't "optimize" anything.

...But it does allow for optimizations? For example, assuming signed overflow never happens can allow the compiler to unroll/vectorize loops when the loop index is not the size of a machine word [0]. Godbolt example at [1].

[0]: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...

[1]: https://godbolt.org/z/sP6WYPeT7


> If that were the case, wouldn't signed overflow be implementation-defined or unspecified behavior, instead of undefined behavior?

No, because (among other reasons) the processor architecture might decide to trap or not trap depending the run-time values of configuration registers that the compiler doesn't know and can't control or document.


> the processor architecture might decide to trap or not trap depending the run-time values of configuration registers that the compiler doesn't know and can't control

I'm not certain that that would fall outside implementation-defined behavior. Would something like "Program behavior on overflow is determined by processor model and configuration" not work?

> or document.

And even if the behavior couldn't be documented, that could be covered by unspecified behavior (assuming the language in the C standard is the same as in the C++ standard in this case)


> Would something like "Program behavior on overflow is determined by processor model and configuration" not work?

Not sure; if nothing else, that seems like it would allow the implementation to avoid documenting any implementation-defined behaviour with a blanket "all implementation-defined behaviour is whatever the hardware happens to do when executing the relevant code".


I mean, that works? It's not great by any means, but it at least eliminates the ability to make the assumptions underlying more aggressive optimizations, which seems like it'd address one of the bigger concerns around said optimizations.


Perhaps I should have phrased it as "all implementation-defined behaviour is whatever the hardware happens to do when executing whatever code the compiler happens to generate".

The point of implementation-defined behaviour is that the implementation should be required to actually define the behaviour. Whereas undefined behaviour doesn't impose any requirements; the implementation can do whatever seems reasonable on a given hardware architecture. That doesn't mean that backdoor-injection malware pretending to be an implementation is a conforming implementation.


> Perhaps I should have phrased it as "all implementation-defined behaviour is whatever the hardware happens to do when executing whatever code the compiler happens to generate".

Even with this definition, the important part is that compilers would no longer be able to ignore control flow paths that invoke undefined behavior. Signed integer overflow/null pointer dereference/etc. may be documented to produce arbitrary results, and that documentation may be so vague as to be useless, but those overflow/null pointer checks are staying put.


Err, that's not a definition, that's an example of pathologically useless 'documentation' that a perverse implementation might provide if it were allowed to 'define' implementation-defined behaviour by deferring to the hardware. Deferring to the hardware is what undefined behaviour is; the point of implementation-defined behaviour is to be less vague than that.

> may be documented to produce arbitrary results, and that documentation may be so vague as to be useless, but those overflow/null pointer checks are staying put. [emphasis added]

Yes, exactly; that is what undefined behaviour is. That is what "the standard imposes no requirements" means.


> Deferring to the hardware is what undefined behaviour is

If that were the case, the Standard would say so. The entire reason people argue over this in the first place is because the Standard's definition of undefined behavior allows for multiple interpretations.

In any case, you're still missing the point. It doesn't matter how good or bad the documentation of implementation-defined behavior may or may not be; the important part is that compilers cannot optimize under the assumption that control flow paths containing implementation-defined behavior are never reached. Null-pointer checks, overflow checks, etc. would remain in place.

> Yes, exactly; that is what undefined behaviour is. That is what "the standard imposes no requirements" means.

I think you're mixing standardese-undefined-behavior with colloquial-undefined-behavior here. For example, if reading an uninitialized variable were implementation-defined behavior, and an implementation said the result of reading an uninitialized variable was "whatever the hardware returns", you're going to get some arbitrary value/number, but your program is still going to be well-defined in the eyes of the Standard.


> the important part is that compilers cannot optimize under the assumption that control flow paths containing [undefined] behavior are never reached.

Yes. That. Exactly that. Compilers cannot assume that, because (in the general case) it is not true.


> Yes. That. Exactly that.

When I said implementation-defined, I meant implementation-defined. This is because the applicability of UB-based optimization to implementation-defined behavior - namely, the lack thereof - is wholly uncontroversial. Thus, the diversion into the quality of the documentation of implementation-defined behavior is not directly relevant here; the mere act of changing something from undefined behavior to implementation-defined behavior neatly renders irrelevant any argument about whether any particular UB-based optimization is valid.

> Compilers cannot assume that, because (in the general case) it is not true.

This is not necessarily true. For example, consider the semantics of the restrict keyword. The guarantees promised by a restrict-qualified pointer aren't true in the general case, but preventing optimizations because of that rather defeats the entire purpose of restricting a pointer in the first place.

More generally, the entire discussion about UB-based optimizations exists precisely because the Standard permits a reading such that compilers can make optimizations that don't hold true in the general case, precisely because the Standard imposes no requirements on programs that violate those assumptions.


I think the author of that blog was correct: the preferred path is for the compiler to provide data to the programmer to simplify the loop.

For your godbolt example, use the C compiler not c++


> I think the author of that blog was correct: the preferred path is for the compiler to provide data to the programmer to simplify the loop.

Requiring the equivalent of PGO is a rather unfortunate bar, though to be fair if you're that interested in performance it's probably something worth looking into anyways.

I'm curious how noisy an always-on warning for undersized loop variables would be, or how much code would have broken if int were changed to 64 bits on 64-bit platforms...

> For your godbolt example, use the C compiler not c++

Sorry; that was a mistake on my end. The same phenomenon occurs when compiling in C mode, in any case [0].

[0]: https://godbolt.org/z/TvYrzncsc


IMO, implementation defined is worse. It is still a time bomb but now it is a time bomb that you cannot use compiler errors to prevent automatically.


How so? The implementation can, and perhaps should, define that it errors. Whatever behaviour you're worried about a compiler doing for implementation-defined behaviour, it could do exactly the same thing if the behaviour was undefined.


Implementation defined behavior can only ever produce compiler warnings, which you can choose to be commit blockers if you want. But if a compiler can prove that UB can happen then it can completely prevent you from building that program.


> But if a compiler can prove that UB can happen then it can completely prevent you from building that program.

Not really; the C standard requires implementations to have particular behaviour for executions which do not encounter undefined behaviour, so an implementation still has to do the right thing for valid cases. So if there's even one possible set of user input etc. for which the program has defined behaviour then a compiler has to produce an executable.


Unspecified result means the compiler must think about what could happen in case I made an error.

UB means the compiler will trust me and concentrate on generating the fastest code ever.

C is for clever programmers; if you don't want to be clever, you are free to use Go or something like that.


> UB means the compiler will trust me and concentrate on generate the fastest code ever.

In reality, UB means the compiler will assume it doesn't happen and work from there.

Of course a more expressive language could just make it so the compiler doesn't have to assume this e.g. a C compiler will consider a dereference as meaning the pointer is non-null, both backwards and forwards.

But if the language had non-null pointers, it would not need to bother with that, it would have a non-null pointer in the first place. It could still optimise nullable pointers (aka lower nullable pointers to non-nullable if they're provably non-nullable, usually after a few rounds of inlining), but that would be a much lower priority.


Expecting programmers to evaluate their own cleverness does not work. Every nontrivial C program has undefined behaviour, making it a security flaw waiting to happen - I've been in these kind of debates where a C advocate will claim that program X is correct, and literally every time it turns out that program X has undefined behaviour somewhere.


The only programmers I would trust to write safe C are the ones who wouldn't trust themselves to write safe C.


It's not so much about cleverness, but knowledge and vigilance. You first have to be aware of all the footguns, and then be careful not to let any of them slip through...


> You first have to be aware of all the footguns,

Knowing your tools is part of being a professional. C is not for amateurs.


Then use such a tool, but don't call it C, rather -std=gnuc-opt11, which always knows better than the author, without any warning.

Call it randomC, unsuitable for professional programmers, but extremely suitable for benchmark games and managers, who prefer to ignore pesky overflows, underflows, memset, memcpy, dereferencing NULL pointers and other rare cases.


A true mark of a professional is being deathly terrified of the footguns that you must nevertheless use as tools.


Most of this discussion revolves around integer overflow.

Part of the problem is that most computer hardware now uses two's-complement arithmetic. Programmers think of that as part of the language. It's not, for C. C has run, in the past, on

- 36 bit ones complement machines (DEC and UNIVAC)

- Machines with 7-bit "char" (DEC)

- Machines with 9-bit "char" (UNIVAC, DEC)

- Machines where integer overflow yields a promotion to float (Burroughs)

- Many GPUs provide saturating arithmetic, where INT_MAX + 1 == INT_MAX.

Go, Java and Rust have explicit byte-oriented models with defined overflow semantics, but C does not. C has undefined overflow semantics.

Many years ago, around the time Ada was being defined, I wrote a note titled "Type Integer Considered Harmful". I was pushing the idea that integer variables should all have explicit range bounds, not type names. As in Ada, overflow would be checked. Intermediate variables in expressions would have bounds such that the intermediate value could not overflow without the final result overflowing and being caught. Intermediate values often have to be bigger than the operands for this to work.

This never caught on, partly because long arithmetic hardware was rare back then, but it illustrates the problem. Numeric typing in programming languages addresses a numeric problem with a linguistic solution. Hence, trouble. Bounds are the right answer numerically, but force people to think too hard about the limits of their values.


Explicit bounds everywhere mean that those bounds need to be automatically checked every time instead of only where explicitly specified by the programmer. This leads to safer code at a nontrivial performance tradeoff, one which would not be acceptable for C.


You could imagine bounds being baked into the types and checked at compile time.

    int{0..10}  foo  = 4
    int{0..10}  bar  = 5
    int{0..10}  buzz = foo+bar                // ERR: 10+10 potentially > 10
    int{0..10}  boom = wrapping_add(foo, bar) // OK
    int{0..100} sum  = foo+bar                // OK
There are tools which can do this, for example Code Contracts in C#. It becomes rather tedious and verbose so is something usually only left for very safety critical code.


This could be useful in C for some situations. That said, the above is completely useless if even one of those values (ranges or assignments) is dynamic: you'll be back to carefully (and manually) checking you don't overflow.*

*edit: unless your compiler automatically emits instructions to do the check at runtime; a thing that won't happen (and I don't want) in C.


Something that Ada, Modula and Pascal compilers are able to optimize away in most cases.

C keeps targeting a hardware model so primitive that even most Arduinos are supercomputers by comparison.


I've heard of wrap around and saturate, but promote to float? What?! Do you have more info on how that worked?


48-bit numbers. "The internal format of a single-precision floating-point number consisted of one unused bit, followed by the sign of the number, then the sign of the exponent, then a six-bit exponent, then 39-bit mantissa. The bias of the exponent was such that it could be considered to be in excess-32 notation as long as the mantissa was considered to be a binary integer instead of a binary fraction. This allowed integers to also be interpreted as unnormalized floating-point numbers."[1]

So there was only one numeric format at the hardware level. Convenient.

[1] http://employees.oneonta.edu/zhangs/csci201/general%20Floati...


Same way it does in Javascript?


Quoting from the same passage in the standard as the article does:

> Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to ...

Ignoring the situation completely with unpredictable results seems like it covers the current compiler behavior.

The author does not like how current compilers work. But his argument against it mixes "it would be better if it worked differently" with "A specific pedantic reading of the standard says they are wrong". The second kind of argument seems to undercut his wider point. For his wider point is "Compilers should be reasonable rather than turning on pedantry". At least, that is what I think his point is, and it seems like the much stronger argument to me.

Trying to "trip up" the proponents of current behavior by pointing to a possible miss-reading of a comma is not going to do much. Arguing instead that their practice is harmful seems like a much more likely to work approach. That said, such an argument should probably be civil. The article links to this [1] discussion. The link is supposed to show the peculiar arguments used by proponents of current behavior. What I read there is someone lashing out, calling names, and grand-standing. Convincing compilers to be more reasonable is probably going to require a very different tone. Not one of "how dare you be so stupid" but one of "perhaps you could consider this side-effect of current behavior" and "have you considered this approach".

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475


> The article links to this [1] discussion. The link is supposed to show the peculiar arguments used by proponents of current behavior. What I read there is someone lashing out, calling names, and grand-standing.

Even worse, the person who raised that issue is mostly just wrong: the code they wrote had not worked with any optimizations turned on for about 13 years at the time they raised that bug. The fact that it worked in debug builds seems irrelevant, so the whole bug is complaining about a change that had no impact on the 99% of released code which is compiled with -O2 or -O3.


> ignoring the situation completely with unpredictable results

That needn't mean "delete the code".

TFA is certainly right that this situation is killing C. But then, so is C's old type model. Rust is clearly better, so it's just as well.

That said, I think the C99 UB stance is a disaster for any language that might adopt it.

Perhaps the standard should not define `1<<32`, or `*(char *)0`, or signed integer overflow. But it's easy enough for a compiler to implement shift overflow as 0 (or maybe -1 if the shifted number is signed and negative), it's easy enough to perform signed arithmetic and let the emitted instructions do what they will, and it's easy enough to allow NULL dereferences (maybe the app mapped a page at address 0 for some reason, or maybe you want to catch a segfault).
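
For instance, a programmer who wants the "shift overflow yields 0" behavior today can spell it out explicitly; this is a sketch with a made-up helper name, not a standard facility:

  #include <stdint.h>

  /* Out-of-range shift counts yield 0 instead of undefined behavior.
   * The name and the chosen fallback value are illustrative only. */
  static inline uint32_t shl_or_zero(uint32_t x, unsigned n) {
      return (n < 32) ? (x << n) : 0;
  }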

> But his argument against it mixes "it would be better if it worked differently" with "A specific pedantic reading of the standard says they are wrong".

I mean, if it works to shame compiler teams into serving their communities better, I'm all for it.

> The second kind of argument seems to undercut his wider point.

No, it's a plausible argument. It's like arguing about what the Founding Fathers meant by some specific clause in the Constitution.

> pointing to a possible misreading of a comma is not going to do much

That happens all the time in statutory and constitutional interpretation by courts. It's the same here. We're arguing the spec.

> Convincing compilers to be more reasonable is probably going to require a very different tone.

No, it's just not possible at all, tone or no tone. The right answers were always:

  1. write your own compiler that does it right
     (ETOOHARD now that clang didn't)
  2. start over with a new language
     (Hello Rust)
Unsurprisingly we've ended up with (2).


> 1. write your own compiler that does it right (ETOOHARD now that clang didn't)

The problem here is that "right" is not quite that black-and-white. To a lot of users, "right" is "my code appears to work," optionally with "at a given performance level." A new compiler that's slower, or changes semantics compared to some baseline, won't necessarily be seen as "better" unless there's some clear benefit, and "doesn't perform optimizations that may result in security holes" wasn't necessarily a clear benefit at the time clang was trying to gain traction. "Better error messages" and "faster compiles", on the other hand, were, and those were some of the reasons Clang gained adoption despite (sometimes?) producing slower programs.

It's somewhat analogous to Microsoft's dedication to backwards compatibility. If a program did something technically illegal and a later version of Windows broke the program, users tend to blame Microsoft and/or Windows, not the program. Same thing here - if Clang didn't perform such aggressive optimization and was slower than GCC because of it, users will tend to think "Clang is slow", not "GCC is making nonsensical optimizations" or "My program invokes UB".


This is only a matter of opinion because this mistake was made at all in the spec. If it hadn't been you'd not be saying any of the above. Most UB in C has reasonable implementation. E.g., `sizeof(enum_type)` should be 4 if its value range permits it, else 8. E.g., 1-bit int bitfields should behave like 1-bit unsigned int bitfields. I already covered bit shifting, signed integer overflows, and NULL dereferences, which covers... most things. OK, there's aliasing, but that was a terrible mistake. Really, it's not that hard.

"Clang is slow" would not be a result. "Clang produces slow code" might be due to it not deleting "unpossible" code, but screw that, you can always go delete it yourself if it was dead. The compiler deleting such code is a terrible trap.


> If it hadn't been you'd not be saying any of the above.

Well, yes, that's kind of the point. If GCC adopts that interpretation and gets a speed boost out of it, then there's pressure on Clang to do the same, unless they can convince developers that not making such aggressive optimizations is worth it. "Your code may be slower so this (arguably) edge case behaves better, but you get better error messages/compile times" is not going to be as easy to sell as "Your code will behave the same and perform as well as when compiled with GCC, and you get better error messages/compile times to boot".

> "Clang produces slow code" might be due to it not deleting "unpossible" code, but screw that, you can always go delete it yourself if it was dead.

The entire point of the dead code elimination pass is to do that for you. If anything, that's why the optimizer exists; so you don't have to perform optimizations manually.


I'm not convinced. The argument seems to hinge to a very large extent on the sentence:

> Permissible undefined behavior ranges from A, to B, to C.

The observation that "Permissible" has a specific meaning is important and interesting. But what about "ranges from ... to ..."? The author reads this as "Permissible undefined behavior is either A or B or C.", but that seems like a stretch to me. (Unless ISO defines "ranges" to mean just this.)

Also, in the above, A = "ignoring the situation completely with unpredictable results". Both "ignoring" and "unpredictable" do a lot of heavy lifting here. In the signed integer overflow case discussed in the linked GCC bug report (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475), one could very well argue that "ignoring" a case where signed arithmetic overflows is exactly what GCC is doing. It "ignores" the possibility that a + 100 might overflow, hence obviously a + 100 > a is true, leading to results that the reporter considers "unpredictable". Somehow the author seems to think that this is not what the wording intended, but they also fail to explain what they think should happen here.

Sounds to me like the wording has always been very vague.
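
For concreteness, the pattern from that bug report looks roughly like this (a sketch; folding the comparison to a constant is what GCC and Clang typically do at -O2, not something the standard requires):

  #include <limits.h>

  /* The compiler may assume a + 100 cannot overflow, so this can be
   * compiled as "return 1;" regardless of the runtime value of a. */
  int check_unsafe(int a) {
      return a + 100 > a;
  }

  /* An overflow check written without relying on overflow behavior. */
  int check_safe(int a) {
      return a <= INT_MAX - 100;
  }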

Also:

> Returning a pointer to indeterminate value data, surely a “use”, is not undefined behavior because the standard mandates that malloc will do that.

Yes, malloc will do that, but using a pointer to indeterminate data is not the same as using the indeterminate data itself. The author is doing themselves a disservice by misreading this.


The original intent was for signed overflow to be architecture-specific (not everything was 2s complement back then).

On x86, the correct behavior was previously (and obviously) “perform a 2s complement wraparound”.

To “ignore” the situation used to mean “delegate to a lower level”. It now means “silently generate code that violates the semantics of the target microarchitecture”

As the article argues, I think this has gone too far. For example, people are starting to claim that asm blocks are undefined behavior. They’re clearly implementation specific (so undefined by the spec), but also well defined by each implementation.

In current readings of the spec, compilers are free to ignore them completely. Doing so would break all known operating systems, and many user space programs, so they have not managed to do so yet.

Edit: for signed overflow, other architecture-specific behavior (such as optionally trapping floating point exceptions) would also have been permissible, assuming the architecture supported it.
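
For reference, this is the kind of asm block the parent has in mind (GCC/Clang extended-asm syntax; ISO C is silent on it, but each compiler documents what it means):

  /* A classic compiler barrier: the empty asm with a "memory" clobber
   * tells GCC/Clang not to reorder memory accesses across this point. */
  static inline void compiler_barrier(void) {
      __asm__ volatile ("" ::: "memory");
  }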


> The original intent was for signed overflow to be architecture-specific (not everything was 2s complement back then).

The term for that is implementation-defined, not undefined. If it were to be architecture-specific, back in C89, they would have used the term implementation-defined, as they do for things like the size of pointers.


There's "implementation defined" for the concept you describe.


> Somehow the author seems to think that this is not what the wording intended, but they also fail to explain what they think should happen here.

What "should" happen is that the implementation-defined behavior in the assert matches the implementation-defined behavior in the calculation. That is, assert(a+100>a) should produce the same result as

  int x = a + 100;
  int y = a;
  if (x > y)
      ;
  else
      printf("assertion failed\n");


That behaviour is no less undefined when the addition overflows.


Okay, then let's rephrase without a code example: The implementation-defined behavior in the assert should produce "true" if and only if the number printed for x+100 (also using implementation-defined behavior) is always larger than the number printed for x.


The author seems to be missing this essential text: "the implementor may augment the language by providing a definition of the officially undefined behavior."

Making a system call is undefined behavior in the C standard, but it's not undefined behavior in clang-on-FreeBSD, because the implementors of clang on FreeBSD have defined what those system calls do.

Ditto for "asm" (UD unless/until you're running on a compiler which defines what that does), all of the tricks which make "malloc" work, and all of his other examples of acceptable uses of code which the C standard does not define.


The C standards have the perfectly fine name "implementation dependent" to describe those things. Undefined behavior is much less constrained than implementation dependent, and thus more problematic.


> The C standards have the perfectly fine name "implementation dependent" to describe those things.

That term is not used by the C standards. Do you mean "implementation-defined"? asm is not among the explicitly specified implementation-defined behaviors, it's listed under "Common extensions". I don't see any mention at all of syscalls in C99. (I'm working with http://www.dragonwins.com/courses/ECE1021/STATIC/REFERENCES/... here.)


I'm not sure why syscalls would be UB; it's just not something defined by the C standard.

Edit: To clarify, I meant UB in the sense it is typically used in these discussions, where the standard more-or-less explicitly says "If you do X, the behavior is undefined." Not in the literal sense of "ISO C does not say anything about write(2), hence using write(2) is undefined behavior according to the C standard", which seems like a rather tautological and useless statement to me.


What do you think UB is if not something where the behaviour is not defined?


About your edit:

> "ISO C does not say anything about write(2), hence using write(2) is undefined behavior according to the C standard", which seems like a rather tautological and useless statement to me.

That is actually not so useless at all: if you try to compile and link a program that declares and calls a function but does not define it, you will typically get a linker error about an unresolved reference. If the name matches a non-ISO C library function, however, the implementation cannot know whether your program is in error or whether you want to use that library function, and will usually accept it. For this reason, the C standard does classify using write(2) as UB, to make it clear that implementations are not required to diagnose it as an error.


UB, in this context, is very explicitly used in the standard: it is undefined behavior related to a construct that the standard describes.


> behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

That is literally the definition of UB from the C standard. It is explicitly also about constructs that the standard does not describe. That makes sense: the standard does not and cannot define the behaviour for any construct not in the standard, so cannot impose any requirements for such constructs, and that is all UB is: something where the standard imposes no requirements.


The relevant discussion about UB is restricted to constructs that the standard describes. For example, writing past the end of an object is UB - the construct is described in the standard, but is given no semantics by the standard.

The standard does not describe pattern matching, so using pattern matching is also undefined behavior, but there is nothing to be talked about here.


The comment I replied to did talk about something not described by the standard though, namely syscalls. If you want to argue that we should not be talking about syscalls here, your issue should be with the original comment that brought them up (https://news.ycombinator.com/item?id=27222325), not with my reply, I think. However, that comment looks perfectly fine to me. Also, depending on how the syscalls are made, it actually may be explicitly described as UB by the standard, see my comment https://news.ycombinator.com/item?id=27228701 too.


Syscalls are not any more UB than any other function call, though. Whether talking about write(2) or my_foo(), the call has the semantics given by the function signature visible in the current translation unit. Sure, the C standard doesn't define what write(2)'s effects will be, but that does not mean that calling it is UB according to the standard.

If the function has not been declared by the time it is first used, even then calling it is not UB - it is defined to be a compilation error (in versions earlier than C99 it was actually valid, but UB if the call did not match the actual function definition).


> Sure, the C standard doesn't define what write(2)'s effects will be, but that does not mean that calling it is UB according to the standard.

Yes, it does. I already explained exactly why it needs to be UB, but let me quote where the standard says so:

C99 6.9 External definitions:

> Semantics:

> An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.

If your program provides a declaration of write() and uses it without also providing a definition, the program does not have "exactly one external definition for the identifier", it has zero definitions for the identifier. This violates a "shall" that appears outside of a constraint, for which we turn to:

C99 4 Conformance:

> If a "shall" or "shall not" requirement that appears outside of a constraint is violated, the behavior is undefined.


> let me quote where the standard says so:

Wouldn't this hinge on what precisely "entire program" means? A definition for write(2) may not appear in the source code you wrote, but if "entire program" includes e.g., libraries dynamically linked in then it's quite feasible for the end result to be fully defined.

For example, 5.2.2 Paragraph 2 starts with (emphasis added):

> In the set of translation units and libraries that constitutes an entire program


Sure, but in the situation we were talking about, the user never wrote a definition for write(), and the user did not specify any library to include that provided a definition of write(). From the standard's perspective, that means there is no definition for it in the entire program.

Keep in mind that the standard's perspective is somewhat different from how things work in practice. We know that on Unix-like systems, there is also the concept of libraries, somewhat different from how the standard describes it, and write() will be provided by the "c" library. But consider the following strictly conforming program:

  #include <stdio.h>
  void write(void) {
    puts("Hello, world!");
  }
  int main(void) {
    write();
  }
A conforming C implementation is not allowed to reject this for a duplicate definition of write(): the name "write" is reserved for use by the programmer; it is not reserved to the implementation. This program must be considered not to violate the "there shall be exactly one external definition for the identifier" rule, so the only way to consider this valid is to say that the implementation does not implicitly provide an external definition of the write() function as far as the C standard is concerned.

Yet at the same time, from the perspective of the implementation, the c library is considered to provide a definition of the write() function, but it is a definition that is only used if the program does not override it with another definition that should be used instead. This concept of multiple definitions for the same name, with rules specifying which of the multiple definitions gets picked, is very useful but is also beyond the scope of the C standard. When we say that a function is defined, we need to be clear on whether we use "define" in the ISO C sense or in some other sense. As your comment shows, things get very confusing if we are not careful with that.


> and the user did not specify any library to include that provided a definition of write()

Ah. I had assumed that that was implicit in "using write(2)", but seems that was a bad assumption.

> there is also the concept of libraries, somewhat different from how the standard describes it

In what way?

You make an interesting point with the example. It's not something I had considered before. Would weak linkage (or a similar mechanism that allows for a provide-unless-the-user-already-did-so type of behavior) fall under an implementation extension, then?


> In what way?

For the most part the standard does not address the existence of libraries other than the standard library, but 5.1.1.1 contains "Previously translated translation units may be preserved individually or in libraries." This, to me, suggests that from the standard's perspective, when you link in a library, you simply get that library, whereas on Unix systems, when you link in a static library, you specifically get those object files from the library needed to resolve not yet defined references, and when you link in a shared library, you get something where it becomes possible to have duplicate definitions where rules come into play as to which definition will end up used.

> You make an interesting point with the example. It's not something I had considered before. Would weak linkage (or a similar mechanism that allows for a provide-unless-the-user-already-did-so type of behavior) fall under an implementation extension, then?

Yes, I think so. Shared libraries implicitly have some sort of weak linkage already aside from the explicit weak linkage that you can get with e.g. GCC's __attribute__((weak)), but both forms count as extensions, I would say.
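
For example, a default definition using GCC's explicit weak linkage might look like this (the function name is hypothetical):

  /* The weak attribute marks this as a default definition: it is used
   * only if no other (strong) definition of on_error appears at link time. */
  __attribute__((weak)) void on_error(void) {
      /* default: do nothing; a program may override this with its own definition */
  }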


So is everything UB since all hardware isn’t perfect?


So if it is behavior that is not defined by the C standard, would that not make it undefined behavior?


There is a difference between jargon in context and the use of those words in a general sense. It can be "undefined behavior" in a general sense, but not necessarily "undefined behavior" in the jargon sense.

After all, if I were to use the words "undefined behavior" in a sentence unrelated to the standards, the definition in the standard of "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements." would be nonsense. Same goes in the other direction.


While technically correct, “undefined behavior” in terms of C and C++ refer to what the standard calls out explicitly as undefined, and not a simple “it’s not referenced, therefore it’s undefined.”

For example, signed(?) integer overflow is explicitly undefined by the standard, but as @formally_proven said, just because write(2) isn’t mentioned doesn’t mean usage of it is undefined.


Actually, that's exactly what it means:

> If a "shall" or "shall not" requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words "undefined behavior" or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe "behavior that is undefined".

write() is a function, and a call to it behaves like a function call, but the C standard says nothing about what that function does. You could have a function named "write" that writes 0xdeadbeef over the caller's stack frame. Of course if "write" is the function defined by POSIX, then POSIX defines how it behaves.


> but the C standard says nothing about what that function does

I'm pretty sure I'm just bad at searching through the standards document, but does the Standard actually define the precise semantics of function calls? 6.2.2 is about the function calls and the result thereof, but doesn't seem to be quite as precise about the semantics as I might expect.


No. "Implementation defined" says "the standard doesn't specify what happens here but the compiler must document what it does". That's a step removed from "the compiler may define what this does".


Neither system calls nor the asm keyword are undefined behavior in the sense that C uses the term. They are, simply put, not covered by the standard at all.

System calls--assuming you're referring to the C prototypes you call--work as normal external function definitions, just having semantics which are defined by the library (i.e., the kernel) and not the C specification itself. The asm keyword is a compiler language extension and is effectively implementation-defined (as C would call it), although compilers today tend to poorly document the actual semantics of their extensions.


The thing about UB is that it tends to happen when the C standard refuses to specify when a program segment is erroneous or valid. Some C environments treat memory as a large array of undifferentiated bytes or words, by design. Other C environments have tagged, bounds-checked regions of memory, again by design. (For example, the C compiler for the Lisp machine.) Usually, indirecting through a null pointer or walking off the end of an array are erroneous, but sometimes you want to read from memory location 0, or scan through all of available memory. The C standard allows for both kinds of environments by stating that these behaviors are undefined, allowing the implementation to error out or do something sensible, depending on the environment.

The idea that UB is carte blanche for implementations to do whatever is an unintended consequence of the vague language of the standard. Maybe a future C standard should use "safe" and "unsafe" instead of UB for some of these operations, and clarify that unsafe code will be erroneous in a safe environment and do something sensible but potentially dangerous in an unsafe environment so you must really know what you're doing.


> The idea that UB is carte blanche for implementations to do whatever is an unintended consequence of the vague language of the standard.

Whether or not this was originally intended, it's certainly become the way the standard is written and used today, so that's kind of beside the point.

Further, this is not some new idea that arose from the C standard. It's a basic, core idea in both software engineering and computer science! You define some meaning for your input, which may or may not cover all possible inputs, so that you can go on to process it without considering inputs that don't make sense.

Now, to be fair, the "guardrail-free" approach where UB is silent is a bit out of the ordinary. A lot of software that makes assumptions about its input will at least try to validate them first, and a lot of programming language research will avoid UB by construction. But C is in a unique place where neither of those approaches fully work.

> The C standard allows for both kinds of environments by stating that these behaviors are undefined, allowing the implementation to error out or do something sensible, depending on the environment.

This is true, but it doesn't mean that "something sensible" is actually something the programmer should rely on! That's just asking too much of UB- programmers need to work with the semantics implemented by their toolchain, not make up an intuitive/"sensible" meaning for their undefined program and then get mad when it doesn't work.

For example, if you want to scan through a bunch of memory, tell the language that's what you're doing. Is that memory at a fixed address? Tell the linker about it so it can show up as a normal global object in the program. Is it dynamic? Memory allocators fabricate new objects in the abstract machine all the time, perhaps your compiler supports an attribute that means "this function returns a pointer to a new object."

The solution is not just to shrug and say "do something sensible but potentially dangerous." It's to precisely define the operations available to the programmer, and then provide tools to help them avoid misuse. If an operation isn't in the language, we can add it! If it's too easy to mess up, we can implement sanitizers and static analyzers, or provide alternatives! Yelling about a supposed misreading of "undefined behavior" is never going to be anywhere near as effective.
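
Regarding the attribute mentioned above: GCC and Clang do provide one roughly along those lines. A sketch, with a hypothetical allocator name:

  #include <stddef.h>

  /* The malloc attribute tells the optimizer that the returned pointer
   * does not alias any other pointer the caller already holds, i.e. it
   * behaves like a pointer to a freshly created object. */
  __attribute__((malloc))
  void *arena_alloc(size_t n);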


One issue is that under the prevailing interpretation, the existing semantics is not reliable. You do not know when or if the compilers will take advantage of UB to completely change the semantics they are providing. That's not tenable.


That's not how it works. Taking advantage of UB doesn't change the semantics, it just exposes which behaviors were never in the semantics to begin with. Barring compiler or spec bugs, we do in principle know exactly when the compiler may take advantage of UB. That's the point of a document like the standard- it describes the semantics in a precise way.

To be fair, the existing semantics are certainly complex and often surprising, and people sometimes disagree over what they are, perhaps even to an untenable degree, but that's a very different thing from being unreliable.


The net result of your argument is that the language has no semantics. I write and test with -O0 and show that f(k)=m. Then I run with -O3 and f(k)=random. Am I required to be an expert on the C Standard and compiler development in order to know that, with no warning, my code has always been wrong? What about if f(k)=m under GCC 10, but now under GCC 10.1 that whole section of code is skipped? What you are asking programmers to do is to both master the fine points of UB (which is impractical) and look into the future to see what changes may be invisibly committed to the compiler code.


> What you are asking programmers to do is to both master the fine points of UB (which is impractical) and look into the future to see what changes may be invisibly committed to the compiler code.

I am asking programmers to understand and avoid UB, but I am not asking them to look into the future. Future compilers will still implement the same semantics- that's, again, the point of having a spec!

I don't disagree that avoiding C's UB unaided can be difficult, but that just means the solution is to make it easier- and that's exactly what I suggested above: "precisely define the operations available to the programmer, and then provide tools to help them avoid misuse."

And this isn't a new idea. People have been making progress in this area for a long time: better documentation of the rules, sanitizers, static analyzers, changes to the spec to remove some forms of UB, new languages that reshuffle things to make it harder or impossible to invoke UB, etc.


What indicates that the current semantics won't change next week? After all, it is completely up to the compiler, no?


Are you serious? Again: it's not up to the compiler at all, it's up to the spec. The spec indicates the current semantics won't change next week.

The semantics for non-UB code are completely fixed across optimization levels and compiler versions. Only code that invokes UB can break on these changes, and only because this code never had any specified semantics to begin with.


It is impossible to write C applications without invoking UB, and your theory that UB has no semantics is nonsensical. It may have no semantics that compilers currently feel they need to keep stable, but code that compiles and runs has semantics.


That's not what I (or the standard, or compiler writers, or programming language researchers) mean by "semantics."

From the perspective of defining and specifying a programming language, when we say "semantics" we mean the set of rules for an abstract machine, or a similar formalism. If those rules don't specify the result of an operation, it's like the machine gets stuck- like dividing by zero in a proof, there is no correct way to proceed. The behavior is undefined.

Nobody is disputing that compilers will produce something for code with undefined behavior. They're just saying that there is absolutely no useful way to rely on it, because nobody has agreed on or even decided what it should be (and always for some specific reason!)

If that makes it impossible to use the language, that's not UB's problem. It's the design of the language, and the quality of the tools that surround it. There are ways to increase your confidence that a C application never invokes UB, and they're getting better all the time. (There are also lots of new languages that try to solve this in various ways that C can't!)

Those are the solutions we have. "Just don't have UB in C" or "just make compilers more predictable" are not very effective by comparison.


> all of the tricks which make "malloc" work,

What are those, exactly? AFAIK, you can safely track memory addresses by storing them as intptr_t/uintptr_t.


The C standard says very little about how those types work. In particular, you can cast a pointer to one of them and then cast back to a pointer -- but only if you cast the exact same value back, and the intptr values are not guaranteed to be in any way meaningful.

In particular, casting a pointer to intptr_t, doing arithmetic on it, and casting back is not guaranteed to do anything useful. It almost certainly will, since most systems treat it as roughly the same as casting to char *, but the standard does not guarantee it.
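
What is guaranteed (C99 7.18.1.4, and only on implementations that provide intptr_t at all) is the pure round trip; a minimal sketch:

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      int x = 0;
      void *p = &x;
      intptr_t ip = (intptr_t)p;   /* conversion is allowed...            */
      void *q = (void *)ip;        /* ...and converting back must compare */
      printf("%d\n", p == q);      /* equal to the original: prints 1     */
      return 0;
  }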


> and casting back is not guaranteed to do anything useful

I believe it's implementation-specific, precisely so that malloc/free can be implemented in conformant C.

The only tricky part is how you can "bless" parts of a large memory block originally pointed to by void* (returned from mmap(), for example) to safely become ints and char[]s and structs...


Do you have an example of a situation in which you'd want to cast the result of arithmetic intptr_t values to a pointer? The situations I can think of off the top of my head would be better done as arithmetic between pointers.


Arithmetic on pointers in turn is only defined if the pointers point within the same object (or right past the end of that object).

One example of using intptr_t would be going from a pointer passed to free() to a metadata block for the memory that must be freed.


Oh, for instance, on some implementations there is a lot of interesting stuff just prior to the allocated block returned. Not exactly the pinnacle of elegance but it gets the job done.
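
A sketch of that layout, with made-up names; real allocators differ in the details:

  #include <stddef.h>

  struct alloc_header {
      size_t size;   /* size of the user-visible block that follows */
  };

  /* Step back over the header that sits just before the block handed out. */
  void my_free(void *p) {
      struct alloc_header *h = (struct alloc_header *)p - 1;
      /* ... put h (header plus block) back on a free list ... */
      (void)h;
  }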


The C standard is not the Bible; it is not written by an almighty god. As respectable as K&R are, they are humans who wrote a standard for their own needs, based on the state of the art of that time. Sadly C is the work of mortals...

Reading the standard trying to understand the way of god, like religious scholars do, is a pointless exercise. Modern compiler developers found that exploiting undefined behavior the way they do now leads to interesting optimizations; others found it reasonable, so now it is the standard.

I think the issue most people have now is that compilers use advanced solvers that are able to infer a lot from undefined behavior and other things, so UB is no longer just "it works or it crashes".


I don't see any utility in inventing a new reading of the standard. Getting everyone to agree on a new interpretation of a sentence can't possibly be easier than getting everyone to agree on a more clearly worded sentence. The actual thing you'd have to convince everyone of (the utility of the new consensus) and the people you'd have to convince (compiler writers, documentation authors) are the same in both cases.


The difference is that, once decided and written, no party (both old and new) can’t chime in with a new interpretation.


> The difference is that, once decided and written, no party (both old and new) can’t chime in with a new interpretation.

That double negative ("no party … can't") was accidental, right?


I think the reasoning here is flowing backwards.

The writer wants to believe that C is a well-designed language suitable for writing large programs (because programmers understandably use it that way; there's not really an alternative to C), and so people reading the spec and finding a minefield _must_ be reading the spec wrong. So many important programs are written in C, and so many of them, with a very strict reading of the C standard, can hit cases where their behavior is "undefined". This _is_ scary, if the C-language-lawyers are right!

The C language was originally largely descriptive, rather than prescriptive. Early "C" compilers disagreed on what to do in strange cases (e.g., one might wrap integer overflow, one might saturate, one might have wider ints). Even when using the less-chaotic "implementation defined behavior", behavior can still diverge wildly: `x == x + 1` is definitely `false` under some of those interpretations and maybe `true` in some of those interpretations.

However, the C spec clearly says that the compiler may "ignore the situation" that "the result is ... not in the range of representable values for its type"; it is "permissible" that `x == x + 1` is replaced with `false` despite the "actual" possibility that adding 1 to x produces the same value, if `+` was compiled to be a saturating add.

This has significant practical consequences, even without the "poisoning" result commonly understood of undefined behavior. Since the value is known statically to be `false`, that might be inlined into a call into a function. That function may _dynamically_ re-check `x == x + 1` and find that it is `true`; obviously that function doesn't have a `if (true && false) {` case, so it results in the function misbehaving arbitrarily (maybe it causes a buffer overrun to the argument of a syscall!).

'Intuition' does not make a programming-language semantics. You need to write down all the rules. If you want to have a language without undefined behavior, you need to write down the rules for what must happen, keeping in mind that many examples of undefined behavior, like dereferencing out-of-bounds pointers, _cannot_ be detected dynamically in C without massive performance costs. To detect if a pointer is out-of-bounds, you need to always pair it with information about its provenance; you need to track whether or not the object has been freed, or the stack-frame it came from has expired. Is replacing all pointers with fat-pointers indicating their provenance and doing multiple comparisons before every dereference the "right" way to compile C?


> The writer wants to believe that C is a well-designed language suitable for writing large programs [...], and so people reading the spec and finding a minefield _must_ be reading the spec wrong.

I think this is the core of the problem. There are languages that provide much more detailed, predictable behaviors, and the author wants to avoid moving to one.


This is interesting. Humans interpret standards and compilers interpret code, too :)

We have the benefits of a standard, portability and multiple compilers. But it also comes with duties.

BTW.

The C++ people seem to tackle some of the issues around widespread 'undefined behavior' and defined a lot of behavior:

  1) If a side effect on a scalar object is unsequenced relative to another side effect on the same scalar object, the behavior is undefined.
  i = ++i + 2;       // undefined behavior until C++11
  i = i++ + 2;       // undefined behavior until C++17
  f(i = -2, i = -2); // undefined behavior until C++17
  f(++i, ++i);       // undefined behavior until C++17, unspecified after C++17
  i = ++i + i++;     // undefined behavior
  2) If a side effect on a scalar object is unsequenced relative to a value computation using the value of the same scalar object, the behavior is undefined.
  cout << i << i++; // undefined behavior until C++17
  a[i] = i++;       // undefined behavior until C++17
  n = ++i + i;      // undefined behavior
Source: https://en.cppreference.com/w/cpp/language/eval_order

They replaced the "Sequence Point Rules" entirely with "Sequenced-before rules". Compiler implementers cannot choose what to do (implementation-defined) or neglect the issue (undefined behavior) - they must now act well-defined in many situations.


Reading error or not, the ship has sailed long ago. As the article notes, C99 formalized the current UB interpretation. What C89 or K&R said is more of a historical curiosity than of any real relevance today. I guess you could construct an argument that gcc should disable some optimizations when invoked with -std=c89, but I doubt anyone really cares at this point enough to justify the maintenance burden.

C is a minefield today, and that is the reality we must live with. You can't turn back time 25 years and change what has happened.


> C is a minefield today, and that is the reality we must live with. You can't turn back time 25 years and change what has happened.

I think it's doable to make some of the UB issues considerably less painful. E.g. define facilities to check for overflow safely, add facilities for explicit wrapping operations.
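
Some of that already exists as compiler extensions; for example, GCC and Clang provide checked-arithmetic builtins (not ISO C as of C99/C11). A sketch:

  #include <stdbool.h>

  /* __builtin_add_overflow stores the (possibly wrapped) result in *out
   * and returns true if the mathematical sum did not fit in an int. */
  bool checked_add(int a, int b, int *out) {
      return __builtin_add_overflow(a, b, out);
  }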


> C can accommodate significant optimization while regaining semantic coherence – if only the standard and the compilers stop a lazy reliance on a mistaken reading of “impose no requirements”.

That wouldn't be enough, because, sadly for us, rewriting history is only an option in Git repositories and speculative fiction. On this timeline, it doesn't much matter how the standard should have been interpreted (I'll refrain from opining on that), what matters is how the standard was interpreted, and its influence on the behavior of major compilers.

Given the current situation, it seems to me that reaping the benefits of such an enterprise would require getting everyone on board with a mode based on the new interpretation, leaving it turned off by default, and then basically never actually using it for fear of breaking compatibility with toolchains that have to use an older version of the compiler for a good decade or two while we wait for them to age out of existence.

I'm not sure the world actually has much practical use for an unapproachable ivory tower dialect of C. Personally, I'd much rather have a go at a language like Zig that's taken upon itself to dream even bigger.


> So, “i << 32” buried in a million lines of database code makes the entire program junk – supposedly as an “optimization”!

I think the author has it wrong here because they're assuming lines and order have any meaning when the program is compiled all together. Imagine a compiler where (I know that this is not actually possible to do) undefined behavior was a compiler error.

You would sound crazy for complaining that "a syntax error in line 4325 makes the whole program uncompilable."

It’s not like running C programs have any idea that they are hitting undefined behavior. It’s just that the generated assembly is allowed to assume that you know what you’re doing.

    i << runtime_val
Can just generate the shift without worrying that it will blow up, and then propagate the knowledge that runtime_val's domain is [0,31] up to the optimizer.


I'm afraid the author themselves is misreading the definition of undefined behavior. Undefined behavior is not "behavior upon use of a nonportable or erroneous program construct". That rephrasing completely changed the meaning. Undefined behavior is, as the C standard states, "behavior [, ...,] for which the Standard imposes no requirements". The whole ", upon use of...," part is just exemplifying situations in which undefined behavior can occur. The standard will sometimes say that a certain construct results in undefined behavior but more importantly any construct for which the standard does not specify a certain behavior has (what else?) undefined behavior.


True. I think the point of the author is that the C standards group is not doing their job by leaving so much room for compilers to interpret undefined behavior.


I think one of the best attempts to solve this is the attempt to classify undefined behavior into "bounded" and "critical" UB, which is a distinction that hasn't gained as much traction as I'd like.

Something like 1<<32 is "bounded" UB, and can't result in anything happening to your program... it is permitted to result in indeterminate values, it is permitted to trap, but that's basically it.

Critical UB is stuff like writing to a const object, calling a function pointer after casting it to an incompatible type, dereferencing an invalid pointer, etc. Basically, anything goes.

This is part of the "analyzability" optional extension which I wish would gain more traction.


> Something like 1<<32 is "bounded" UB, and can't result in anything happening to your program... it is permitted to result in indeterminate values, it is permitted to trap, but that's basically it.

That is what implementation-defined behavior is for. Feel free to advocate for changing the standard accordingly.


> Feel free to advocate for changing the standard accordingly.

I am, in fact, describing the existing C standard, not advocating for changes. Please refer to Annex L "Analyzability", which defines the terms I used: "bounded undefined behavior" and "critical undefined behavior". Note that this section is conditional... it is not widely adopted. I am advocating for increased adoption of this part of the standard, or barring that, revisions that would make adoption more palatable.

And in general, "If you don't like it, advocate changing the standard" is not a useful or insightful response. It is completely reasonable and normal to complain about something without trying to fix it.


You're right, apologies.


Let's just agree that C is not a good programming language according to modern standards: basic integer types are a complete mess with varying sizes and confusing implicit casts. The advantages of null terminated strings do not matter anymore, but the disadvantages matter more and more. The compilation model with headers is a mess compared to module systems. The C Preprocessor is an abomination. The complexity of the language is surprising.


Also relevant:

What every compiler writer should know about programmers, or: "Optimization" based on undefined behaviour hurts performance

by M. Anton Ertl

https://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_20...


Thank you for linking this paper! I was trying to find it and link it myself.

I have found that it is true, actually. I had to write [1] to get portable sane behavior with signed arithmetic, and it obviously will slow my code down.

Exploiting UB hurts conscientious programmers who actually try to avoid UB.

[1]: https://git.yzena.com/Yzena/Yc/src/branch/master/include/yc/...
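
One common portable workaround (not necessarily what the linked header does) is to do the arithmetic in unsigned types, where wraparound is defined, and convert back:

  #include <stdint.h>

  /* Unsigned arithmetic wraps by definition; converting the result back
   * to int32_t is implementation-defined rather than undefined, and does
   * the expected thing on common two's-complement implementations. */
  static int32_t wrapping_add_i32(int32_t a, int32_t b) {
      return (int32_t)((uint32_t)a + (uint32_t)b);
  }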


Commenters here seem to be missing the core thesis of this article. It's not about what the standard literally means; it's about its spirit -- and the reason for its spirit.

The issue is "undefined behaviour" should never have been interpreted this extremely. The standard may be silent on how extreme, but it is implausible to suggest that the standard was actually written to enable this.

Compiler writers for C dismiss severe issues which occur in compiling any program with undefined behaviour; issues which would render any modern language a bad joke.

Compiler writers are using this as a perverse shield to simply fail to optimize for correctness (or to provide the means to), and to let themselves optimize only for performance.

Are we really saying the standard supports this sleight-of-hand? It seems more like using the Second Amendment to murder someone.


Personally, I think that arguing that those who define and implement a standard don’t understand one of the most fundamental aspects of said standard is going to be an uphill battle.

You could argue that they've lost their way, and the article flirts with this, but the path forward is the hard part, and IMHO rings a bit hollow: it's asserted that these rules aren't needed for performance, but no evidence is given, and what similar evidence we do have (compiling on lower optimization levels) doesn't seem to support this thesis. You could argue that the kernel, which turns off strict aliasing, is plenty performant without it, and that's a decent argument, but it's not clear that it wouldn't be even faster with it, and it's much harder to empirically test this than removing the flag, since it will miscompile.


Different code depends on different optimizations. A loop on an int** might benefit a lot from aliasing optimizations, because the compiler will assume that a[i] will remain the same after writing to a[i][j]. Other code may not benefit at all.

Likewise that loop may not benefit from signed overflow; instead an initialization loop that, by way of multiple level of macros, ends up doing a[i]=b[i]*1000/100, might become twice as fast if signed overflow rules let the compiler rewrite the assignment as a[i]=b[i]*10.
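
A sketch of both situations (illustrative loops, not from any particular codebase):

  /* Strict aliasing: a store through a[i][j] (an int) cannot modify the
   * int* elements of a, so the compiler may keep a[i] in a register
   * across the inner loop instead of reloading it. */
  void zero_rows(int **a, int n, int m) {
      for (int i = 0; i < n; i++)
          for (int j = 0; j < m; j++)
              a[i][j] = 0;
  }

  /* Signed overflow: assuming b[i] * 1000 cannot overflow allows the
   * compiler to rewrite the expression as b[i] * 10. */
  void scale(int *a, const int *b, int n) {
      for (int i = 0; i < n; i++)
          a[i] = b[i] * 1000 / 100;
  }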


Wang et al. tried this in an experiment and found no serious wins.


For reference, [0] appears to be the referenced paper. The relevant passage:

> To understand how disabling these optimizations may impact performance, we ran SPECint 2006 with GCC and Clang, respectively, and measured the slowdown when compiling the programs with all the three -fno-* [-fno-strict-overflow, -fno-delete-null-pointer-checks, and -fno-strict-aliasing] options shown in Figure 9. The experiments were conducted on a 64-bit Ubuntu Linux machine with an Intel Core i7-980 3.3 GHz CPU and 24 GB of memory. We noticed slowdown for 2 out of the 12 programs, as detailed next.

> 456.hmmer slows down 7.2% with GCC and 9.0% with Clang. The first reason is that the code uses an int array index, which is 32 bits on x86-64, as shown below.

    int k;
    int *ic, *is;
    ...
    for (k = 1; k <= M; k++) {
        ...
        ic[k] += is[k];
        ...
    }
> As allowed by the C standard, the compiler assumes that the signed addition k++ cannot overflow, and rewrites the loop using a 64-bit loop variable. Without the optimization, however, the compiler has to keep k as 32 bits and generate extra instructions to sign-extend the index k to 64 bits for array access. This is also observed by LLVM developers [14].

> Surprisingly, by running OProfile we found that the most time-consuming instruction was not the sign extension but loading the array base address is[] from the stack in each iteration. We suspect that the reason is that the generated code consumes one more register for loop variables (i.e., both 32 and 64 bits) due to sign extension, and thus spills is[] on the stack.

> If we change the type of k to size_t, then we no longer observe any slowdown with the workaround options.

> 462.libquantum slows down 6.3% with GCC and 11.8% with Clang. The core loop is shown below.

    quantum_reg *reg;
    ...
    // reg->size: int
    // reg->node[i].state: unsigned long long
    for (i = 0; i < reg->size; i++)
        reg->node[i].state = ...;
> With strict aliasing, the compiler is able to conclude that updating reg->node[i].state does not change reg->size, since they have different types, and thus moves the load of reg->size out of the loop. Without the optimization, however, the compiler has to generate code that reloads reg->size in each iteration. If we add a variable to hold reg->size before entering the loop, then we no longer observe any slowdown with the workaround options.

> While we observed only moderate performance degradation on two SPECint programs with these workaround options, some previous reports suggest that using them would lead to a nearly 50% drop [6], and that re-enabling strict aliasing would bring a noticeable speed-up [24].

[0]: https://pdos.csail.mit.edu/papers/ub:apsys12.pdf

[6]: https://lists.gnu.org/archive/html/autoconf-patches/2006-12/...

[24]: (dead link, doesn't appear to be available on the Wayback Machine) https://www.linaro.org/blog/compiler-flags-used-to-speed-up-...


> If we change the type of k to size_t, then we no longer observe any slowdown with the workaround options

Basically, in that case the benefits of the optimization disappear once you fix the code.


I mean, that's kind of a tautological statement; if you change the code to either eliminate the need for the optimization or manually implement it, of course the benefits of the optimization disappear. That would apply to most, if not all, optimizations.


The spirit of the article and your comment seem to underrate the effort required to come up with something like the standard for a C compiler.

It's not a "perverse shield". In C, "optimize for correctness" is the programmer's job. I don't want to sound dismissive, and I don't pretend you to write a compiler, but please tell me how would you word a global standard for generating compilers that can compile for hundreds of architectures, with the main focus of being portable and generating the most performant code (no runtime checks).

You might come up with a standard similar to something like Rust (which does not have a standard yet, and has only a single compiler "brand"). But Rust is super-difficult to understand and part of its safety depends on runtime checks, including runtime checks on ownership like RefCell, impacting performance. The same goes for Ada and other runtime-checked languages.

Or you'll end up with something like a standard for C, in which you assume the programmer can be somewhat trusted allowing the compiler to generate faster code, and having to deal with little caveats that depends on certain architectures.


I am sympathetic with the view that compiler writers are trapped by the architectures they have to support.

However, I think there's just a philosophical difference here about what compilers are for: are they responsible for emitting correct programs; or are they for literal transpiling?

Regardless of the history which has led to the latter outcome for C, I have to agree with the author that I think K&R's original vision wasn't quite so accommodating.

Perhaps ignorantly, I just have to imagine there is something a little better than the present attitude.

Could there not be a --halt-on-all-undef with a serious attempt to realise that in as many respects as possible?


> emitting correct programs; or are they for literal transpiling?

I guess the former. I am sure you are familiar with the "garbage-in, garbage-out" concept. The rules are set in the standard. If you feed the compiler code outside the rules of the standard (even if syntactically correct), you cannot expect correct programs 100% of the time. The standard (somehow) guarantees that code that follows the rules should translate into correct programs. Some of the rules tell the compiler "do this", others say "do something, but specify what you are doing", and others "do whatever you want, including exploding, I don't care". Many of these rules are this way for the sake of generating optimized code.

These are the rules.

Are these rules too many to remember? Unfortunately yes, they are.

> I just have to imagine there is something a little better than the present attitude.

My crazy hope is that someday runtime checks will be done at silicon level. That is, compiler emits instructions declaring "this is an array of this dimension", and every time this array is accessed with the proper instructions (and the CPU is somehow aware of this), it triggers an interrupt when accessed out of bounds, for example.

> Could there not be a --halt-on-all-undef with a serious attempt to realise that in as many respects as possible?

It might not be possible for every piece of code. It will have to stop the compilation (or execution with runtime checks) at every int increment/operation. So I think that if rules cannot be generally applicable then they shouldn't be applicable at all.


Two points:

The notion that compilers that encounter undefined behavior are allowed to generate any code they want is a new interpretation, for some value of "new". I can't remember encountering such an interpretation being used by compiler writers to justify something they wanted to do until sometime after 2000.

The notion that John Regehr has (quoted in the article) that undefined behavior implies the whole execution is meaningless is not supported by the language of either the C89 or C99 standard, at least by my reading. The C89 standard has a notion of sequence points. Wouldn’t all sequence points executed before undefined behavior is encountered be required to occur as if the undefined behavior wasn’t there? It would seem so:

From the C89 standard: 2.1.2.3 Program execution

The semantic descriptions in this Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant. Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

The C99 standard has nearly identical language:

5.1.2.3 Program execution 1 The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant. 2 Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects,11) which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.


> Wouldn’t all sequence points executed before undefined behavior is encountered be required to occur as if the undefined behavior wasn’t there? It would seem so:

No. Code optimization is a series of logic proofs. It is like playing Minesweeper. If a revealed square shows a count of 1 and you have already located its one neighboring mine, then you know the other 7 neighbors are safe. In other Minesweeper situations you make a proof that is much more complex and allows you to clear squares many steps away from a revealed mine. If you make a false assumption about where a mine is, via a faulty proof, then you explode.

The compiler is exactly like that. "If there is only one possible code path through this function, then I can assume the range of inputs to this function, then I can assume which function generated those inputs..."

You can see how the compiler's optimization proof goes "back in time" proving further facts about the program's valid behavior.

If the only valid array indexes are 0 and 1 then the only valid values used to compute those indexes are those values that produce 0 and 1.
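
A hedged sketch of that kind of deduction, with made-up names: the store makes any i outside 0..1 undefined, so the optimizer is entitled to treat the later range check as unreachable and drop it.

    int table[2];

    int put(int i, int v) {
        table[i] = v;          /* only i == 0 or i == 1 avoids UB here */
        if (i < 0 || i > 1)    /* ...so a compiler may delete this check, */
            return -1;         /* even though it was written as a guard */
        return 0;
    }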

This isn't even program execution. In many cases the code is collapsed into precomputed results which is why code benchmarking is complicated and not for beginners. Many naive benchmark programs collapse 500 lines of code and loops into "xor eax,eax; ret;" A series of putchar, printf and puts calls can be reduced to a single fwrite and a malloc/free pair can be replaced with an implicit stack alloca because all Standard Library functions are known and defined and there is no need to actually call them as written.


The standard (in the prevailing reading of the UB section, and also in practice) places no requirements on the behavior of programs containing UB. None of the paragraphs you quoted have any bearing on how an UB-laden program behaves.


No, sequence points aren't relevant because they occur at runtime. "Time traveling UB" happens in the compiler, typically in optimization passes and can cause otherwise valid code to exhibit completely different behavior than it would if the UB didn't exist.
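
A minimal sketch of that time travel, with hypothetical names: the unconditional dereference lets the optimizer assume p != NULL for the whole function, so the diagnostic that textually (and in sequence-point order) precedes the UB can be removed as dead code, because the reasoning happens at compile time rather than at runtime.

    #include <stdio.h>

    void store42(int *p) {
        if (p == NULL)
            fprintf(stderr, "store42: null pointer\n");  /* may vanish */
        *p = 42;  /* executed on every path, so the compiler may assume p != NULL */
    }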


I think the real problem stems from the mismatch between modern processors and the processors C was originally designed for.

C programmers want their code to be fast. Vanilla C no longer gives them the tools to do that on a modern processor. Either the language needs to be extended or the compiler needs to get more creative in interpreting the existing language. The latter is the least disruptive and it doesn't stop judicious use of the former.

So, in short, UB is what gives room for the compiler to be more creative without the programmer having to change their code. It wasn't a reading error, it was an opportunity the compiler devs enthusiastically embraced.


When compiler writers get "creative" with C undefined behavior, programming in C no longer produces predictable results.

> least disruptive

Like starting to optimize away loop checks that can "never happen" because signed integer overflow is UB, suddenly changing the behavior of programs that were fine for years?

I wish I could just fence off this insanity by never starting another project in C. Unfortunately, C is ubiquitous in the ecosystem so all of us are stuck cleaning it up.


> Like starting to optimize away loop checks that can "never happen" because signed integer overflow is UB, suddenly changing the behavior of programs that were fine for years?

Yeah. Not doing that on modern processors is actually quite disruptive.

Here:

    for(i = offset; i < (offset + 16); i++) {
        arr[i] = i + 32;
    }
What C compilers currently do is, in line with the standard, ignore the case that offset + 16 might overflow. This makes the loop eligible for unrolling, and depending on the specifics of the math inside the loop, the compiler can do a lot to pre-calculate things because it knows this is happening exactly 16 times.

If, instead, we force compilers to think about the fact that offset + 16 could have some implementation-defined meaning like wrapping, then all bets are off & we have to throw a bunch of optimization opportunities out the window.

Lots and lots of hot, tight loops which can currently be compiled into something suited to the preferences of modern CPUs would instead have to be compiled naively, to allow for the possibility of something that largely wasn’t happening, happening.

Most people write most loops this way, never expecting or intending to overflow anything. Most loops are benefitting from this optimization. A lot of code would get slower, and programmers would have to do a lot more fragile hand-unrolling of operations to get that performance back. And they’d need to update that more often, as whatever the optimal “stride” of unrolling changes with the evolution of CPU pipelines.

It’s slower code and more work more often for more people, to satisfy a minority use-case that should really just have its own separate “please wrap this” construct.


Well-defined integer overflow would not preclude loop unrolling in this case. One simple alternative would be for the compiler to emit a guard, skipping unrolling in the case that (offset+16) overflows. This guard would be outside the unrolled loop. Furthermore, unsigned values are often used for indices (the unsigned-ness of size_t pushes programmers in that direction) and unsigned overflow is well-defined, so any compiler engineer implementing unrolling should be able to emit such a guard so that the optimization can be applied to loops with unsigned indices.
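
For concreteness, here is a hedged sketch of that guard idea, written back as source with made-up names: the check runs once, outside the loop, and only the provably non-overflowing case gets the body that is eligible for unrolling/vectorization, so defined wrapping would not by itself forbid the optimization.

    #include <limits.h>

    void fill_guarded(int offset, int *arr) {
        if (offset <= INT_MAX - 16) {
            /* offset + 16 cannot overflow: exactly 16 iterations,
               eligible for unrolling/vectorization */
            for (int k = 0; k < 16; k++)
                arr[offset + k] = offset + k + 32;
        } else {
            /* wrap-capable case: under the hypothetical wrapping semantics
               the bound offset + 16 would be negative, so the original loop
               would run zero times; a plain scalar fallback would go here
               in the general case */
        }
    }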


> Well-defined integer overflow would not preclude loop unrolling in this case. One simple alternative would be for the compiler to emit a guard, skipping unrolling in the case that (offset+16) overflows.

To what end?

        for(i = offset; i < (offset + 16); i++) {
            arr[i] = i + 32;
        }
like most loops in most programs isn't designed to overflow. The program isn't any more correct for emitting two translations of the loop, one unrolled and one which is purely a bugged case anyways.

Changing the way the UB manifests while altering the nature of the optimization hasn't actually fixed anything at all here. All this would seem to accomplish would be to increase pressure on the icache.


It's not designed to overflow, and automobiles are not designed to crash, but airbags are good engineering anyways


When loops that aren’t designed to overflow cheerfully wrap, they mostly just execute millions of unintended iterations reading or writing before the beginning of arrays they were indexing.

That’s not an airbag, that’s a crumple zone that rams the steering column through your sternum to avoid the engine crushing your legs. You’re still dead, we’ve just uselessly shuffled the details around.


> If, instead, we force compilers to think about the fact that offset + 16 could have some implementation-defined meaning like wrapping, then all bets are off & we have to throw a bunch of optimization opportunities out the window.

Uh huh. If `i` is declared as `unsigned int` instead of `int`, then overflow is defined and the compiler can't apply those optimizations. And yet the world doesn't end and the sun will still rise tomorrow...


The world doesn't end, but in the "int" case you get nice vector code and in the "unsigned int" case you get much less nice scalar code: https://gcc.godbolt.org/z/cje6naYP4


Yes, that is true. The proper way for a compiler to handle this, would be to add a single overflow check before the loop, which branches to a scalar translation of the loop. Most realistic code will need a scalar version anyway, to deal with the prolog/epilog of the unrolled loop for iteration counts that aren't multiples of the unrolling factor.

Surely you agree that treating unsigned overflow differently from signed does not make any sense semantically? Why is signed overflow UB, but unsigned wrapping, and not the other way around? The terms 'signed' and 'unsigned' denote the value range, not "operations on this type might overflow/will never overflow".


To a mathematician, wrapping 2^n back to 0 is a lot more intuitive than wrapping 2^(n-1) to -2^(n-1). Mathematically the two systems are largely equivalent. They are equivalent when considering addition and multiplication. Both implement arithmetic modulo 2^n.

However, the canonical representatives of residues modulo 2^n run from 0 to 2^n - 1. Hence, if you were going to let one kind of integer overflow wrap, and not the other, C made the correct choice.

That leaves out the question of whether the difference between the two cases is significant enough to have a difference in how overflow works.


> The proper way for a compiler to handle this, would be to add a single overflow check before the loop, which branches to a scalar translation of the loop. Most realistic code will need a scalar version anyway, to deal with the prolog/epilog of the unrolled loop for iteration counts that aren't multiples of the unrolling factor.

That's true, I agree that that would be a clever way to handle this particular case. It would still happily invoke undefined behavior if the indices don't match the array's length, of course. Many assumptions about the programmer knowing what they are doing goes into the optimization of C code.

> Surely you agree that treating unsigned overflow differently from signed does not make any sense semantically?

Yes. Silently wrapping unsigned overflow is also very often semantically meaningless.


Clang uses vectors for both. https://gcc.godbolt.org/z/G997Ge9KT


Yes, with lots of extra ceremony around it. (More than is needed, since it doesn't seem to realize that it will always process exactly 16 loop iterations.)

Since you've posted a lot along the lines of "these optimizations don't even make a difference", you might want to see if Clang's safer-looking version is as fast as GCC's.


It's not an interesting optimization. Micro-benchmarks are of limited utility. The extra complication is to protect the code from flying off and writing on random memory. Well worth it.


> And yet the world doesn't end and the sun will still rise tomorrow...

No, you just get much slower, non-vectorized code because the compiler is forced to forgo an optimization if you use unsigned int as the loop bound (EDIT: tom_mellior's reply illustrates this extremely well: https://gcc.godbolt.org/z/cje6naYP4)

Which is precisely the point: forcing a bunch of existing code with int loop bounds, which currently enjoys optimization, to take on the unsigned int semantics and get slower, is just going to piss off a different (and probably larger) set of people than the "compilers shouldn't assume that undefined behaviour can't happen" set of people.

It's a tradeoff with some big downsides; this isn't the obvious win the anti-optimization crowd pretends it is.


And switch the i to a size_t and get vector code without the possibility of writing to random memory because your int overflows and GCC wants to pretend it cannot.

This is a poorly written loop. C's design model is that if it is not critical, we don't care, and if it is, the programmer should fix it so optimization can work. https://gcc.godbolt.org/z/ErMP4cn6s


You changed writes to indices offset..offset+15 to writes to indices 0..15.


the offset is used to compute the index, not the count.


But you're not using the offset to compute the index in bar().

    void foo(int offset, int *arr) {
        for(int i = offset; i < (offset + 16); i++) {
            arr[i] = i + 32;
        }
    }
If you call this with offset = 100, the arr[i] in the loop will write to arr[100], arr[101], ..., arr[115].

    void bar(unsigned int offset, int *arr) {
        if(offset+16 < offset)fail();
        for(size_t i = 0; i < 16; i++) {
            arr[i] = i+ offset + 32;
        }
    }
If you call this with offset = 100, the arr[i] in the loop will write to arr[0], arr[1], ..., arr[15].


Fixed it, but same kind of result

https://gcc.godbolt.org/z/hx1zjE5xW


Nice. If you like, could you explain again what the perceived difference is to adding

    if (offset >= INT_MAX - 16) fail();
in foo()?

(I mean, besides the fact that the size of the buffer pointed to by arr is highly unlikely to agree exactly with either INT_MAX or UINT_MAX.)


I wanted to show that we don't need UB justified deletes to get good code generation. There was no need to break all that working code when we could have just told people that size_t counters worked better in loops on x86-64 than ints. A lot of C optimization could work that way - relying on cooperation between the compiler and programmers. Java can't do that because Java programmers rely on complex abstractions that need a lot of compiler work to run fast.


> A lot of C optimization could work that way - relying on cooperation between the compiler and programmers.

That is precisely how it works already. The reason your code has no bounds checks is exactly because the compiler can assume that you have done your part and ensured that all indices are in bounds. This is what "the compiler can ignore UB" is all about.

The signed integer kerfuffle is just the same: The compiler assumes your cooperation in ensuring, beforehand, that your signed arithmetic never overflows. Its part of the bargain is generating the best possible code it can. Another part of its bargain is offering you the -fwrapv flag to communicate more about your expectations. A third part of the bargain is offering you sanitizers that can inform you that you have done something you probably didn't want.


The problems with that argument are:

1) The expectations changed with no notice. You can say it was always that way, but that's just not correct. The bounds check worked and then didn't, no matter what you think the standard "always said" (and the UB experts on WG14 often find it impossible to say exactly what provisions mean, so claims that all this was ever clear are also wrong).

2) Deleting overflow checks reduces the power of the language. The supposed workarounds are painful and have edge cases.

3) The example, and others, show that much "we assume UB can't happen" optimization is unnecessary. You make the language more difficult to use, more prone to unpleasant surprise, and in return you provide an "optimization" that could easily be produced by other means. You're insisting on using a hammer as a fork and getting annoyed when people don't find it convenient.


> to satisfy a minority use-case

Every single C program is potentially in that "minority". Nobody can tell when the compiler writers are going to change up behavior on you.

It doesn't matter how carefully the codebase has been written, whether you've had `-Wall -Wextra` enabled. What was fine at one time is no longer fine today. Any C program may suddenly start exhibiting misbehavior from innocuous to catastrophic to horrendously insecure.

It's psycho, maddening, irresponsible. And the only way to deal with it is to purge C programs compiled by these psychotic compilers from our systems.


> Every single C program is potentially in that "minority". Nobody can tell when the compiler writers are going to change up behavior on you.

This is ridiculously hyperbolic, and bringing unthinking emotional responses like "psycho" and "irresponsible" only obscures the fact that there are very serious engineering tradeoffs involved in trying to balance "not surprising people whose code contains an assumption that some case is going to behave a certain way when by-the-standard-as-written that case can be ignored" and "not making everything with a hot loop 8x slower because we can't assume anything about loop bounds any more", and that compilers that do the latter are unlikely to prove popular with a lot of people either.


> minority use-case

The amount of code that looks like this in a big enough hot loop to make a difference is negligible. Can you provide even one real-world example where this makes a difference, i.e. not some microbenchmark? The amount of code that can break as a result of signed overflows being UB, on the other hand, is huge.

> programmers would have to do a lot more fragile hand-unrolling of operations to get that performance back

There are much easier ways to do this, e.g. by using an assert wrapper around __builtin_unreachable. Alternatively, an unsafe_int_t could be defined that gives the optimize-able behavior. The important thing is to make it opt-in; sensible defaults matter.
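
A hedged sketch of that opt-in approach (ASSUME is a made-up name; __builtin_unreachable is a GCC/Clang extension): in debug builds it checks, in release builds it hands the optimizer the same "no overflow here" fact it would otherwise derive from overflow-is-UB, but only where the programmer asks for it.

    #include <assert.h>
    #include <limits.h>

    #ifdef NDEBUG
    #  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
    #else
    #  define ASSUME(cond) assert(cond)
    #endif

    void fill(int offset, int *arr) {
        ASSUME(offset >= 0 && offset <= INT_MAX - 16);  /* no wrap possible */
        for (int i = offset; i < offset + 16; i++)
            arr[i] = i + 32;
    }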


> Can you provide even one real-world example where this makes a difference, i.e. not some microbenchmark

Sure. I don't even have to leave this thread to find one: https://news.ycombinator.com/item?id=27223954 reports a measurable speed impact to PostgreSQL when compiled with -fwrapv, which rules out the exact optimization in question.

This shouldn't be surprising; loops are extremely common and superscalar processors benefit enormously from almost anything other than a naïve translation of them.

Here's -fwrapv cutting performance of a function in half in Cython vs the non-fwrapv compilation: https://stackoverflow.com/questions/46496295/poor-performanc...


Sure, if you leave out error checks code runs faster. Compile mutexes to no-ops and get an even better speedup.


The optimizations impacted by -fwrapv have nothing whatsoever to do with “leaving out error checks”


I don't know how you can say that.

    z = x + y;
    if (z < x) overflowerror();
    /* ... important computation ... */

and

    for (i = start; i >= start && i < n; i++) docomputation(x[i]);

etc.


> The amount of code that can break as a result of signed overflows being UB, on the other hand, is huge

C++ recently decided to not make signed overflow defined, despite having the explicit opportunity to do so. Here is the reasoning:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p090...

> Performance concerns, whereby defining the behavior prevents optimizers from assuming that overflow never occurs;

> Implementation leeway for tools such as sanitizers;

> Data from Google suggesting that over 90% of all overflow is a bug, and defining wrapping behavior would not have solved the bug.

Presumably, data from Google disproves your assertion that the amount of code that breaks due to signed overflow being UB is huge.


This is a common argument, but it is flat out wrong. If, as claimed, compilers have complete freedom of choice about how to handle UB, they have the choice to e.g. make this behavior depend on the processor architecture. Compiler developers are choosing to use UB to make C semantics defective.


> Vanilla C no longer gives them the tools to do that on a modern processor

Can you elaborate on this point?


C was created during a time when instructions were executed linearly with no vectorization, memory was a flat space with no CPU caches, and there wasn’t a branch predictor that may or may not execute the correct program branch in advance. The list goes on, but the rest is beyond my scope.

C was designed for a now obsolete computer architecture model and over the years this old model has essentially become an abstraction that sits between C and the CPU. As such, C programmers aren’t really programming in their CPU’s domain anymore and C, by default, lacks the commands necessary to effectively utilize these new developments. It is left up to the compiler to translate C code from the old architecture abstraction into efficient machine code for our new machines.

For a more in depth look into this topic I recommend you check out (0).

(0) - https://queue.acm.org/detail.cfm?id=3212479


Thanks for the link. Excellent article. Those are all great points and I like having them summarized in one place, because I am indeed a little behind in my modern architecture theory.

However it has always been acknowledged that C was by definition a sort of simplified computing model. For example, when I first learned C, the 8086 architecture was popular, but it was competing with many others, and it was already dramatically different from the PDP-11 virtual machine you describe. The 286, 386 and so on had funky indexing modes and address space weirdness, but so did just about every other processor of the time.

There is likely never to be a single unified architecture that anyone agrees on, and the developers of C understood this, certainly by the time the 1989 standard was hammered out. So compiler directives, pragmas, and maybe even language extensions were expected on a per CPU basis, no?


That article is nonsensical. Seymour Cray and colleagues and Tomasulo and colleagues invented ILP, branch prediction, etc. in the 1960s, before C was thought of. The PDP11 is much more similar to modern x86s than to the weird architectures of the 1970s.


C originated as an evolution of BCPL, with B as a stopgap, as a means to rewrite UNIX.

Thing is, BCPL's main goal was to be an interim solution to bootstrap CPL, nothing more than that.

https://en.wikipedia.org/wiki/CPL_(programming_language)

Unfortunately, UNIX's success means we got stuck with something that shouldn't have been more than a portable macro assembler.


> now obsolete computer architecture model

I think this is kind-of dismissive. You seem to assume that everyone is programming modern x86 machines. What about embedded? My little PIC32/STM32/ATMEGA parts do not have caches or predictors, and have a flat memory space.

Thank god there is the C standard, which even today, after more than 30 years, gives compilers a clear set of rules allowing them to emit code for the thousands of architectures used in the dozens of embedded devices we have in our houses/offices/industries, and, not only that, allows squeezing the maximum performance out of these microcontrollers for a language that is not plain assembler.


Ah, I didn’t mean to come off as dismissive. My bad. When I wrote my comment I only had consumer hardware on my mind. However, I realize that C isn’t simply limited to this new modern hardware either.


The Chisnall article is a tutorial in incorrect Computer Architecture. PDP11s had caches by the 1970s and always had memory management. ILP was invented in the 1960s and has nothing to do with C. Branch predictors were invented in the 1960s too, and one of the first machines C was ported to was the IBM370, which had super sophisticated ILP. Etc.


I was unaware. Thank you for the correction! Do you have any further reading on this topic?


I am sorry to say, I do not have a good short reference. There has to be one, though, I hope. The Hennessy/Patterson books are the standard intros.



ILP, sheesh



Even on modern processors, an ADD instruction does not corrupt memory. The C standard, in declaring that an integer overflow results in all-bets-are-off UB, is not enabling compilers to provide valuable optimizations.


It would be nice if that were true, but it's not. The ability to assume incrementing a counter won't overflow makes range analysis possible in many cases where it otherwise wouldn't be. Because of this the compiler can vectorize, since it can assume it knows how many times the loop will run.

The performance differences are not small.

You can also tell some compilers to treat signed integer overflow as wrapping-- but people don't usually do this because the optimizations you lose are valuable.
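
A sketch of the range-analysis point (cf. the "i <= N" example in the LLVM blog post linked elsewhere in this thread): with signed i and overflow as UB, the compiler may assume the loop runs exactly N+1 times for N >= 0, which is what licenses unrolling and vectorizing it; with defined wrapping, N == INT_MAX would make it an infinite loop, so the trip count is no longer known.

    void zero_prefix(float *a, int N) {
        for (int i = 0; i <= N; ++i)   /* trip count N+1 only if i can't wrap */
            a[i] = 0.0f;
    }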


I bet the performance differences are minimal except perhaps on some contrived cases. The compiler should not assume things it doesn't know. But show me a benchmark.


There is no guarantee that an addition in C is even going to convert into an add instruction, though. Actually, on x86 it's more likely to turn into a lea because this optimizes port usage better.


This was posted by a user with a name closely matching the domain (perhaps the original author?) 16 hours prior, and flagged: https://news.ycombinator.com/item?id=27215697

When I stumbled across it last night, I couldn't understand why that would be. The content seemed good enough for readers on here, and this one's placement on the 2nd page of HN seems to confirm that. What's going on?


Good question. I'd love to know. Is there any way to find out?


Specifying that anything can be done in the presence of UB is a poor specification. The word "specify" is pretty much the opposite of "anything."

Perhaps compilers should delete all scopes with UB: much more UB would be purged from code as a result (programmers would be forced to enable compiler errors on UB).


Compilers generally do delete scopes where they can detect that UB is certain (assuming that it can't happen and thus that those sections are unreachable code) - but they can't detect and delete all scopes where UB is merely possible given certain input data; that would run into the undecidability of the halting problem and all that.
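
A small illustration of what "delete scopes where UB is certain" means in practice (made-up example): on the branch below the UB is unavoidable once the branch is entered, so a compiler may treat the whole branch as unreachable and drop it.

    int deref_or_zero(int *p) {
        if (p == NULL) {
            return *p;   /* certain UB if this branch runs: may be removed entirely */
        }
        return 0;
    }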


It's easy to pick on undefined behavior in C when you focus on the more gratuitous undefined behaviors such as signed overflow or oversized shifts. I'm not certain why these are undefined behavior instead of implementation-defined, but my suspicion is that these caused traps on some processors, and traps are inherently undefined behavior.

Instead, if you dislike undefined behavior, I challenge you to come up with workable semantics for non-gratuitous scenarios, of which I'll lay out three. Remember, if it's not undefined, then there are some defined semantics that need to be preserved, even if implementation-defined, so you should be able to explain those semantics. If you can't find such semantics, then maybe undefined behavior isn't such a bad thing after all.

The first case is the compiler hint. A function marked _Noreturn returns. Or you alias two pointers marked restrict. Remember that the entire point of these hints is to permit the compiler to not generate code to check for these scenarios.

The second case is uninitialized memory. You've probably already come up with such semantics, so I'll point out an optimization scenario that you probably don't object to but that your semantics likely didn't cover:

  static int y; /* uninitialized */
  _Bool cond = f();
  int x = cond ? 2 : y; /* Is it legal to fold this to int x = 2; ? */
Hopefully, you'll agree that that is a reasonable optimization. Now consider this code:

  static int y; /* uninitialized */
  _Bool cond1 = f(), cond2 = g();
  int x1 = cond1 ? 2 : y; /* So int x1 = 2; */
  int x2 = cond2 ? 3 : y; /* So int x2 = 3; */
  if (!cond1 && !cond2) {
    assert(x1 == x2); /* Uh... */
  }
This is just scraping the tip of the surface of uninitialized values. Developing sane semantics around these sorts of values that also allow reasonable optimizations is challenging. You can look at the very lengthy saga that is undef, poison, and freeze in LLVM (still ongoing!) to see what it looks like in practice.

The third category is traps. Let's pick an easy example: what happens if you dereference a null pointer? Now let's consider the consequences of those semantics on some more code examples:

  int *x = NULL;
  int *y = &*x; /* Can this be lowered to int *y = x; ? */
  size_t offset = (size_t)&((struct foo *)NULL)->field; /* Remember, x->a is actually (*x).a */
  int *z = get_pointer();
  *z; /* No one uses the result of the load, can I delete it? */
  for (int i = 0; i < N; i++) {
    foo(*z); /* Can I hoist the load out of the loop? */
  }
Note that all of the optimizations I'm alluding to here are ones that would have existed all the way back in the 1980s when C was being standardized, and these are pretty basic, pedestrian optimizations that you will cover in Compilers 101.


> It's easy to pick on undefined behavior in C when you focus on the more gratuitous undefined behaviors such as signed overflow or oversized shifts.

I wouldn't even mind the signed integer overflow thing that much if there were a reasonable way in standard C to check whether a signed operation would overflow.

It's not impossible to do correctly in a compiler independent way, but ridiculously hard. And slow.
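
For addition, at least, the compiler-independent way looks roughly like the sketch below: test against the representable range first, so the overflowing add is never evaluated. The "slow" part is paying a compare-and-branch per operation, and multiplication is far messier; GCC and Clang's __builtin_add_overflow is the easy route, but that is exactly the non-portable option being lamented here.

    #include <limits.h>

    /* returns nonzero if a + b would overflow an int; never evaluates a + b */
    int add_would_overflow(int a, int b) {
        if (b > 0) return a > INT_MAX - b;
        if (b < 0) return a < INT_MIN - b;
        return 0;
    }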



Another thing is that the wording around some of the UB issues is just plain bad. The most extreme probably is the rules around strict aliasing. That there was, for quite a while, uncertainty about whether the rules allow type punning by reading a union member when the last write was to another member is a good example of not taking reality into account. Yes, memcpy exists - but it is even less type safe!
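
A minimal sketch of the two idioms in question (assuming a 32-bit float; the union read relies on the C99-and-later reading discussed in the reply below):

    #include <stdint.h>
    #include <string.h>

    uint32_t bits_via_union(float f) {
        union { float f; uint32_t u; } pun;
        pun.f = f;
        return pun.u;              /* reinterprets the stored bytes */
    }

    uint32_t bits_via_memcpy(float f) {
        uint32_t u;
        memcpy(&u, &f, sizeof u);  /* well-defined, and compilers usually elide
                                      the copy, but nothing checks the types */
        return u;
    }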


The union punning trick is UB in C89 and well-defined in C99 and later, although it was erroneously listed in the (non-normative) Annex listing UBs in C99 (removed by C11).

Strict aliasing is another category of UB that I'd consider gratuitous.


> The union punning trick is UB in C89 and well-defined in C99 and later, although it was erroneously listed in the (non-normative) Annex listing UBs in C99 (removed by C11).

Right, that's my point. If the standard folks can't understand their standard, how are mere mortals supposed to?

> Strict aliasing is another category of UB that I'd consider gratuitous.

I'm a bit of two minds on it. It can yield substantial speedups. But it is also impractical in a lot of cases. And it's often not strong enough anyway, requiring explicit restrict annotations for the compiler to understand that two pointers don't alias. Turns out two pointers of the same (or compatible) type aren't rare in performance-critical sections...

Realistically it should have been opt-in.


> Strict aliasing is another category of UB that I'd consider gratuitous.

Without it you cannot vectorize (or even internally re-order) many loops which are currently vectorizable because the compiler can't statically prove arguments won't alias otherwise.


That is completely false. Look up "restrict" and there are many other contexts. BTW, "prove" and "assume" are different things. This is an old argument, which in a just world would have been settled by Dennis Ritchie's comments.


I'm quite familiar with restrict, having added it to many codebases. It's a moderately fragile construct which in no way replaces the normal C memory model. (It's also significantly under-exploited by most compilers, presumably since very little code is restrict-annotated.)

Prove and assume are indeed different things, but what the compiler is able to do is _prove_ the validity of the transform within the context of the C abstract machine.

Plenty of sibling comments in this thread show concrete examples of significant optimizations lost with alias analysis (and/or non-wrapping signed integers) disabled.

If the compiler is not allowed to assume that the source code is valid then very few optimizations are safe at all. Clearly that isn't what you want, so I suppose you would argue that being able to vectorize something like 90% of all currently vectorizable loops in existing code isn't worth the cost, and that instead compilers should have to assume that anything may alias anything unless the programmer has manually restrict-annotated every variable that is written to in a loop?

I expect in that world you'd see lots of code run needlessly slower, and lots of other code needlessly get slathered with restrict annotations as restrict turns into a reflex, because leaving it out so regularly results in poor performance... resulting in incorrect annotations and exactly the miscompilation you were hoping to avoid -- arguably the worst of all worlds.


> It's easy to pick on undefined behavior in C when you focus on the more gratuitous undefined behaviors ...

That is the whole point. There are scores of instances of gratuitous UB.

> I'm not certain why these are undefined behavior instead of implementation-defined, but my suspicion is that these caused traps on some processors, and traps are inherently undefined behavior.

Traps are not inherently undefined. The C standard discusses floating point traps in detail. Many details may of course be left to the implementation or platform to describe, but that's very different from saying "all bets are off".

The real reason for the gratuitous UB was misunderstanding and ignorance.

> Hopefully, you'll agree that that is a reasonable optimization.

Hopefully, you'll agree that the code is buggy, and should be fixed. We end up running expensive and cumbersome static analysis tools that try to detect situations just like this ... which the compiler itself has detected, but has chosen not to warn us about.


The first code isn't necessarily buggy.


I am not a compiler dev or a high-skill coder, so my opinions might not matter much, but I'll still lay them out here.

> The first case is the compiler hint...

The compiler should refuse to compile if it comes upon such a case. Those hints are as much used by programmers as they are by the compiler. It should emit a warning and necessary checks + abort code if those hints can neither be proven nor disproven statically.

> The second case is uninitialized memory...

For the first example, unless the compiler knows that f() must always be true, it should give a compile-time error. If it does know that f() is always true, it should still emit a warning (or give an error anyway).

> The third category is traps...

I am honestly not sure about this one. It would depend on the behaviour defined for multiple references / dereferences done simultaneously and type casting. I would still probably expect the compiler to give out warnings at least.

Edit: language / grammar / typos


As a rule of thumb, yes, compilers should issue warnings where undefined behavior is obviously occurring. (And there is something to be said for compiling with -Werror). However, that's not going to always work, and there are two reasons for this.

The first reason is that undefined behavior is ultimately a statement about dynamic execution of the program. The set of expressions that could potentially cause undefined behavior is essentially all of them--every signed arithmetic expression, every pointer dereference, hell, almost all function calls in C++--and figuring out whether or not they actually do cause undefined behavior is effectively impossible at the compiler level. This is why sanitizers were developed, and also why sanitizers only work as a dynamic property.

For a concrete example, consider the following code:

  extern void do_something(int * restrict a, int * restrict b, size_t n);

  void my_function(int *y, int *z, size_t x, size_t n) {
    if (x > n)
      do_something(y, z, n);
  }
This code could produce undefined behavior. Or it could not. It depends on whether or not y and z overlap. Maybe the check of the if statement is sufficient to guarantee it. Maybe it's not. It's hard to advocate that the compiler should warn, let alone error, about this kind of code.

The second issue to be aware of is that there is a general separation of concerns between the part of the compiler that gives warnings and errors (the frontend), and the part that is actually optimizing the code. It is difficult, if not impossible, to give any kind of useful warning or error message in the guts of the optimizer; by the time code reaches that stage, it is often incredibly transformed from the original source code, to the point that its correlation with the original can be difficult to divine.

So I once came across some really weird code that broke an optimization pass I was working on. It looked roughly like this (approximate C translation of the actual IR):

  if (nullptr != nullptr) {
    int *x = nullptr;
    do {
      /* do some stuff with x */
      x++;
    } while (x != nullptr);
  }
What hideous code creates a loop that iterates a pointer through all of memory? Why, this (after reducing the test case):

  void foo() {
    std::vector<std::set<int>> x;
    x.emplace_back();
  }
So the crazy code was generated from the compiler very heavily inlining the entire details of the STL, and the original code was a more natural iteration from a start to an end value. The compiler figured out enough to realize that the start and end values were both null pointers, but didn't quite manage to actually fully elide the original loop in that case. Warning the user about the resulting undefined behavior in this case is completely counterproductive; it's not arising from anything they did, and there isn't much they can do to silence that warning.


"static int y; /* uninitialized / _Bool cond = f(); int x = cond ? 2 : y; / Is it legal to fold this to int x = 2; ? / Hopefully, you'll agree that that is a reasonable optimization. Now consider this code:"

Do not agree. That's not an optimization, it's just false reasoning. Reasonable would be to either ignore it, so the code depends on the uninitialized data, whatever it is, or to flag an error. Also, according to the standard, static variables of arithmetic type are initialized to zero by default, so there is no UB at all.

Second example has the same problem.

The third example makes assumptions that C does not let the compiler make, e.g. that z is a restrict-qualified pointer. Imagine that f(int c) { z = getsamepointer(); z += 1; return *z; }

And none of those optimizations existed in the 1980s.


I'll bite.

  > int x = cond ? 2 : y; /* Is it legal to fold this to int x = 2; ? */
This is permissible (on typical hardware) only if the implementation chooses to 'initialize' y to 2 (possibly skipping the actual write if it is later overwritten), since doing otherwise would result in:

  int x1 = cond1 ? 2 : y; /* So int x1 = 2; */
  int x2 = cond2 ? 3 : y; /* So int x2 = 3; */
No, no it may not do that.

Edit1: actually, as vyodaiken points out, that's only valid in the first place if y is auto, not static.

> Can this be lowered to int y = x;

Sure; the compiler is free to implement dereference in a non-trapping manner if possible.

> No one uses the result of the load, can I delete it?

Yes, it's not volatile.

> Can I hoist the load out of the loop?

I don't think so; foo might write to z.

> Note that all of the optimizations I'm alluding to here are ones that would have existed all the way back in the 1980s when C was being standardized, and these are pretty basic, pedestrian optimizations that you will cover in Compilers 101.

Yep, some of the proposed optimizations (the x1/x2 fold and the hoist) aren't actually available, but they all seem like reasonable things to consider.

Edit, missed these:

> A function marked _Noreturn returns.

If written in C, the body of the function must include a return statement, and is thus a compile time error. If written in assembly, this is no different than overwriting your return address via buffer overflow and ending up in the middle of a function. The compiler should probably stick a trap instruction after the call instruction, though.

> Or you alias two pointers marked restrict.

`restrict` (formerly noalias) is not valid C[0] (or any language), no matter what the standard says; the compiler must emit a compile time error.

0: https://www.lysator.liu.se/c/dmr-on-noalias.html


> This is permissible (on typical hardware)

Hardware doesn't define C semantics. C defines C semantics. I don't understand what your conditional example is trying to say, but you seem to assume that an uninitialized y is an actual uninitialized register or memory location. It's not.

>> A function marked _Noreturn returns.

> If written in C, the body of the function must include a return statement, and is thus a compile time error.

It might contain a return statement that is never reached because the function goes into an infinite loop or exits/aborts the program or does a longjmp.


> Hardware doesn't define C semantics. C defines C semantics.

C does not define C semantics for undefined behaviour; that's what makes it undefined behaviour.

> an uninitialized y is an actual uninitialized register or memory location.

On typical hardware, yes.


>> an uninitialized y is an actual uninitialized register or memory location.

> On typical hardware, yes.

Maybe you can help me identify the register or memory location for y in your example: https://gcc.godbolt.org/z/3b4z56Y87


Link is broken; says:

> Without Javascript the regular website is not functional. To go to the noscript version Compiler Explorer click here

"click here" goes to https://gcc.godbolt.org/noscript/z/3b4z56Y87, which has:

  int choose(int cond1, int cond2)
    {
    int y;
    int x1 = cond1 ? 2 : y; /* So int x1 = 2; */
    int x2 = cond2 ? 3 : y; /* So int x2 = 3; */
    return x1 + x2;
    }
but no actual assembly.

Based on context, I'll speculate that you have found a compiler bug, that the compiler writers will refuse to fix it, and that the variable y has been optimised out because it's unused after constant propagation, regardless of whether said compiler bug causes the constants being propagated to be incorrect.


So on the one hand you're (rightly) saying that this function has no semantics under the rules of the C language, but on the other hand you're saying that the semantics that C compilers assign to it is a bug, and on the third hand you are saying that the hardware should assign semantics to it even though this code is not written in assembly language, so it's not clear what the hardware should assign semantics to? Got it.


What version of C doesn't require `static int y;` to be initialized to 0?


You're right, it should be automatic storage duration instead of static storage duration.


Compilers have become more powerful (opening up new ways to exploit undefined behavior) and the primary C compilers are free software with corporate sponsors, not programmer customers (or else perhaps Andrew Pinski would not have been so blithe about ignoring his customer Felix-gcc in the GCC bug report cited above).

This is the real problem. We have reached a situation where a small number of compilers dominate the space, but yet do not charge the users and so do not treat them as customers. The C standard is a product of an era where you would pay for your tools and so would demand a refund from any compiler vendor that would treat undefined behavior in an absurd manner.


> The C standard is a product of an era where you would pay for your tools and so would demand a refund from any compiler vendor that would treat undefined behavior in an absurd manner.

Since current compilers aren’t out to do anything malicious with UB, but instead simply treat it as “assume this can’t happen and proceed accordingly”, it’s not clear at all to me what you think paid compilers would do here instead: refuse to compile vast swaths of code that currently compiles? Or compile it very pessimistically, forgoing any optimization opportunities by instead assuming it can happen and inserting a bunch of runtime checks in order to catch it then?

In either case, I doubt there’s any real market for “pay money for this compiler and it either won’t build your code, or it will run more slowly”. I’m just old enough to remember the paid C compiler market and the thing that was driving everybody to pay for the latest and greatest upgrade was “how much better is it at optimization than before?”


I see this argument a lot, but it's silly. I don't have to care whether gcc changed the settled semantics without notice or documentation out of sincere belief that they were helping or out of a desire to beat a benchmark that doesn't matter to me, or out of habit. It's not the intent of the compiler authors that matters, but their disregard of the needs of application programmers.


What are they to do, exactly?

Optimization is a very 'generic' process and application programmers want optimization. The only sensible thing to do is to assume UB cannot occur and optimize accordingly.

What else is there?

I can already predict that whatever you suggest will very shortly end up in Halting Problem territory or will mean: No optimization. There are a lot of UBs that (if defined) would require run-time checking to define. That wouldn't inhibit optimization per se, but it would ultimately mean slower execution.


C programmers prefer control to "optimization". And if you assume UB cannot occur, you should not generate code that makes it happen. Radical UB is not required for optimization: in fact it appears to mostly do nothing positive. There is not a single paper or study showing significantly better performance for substantial C code that depends on assuming UB can't happen - just a bunch of hand-waving.


I don't mean to be snide, but there's no paper on it because it's pretty much something you can learn in compiler 101. Without being able to assume that UB doesn't happen, useful optimizations become impossible very quickly.


You are being snide and inaccurate. Studies show 80% or more of GCC optimization improvements come from the core simple methods. Trying to compensate for the lack of data to support your argument by claiming (falsely) it's taught in an elementary course is weak.


Please link these studies, or describe your "core simple methods", because it is likely that they rely on programs not exhibiting undefined behavior. I mean, even the most simple optimizations like eliminating unused variables or inlining functions (what if you reach into the stack to detect these?) fall apart. These were taught within the first couple weeks of my compiler optimizations class, and I certainly hope that they were part of yours, because I can't imagine what you could have possibly gone over without starting off with this.


Wouldn't the largest customers (by far) still be the companies funding development of the compilers already?


Are you arguing that e.g. Turbo C and friends from the 80s were higher quality than modern C compilers?


I believe he’s arguing that they were more sane/pragmatic compilers — they would be inherently less comfortable exploiting UB to do anything other than what people expected or were used to, because there is more real possibility of retribution (GCC could get away with making demons fly out your nose and just lose marketshare [that it doesn’t directly depend on anyways] where a commercial compiler would go out of business)


> I believe he’s arguing that they were more sane/pragmatic compilers

I can’t imagine they have any actual experience with Borland or Symantec’s C or C++ compilers, then. These things had notoriously shaky standards compliance and their own loopy implementations of certain things, along with legions of bugs – it’s not hard to find older C++ libs with long conditional compilation workarounds for Borland’s brain damage. Microsoft’s C++ compiler was for years out of step with the standard in multiple ways.

Part of the reason GCC ate these compilers’ markets for lunch was the much more rigorous and reliable standards adherence, not just the lack of cost.

This reads like nostalgia for an age that never was.


I find this article to be so refreshing. The whole "theater of security" circus around the over-blown interpretations of this term is nothing less than breathtaking in some of its applications. Though not about c, the saga of how the author of the actix web server (written in rust) was literally driven from his own project by virtual pitchfork-wielding issue reporters and redditors was a true disgrace. He dared to explore the optimization space in ways they did not approve of. Happy ending: he recovered his composure and created a new, competing project, named ntex, on his own terms.


> license for the kinds of dramatic and unintuitive transformations we’ve seen from the compilers, and any indication that undefined behavior should be a vehicle for permitting optimizations.

Does anyone have an example of a time where Clang or GCC actually did something bad upon witnessing undefined behavior, rather than simply doing nothing, as the standard proposes? I ask because every time I've seen people get mad about UB it's always because they envisioned that the compiler would do something, but instead the compiler did nothing.


This may not necessarily count as an example in the wild, but the 2013 Underhanded C contest at http://www.underhanded-c.org/_page_id_25.html includes this example:

  h = abs(h) % HASHSIZE;
  // Extra sanity check
  if (h < 0 || h >= HASHSIZE)
    h = 0;
  return h;
where h=INT_MIN causes h to become negative and the sanity check is optimized out because abs(INT_MIN) is UB.


That's such a good example for teaching purposes, because it manages to combine the lack of understanding surrounding the remainder operator (i.e. unlike Python's, % in C is not a modulus operator) with the lack of understanding surrounding 0x80000000 (two's complement bane) into a single example. However it still makes my point that in this circumstance, the compiler's strategy still is to do nothing, because it can prove that the check could only be true under undefined behavior circumstances, so doing nothing means not compiling the check. I'm fine with that. Would anyone prefer that the compiler's internal definition of logic assume that absolute values can be negative? Must we throw out centuries of math because we've optimized integers to be 32 bits?

The only thing that's problematic is we need better tools to bring logic assumptions to our attention. Currently, UBSAN can only warn us about that when two's bane actually gets passed to the function at runtime. So the only way to spot faulty logic we failed to consider is to both enable UBSAN and be super methodical about unit testing.

Well, another thing I like to do is just read the assembly output. Constantly. Whenever I write something like a parser I've got a keyboard mapping that shows me the assembly output in Emacs with UBSAN enabled. If I turn off the noisy ones like pointer overflow, then I can avoid these issues altogether by writing my code so that no UBSAN assembly gets generated, since the compiler won't show that as a warning. You literally have to read the -S assembly output to get the compiler warnings that are actually meaningful.


I wish there were an abs() function that returned the unsigned version of the signed input type and wasn't UB if you passed in the most negative value of the signed type. Sadly neither C's nor C++'s nor Rust's standard library has a built-in implementation of this, and sending the most negative integer into Rust's .abs() will panic in debug mode (i.e. it is not intended to be reachable in properly implemented code).
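
Such a function is at least easy to write by hand; a hedged sketch: the conversion to unsigned is defined (reduction modulo UINT_MAX + 1) and unsigned negation is defined, so INT_MIN comes out as 2147483648 with no UB (assuming 32-bit int for that particular value).

    unsigned int uabs(int x) {
        unsigned int u = (unsigned int)x;
        return x < 0 ? 0u - u : u;
    }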


A fun historical example: https://feross.org/gcc-ownage/

But yes, generally, UB manifests as "the compiler is allowed to assume this doesn't happen" and the bad stuff is a consequence of doing things following those assumptions, not the compiler going "oops well lol UB time to screw with this person."


I've seen a real-world example something like this:

    int a[32] = {...};

    int flag = 1 << index;
    if (index < ARRAYSIZE) {
        a[index] = x;
        return flag;
    } else {
        return 0;
    }
The "1 << index" operation is undefined behavior (!) when index is greater than 32 (on a platform with 32-bit integers), even if the result is never used!

The compiler inferred that index must always be less than 32, which allowed it to optimize out the array bounds check, which turns the code into a write-anywhere gadget.

Note that if the standard had not declared "n << 32" to be all-bets-are-off UB, but instead had said something like, "it results in some implementation-specific value, or maybe traps" -- as a rational person would presume -- then this would not turn into a security problem.


> Note that if the standard had not declared "n << 32" to be all-bets-are-off UB, but instead had said something like, "it results in some implementation-specific value, or maybe traps" -- as a rational person would presume -- then this would not turn into a security problem.

But also note that a lot of existing code doing bitshift-and-index inside a hot loop that never went out of bounds would now get slower if it started having to run bounds checks it had previously elided in an optimization pass.

Let's not pretend that "it results in some implementation-specific value, or maybe traps" is a clear win with no downsides that Standards Authors and Compiler Engineers are ignoring out of some kind of malice – there are very real performance tradeoffs here, and a new version of the standard that makes a lot of existing real-world code slower isn't going to be a popular one with many people.


It isn't clear to me precisely what example you have in mind.

If you are saying that deleting array bounds checks might have performance benefits that outweigh the security concerns, then I disagree.

If you are saying that the compiler would have to insert bounds checks, I don't see how you arrive at that.

I have seen claims that gratuitous UB is important for enabling meaningful optimizations, but in every such case the examples did not hold up to scrutiny. In the end, the same optimization remains possible without the gratuitous UB, although it might involve a little more work on the part of the compiler engineer.

Regarding "malice": "Never attribute to malice..."


There are a half-dozen examples on this very thread and in linked bug reports, with detailed explanations by professional compiler writers.

If you think they don’t hold up to scrutiny, then you should get to work implementing these things, because you are likely a better compiler writer than most others in the world, including Chris Lattner of LLVM fame, who provides many examples here.

https://blog.llvm.org/2011/05/what-every-c-programmer-should...


> If you are saying that deleting array bounds checks might have performance benefits that outweigh the security concerns, then I disagree.

I'm saying that there is existing code in this world in which some variation on

    /* insanely hot loop where ARRAYSIZE > 32 */
    while (true) {
        ...
        int x = 1 << index;
        if (index < ARRAYSIZE) {
            a[index] = x;
        } else {
            a[index] = 0;
        }
        ...
    }
exists that's currently compiling down to just "a[index] = 1 << index", with everything working fine.

I'm saying that the authors and their customers are unlikely to be excited when your new compiler release stops assuming that index is < 32 (which it always was in practice) and their program gets slower because there are now extra tests in this part of the hot loop, which also consume more icache, evicting other important bits of the loop. "There's some work-around to win that performance back, given enough effort by the compiler author to give you some means to tell it what it had previously assumed" isn't likely to sell people on your patch, particularly if they'd have to make many such annotations. "They could just remove the tests if they know that index < 32" in this synthetic example, yes, but there are cases where this is less obvious but nonetheless true. And compiler updates that force you to go delete working code, work out un-obvious deductions the compiler had previously made, and re-validate just to regain the status quo still aren't going to make anybody happy.

The point, broadly: People care a lot about performance. These UB discussions in which people blithely assert that compilers "should" do XYZ conservative assumption while eliding any mention of the real-world performance impact the changes they want would have on existing code are, frankly, masturbatory.

Compiler engineers have to care when PostgreSQL and a dozen other critical programs get 4x slower because they stopped assuming that "1 << index" wouldn't happen with index >= 32, or that loop bounds won't overflow. Like all software engineering, decision making here has to be driven by balancing tradeoffs, not by insisting that one treatment of the spec is obviously "the best approach" while ignoring any inconvenient consequences that change would have vs the status quo.


(I'll assume int is 32 bits, the hardware generates zero on shl overflow, and we don't care if a[31] is negative, since "everything working fine" wouldn't be true otherwise.)

The compiler is perfectly capable of seeing 1<<index and reasoning: if index >= 32, then x is 0 (on this hardware), so the two branches of the conditional are the same, and we can just use the first one. On the other hand, if index < 32, then index < ARRAYSIZE (transitive property), the conditional always picks the first branch, and we can just use that one. By exhaustivity, we can just use the first branch: `int x = 1<<index; a[index] = x;`. Further optimization produces `a[index] = 1 << index`.

Note that that did not make any reference to undefined behaviour.
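
For concreteness, a minimal sketch of the rewrite that case analysis licenses, under the same assumptions stated above (32-bit int, hardware that yields 0 for oversized shift counts) -- hypothetical code, not from the thread:

    /* As written (ARRAYSIZE > 32): */
    int x = 1 << index;
    if (index < ARRAYSIZE) {
        a[index] = x;
    } else {
        a[index] = 0;
    }

    /* After the case analysis: index >= 32 makes both branches store 0
     * on this hypothetical hardware, and index < 32 always takes the
     * first branch, so the conditional collapses to: */
    a[index] = 1 << index;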


There is zero customer demand for less reliable code as a tradeoff for "performance".


Certainly not when C was designed, and arguably not in some cases today as well.


For some reason I couldn't 'Reply' to compiler-guy's reply directly, so I'll try here.

I'm familiar with the Chris Lattner article. Most of it (especially the second installment) shows bad outcomes from UB optimization. When it comes to signed integer overflow UB, I see two examples where performance is cited as a motivation.

One mentions unrolling, and gives an example similar to one elsewhere in this thread: https://news.ycombinator.com/item?id=27223870 In my reply to that I explain how unrolling is not actually enabled by UB.

The other instance of integer overflow UB in the Lattner article is optimizing X*2/2 to X. That's perhaps a stronger case, but I haven't seen any numbers on the real-world implications of this particular optimization.


> For some reason I couldn't 'Reply' to compiler-guy's reply directly, so I'll try here.

Hacker News has a timeout where if you try to reply to someone too quickly, it will hide the reply button. This timeout increases the deeper the comment tree gets.


I think the compiler that you were using is broken. One can't infer "index < 32" unless on a code path to a use of "flag", and that inference can't override the predicates that dominate that use.


No, the initializer already counts as a use of the undefined expression.


You're right. Thanks!


The types of surprises that I've seen have to do with inferences that the compiler draws: this program would be undefined if foo was negative, therefore foo is not negative, therefore I can optimize away this condition.
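
A minimal sketch of that kind of inference (hypothetical code; `foo` and `table` are just illustrative names):

    int table[16];

    int lookup(int foo) {
        int v = table[foo];   /* UB if foo is negative (or >= 16)...         */
        if (foo < 0)          /* ...so the compiler may infer foo >= 0 here  */
            return -1;        /* ...and delete this check as dead code       */
        return v;
    }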


What do you define as "bad"?

Arguably several.

I've seen cases of overflow checks that were implemented assuming signed overflow wraps (which all relevant platforms implement!) getting optimized away. Correct, given that "signed overflow is UB" and thus can be assumed not to happen. Problematic given how widespread such checks are, and given that there's no easy portable alternative.

Entire checks getting optimized away because of a presumed strict aliasing violation, IIRC between structures with a compatible layout. Pretty code, no. UB, yes. Reasonable, IDK.
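
A contrived sketch of the strict-aliasing case (not the original code; the struct names are made up):

    struct msg_a { int len; };
    struct msg_b { int len; };   /* identical layout, distinct type */

    int check(struct msg_a *p, struct msg_b *q) {
        p->len = 1;
        q->len = 2;              /* per strict aliasing, "cannot" refer to *p */
        return p->len == 1;      /* so this may be folded to: return 1;       */
    }

If p and q do in fact point at the same memory (UB under the effective-type rules), the check silently evaporates.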


Sounds like you're interested specifically in https://blog.tchatzigiannakis.com/undefined-behavior-can-lit...


True story -- someone (not me) decided to initialize a C++ virtual class by defining a default ctor which does a "memset(this, 0, sizeof(*this))", then uses placement new to re-create the vptr.

Newer compilers complained more and more until gcc 8, which just silently ignored this awful hack -- no diagnostic, no error, just crickets.

Strictly speaking silently ignoring this atrocity is allowed by the standard, but it sure took a while to figure out. So be careful with code that "just works" even if it shouldn't.


Perhaps you should file a bug to make them bring back the warning.


A related post with a similar claim: "How One Word Broke C" (https://news.quelsolaar.com/2020/03/16/how-one-word-broke-c/). HN comments at https://news.ycombinator.com/item?id=22589657


The main problem with undefined behavior is that the compiler does not produce any warning to that effect (in my case gcc). I've been caught by bugs that were truly puzzling until I understood undefined behavior, and the license it gives to the makers of compilers. I think undefined behavior is giving C a reputation worse than it deserves.


To me the real problem is that compilers these days do too much. Nobody asked for these optimizations; if you want optimized code, use C++, or Rust, or whatever other "modern" language that has higher-level constructs and thus can optimize the code better.

The reason to use C is to have a "high-level assembler", and there is no reason to use C if I can't rely on the output of the compiler, and if the compiler eliminates some code that I write. We all know what happens when a signed integer overflows, we all know that most architectures are little-endian, etc.

Even if something is "undefined behaviour" for the standard, it usually has some well defined behaviour on the platform that you are targeting.

Unfortunately, for these reasons gcc is dangerous to use. It's not a problem for desktop applications (if one crashes, who cares), but I'm talking about critical systems. It would be unacceptable if a machine ended up killing someone because at gcc they thought they could remove a check that they believed was useless. And so you have to spend thousands of dollars just to get reliable compilers that don't do silly optimizations.

I think that they should make a sort of "gcc lite" that does exactly what (at least to me) a C compiler has to do: translate the code from C to assembly, without changing the semantics of the code. If there is undefined behaviour, that behaviour will be translated into the assembly code and the processor will be left to deal with it.

Also, we talk about optimizations that could be made by exploiting undefined behaviour, fine, but the programmer also usually optimizes by hand by exploiting those same undefined behaviours.


It truly boggles my mind that this discussion (linked in TFA) played out the way it did. In my mind it's 100% not okay to (by default) optimize away a check for integer overflow. I've never really written in C (or any unsafe language) before so I had little context for the types of traps C sets for developers. Based on the responses of the person who presumably implemented the optimization, it comes as no surprise that C is so dangerous. After reading this I hope I never have to use gcc.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475


> In my mind it's 100% not okay to (by default) optimize away a check for integer overflow.

But a + 100 > a is not a check for overflow. If a > INT_MAX - 100, then a + 100 is an overflow. The "will this operation overflow" check would be a > INT_MAX - 100, and GCC would not optimize that away.
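
A sketch of the two spellings, assuming a 32-bit int and <limits.h>:

    #include <limits.h>

    /* Intended as "did a + 100 wrap?" -- but the comparison can only be
     * true if UB already happened, so the compiler may fold it to 0. */
    int overflow_check_ub(int a) {
        return a + 100 < a;
    }

    /* The well-defined precondition check: would a + 100 overflow? */
    int overflow_check_ok(int a) {
        return a > INT_MAX - 100;
    }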


What you've written doesn't demonstrate the issue outlined in the big report. The issue is that real-world code defends against certain hostile inputs like so, and that the optimization breaks those defenses:

    int a,b,c;
    a=INT_MAX;     /* statement ONE */
    b=a+2;         /* statement TWO */
    c=(b>a);       /* statement THREE */
Whether or not you think this is technically permissible by a sufficiently self-serving (from the perspective of a person whose ONLY goal is speed optimization) reading of spec is irrelevant. Any reading of the spec should err on the side of protection against real-world consequences.

I don't want lawyering and technically correct-ing, I want pragmatism.


If you want pragmatism, check whether something is legal before you do it.


This is addressed quite thoroughly in the thread that I linked. It is generally impossible to determine whether your code has a bug, and you cannot possibly be responsible for every bit of code that your code may happen to touch, but even if you could, mistakes and bad practices will happen. Practically speaking, it does not matter if you can find a way to displace blame onto the writer of the code. Placing blame is not how one minimizes harm. Maybe we have different definitions of pragmatism in our heads.

I would also say that leaving such a common occurrence as integer overflow as undefined behavior, rather than treating it as what it practically always is (platform- or implementation-specific behavior), doesn't make much sense. I mean, it does always have a platform-specific implementation, right? Does that not exclude it from the definition of undefined behavior? Genuinely curious.

The downvotes on my previous comments tell me this sort of thinking is endemic. I suddenly have more insight into the concern people have about software development as a field.


I think the best point in this article is that C, a language invented to write OS code, is now difficult to use for that purpose due to the current handling of undefined behavior. If you write low level OS code, you are keenly aware of the behavior of the machine you target. There is no point trying to define a language like C in such a way that the only valid programs are those that would run identically on any hypothetically allowed hardware. For instance, on x64, there are no alignment restrictions, so why should I be punished for making use of that fact? And what about all this C code with SSE intrinsics? Should we ban that too?
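
To illustrate the tension, a sketch (function names are just illustrative): the direct cast below is a misaligned access (and usually an aliasing violation) that x86-64 hardware handles fine but the standard calls UB, while the memcpy form is fully defined -- and mainstream compilers typically lower both to a single load anyway.

    #include <stdint.h>
    #include <string.h>

    /* UB per the standard, though x86-64 doesn't care about alignment. */
    uint32_t load_u32_cast(const unsigned char *p) {
        return *(const uint32_t *)p;
    }

    /* Fully defined; usually compiles to the same single load. */
    uint32_t load_u32_memcpy(const unsigned char *p) {
        uint32_t v;
        memcpy(&v, p, sizeof v);
        return v;
    }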


Sort of a second amendment for footguns.


https://news.ycombinator.com/item?id=17454467 made very much the same argument.



That only got 1 comment.


Just like this one, when I posted mine here


Linking a failed discussion from another (possibly) failed discussion seems kind of pointless.


Just like this thread.


I assumed your labelling it a dupe and giving a link meant there was a more interesting discussion on that page. Nope. So I left my comment to warn others not to make the same mistake I did. You didn't seem to understand, so someone tried explaining, then you left another obtuse, mean-spirited comment.


There is such flags as -fwrapv which I use in programs that I think may need it.

Maybe there should be another flag which defines or partially defines many more things. For example, shift amounts out of range might produce any numerical answer (without other side effects), or it might trap once that operation is reached and not give any answer at all, but not demons in your nose or anything like that; either way, the compiler might or might not display a warning message. This is similar to what is mentioned in the linked document.

Another alternative flag might be used to guarantee terminating with an error message in such a case, making the program execute more slowly (because the compiler will insert such a check), for testing purposes perhaps. The document linked mentions such a case, but I think that it should not be the default case.
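
Something close to that already exists as an opt-in sanitizer; a sketch assuming GCC or Clang (the flag names are theirs, not from the linked document):

    /* shift.c -- build with:  cc -O1 -fsanitize=undefined shift.c
     * UBSan reports the out-of-range shift at runtime instead of letting
     * the optimizer assume it cannot happen; adding
     * -fsanitize-undefined-trap-on-error turns the report into a trap. */
    #include <stdio.h>

    static int shift(int n, int amount) {
        return n << amount;     /* diagnosed if amount >= bit width of int */
    }

    int main(int argc, char **argv) {
        (void)argv;
        printf("%d\n", shift(1, argc + 39));   /* amount = 40 with no args */
        return 0;
    }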

If you have:

  int x[3];
  int*y=x+1;
  int*z=x-1;
Then, what I think is how it should work, should be:

- The program itself is valid.

- The pointers y and z are unequal; if you write y!=z then it is guaranteed to be true.

- Reading or writing through z should be considered as undefined behaviour and not allowed. The compiler can optimize based on the assumption that it doesn't happen, unless that optimization is disabled, or if the program is changed to specify it as volatile (in which case it remains undefined behaviour, but the compiler can no longer assume this when optimizing).

- Since z is out of range, comparing if it is less or greater than y is not predictable.

- Since z is out of range, there is no guarantee that it is unequal to any other pointer that isn't based on x, including null; it might or might not be null.

- If you write z+=2 then z is now in range and z==y is true and you can read/write through z now.

If you write:

  #include <stdio.h>

  static int f(void) {
    int a;
    return a;
  }

  int main(int argc,char**argv) {
    int x=42;
    x=f();
    printf("%d\n",x);
    printf("%d\n",x);
    printf("%d\n",x);
    return 0;
  }
Then it should be: the return value of f() will be some value for each call (not necessarily the same every time), and the compiler can optimize accordingly. Since the function does nothing, and the return value is not defined, it is allowed to optimize out the reassignment to x and always print 42, or it can set x to some other value, which is unpredictable and not necessarily the same every time, but it cannot print three different numbers, trap, or do other stuff.

If standard library optimizations are enabled (which they might be by default, depending on the compiler), it may optimize out the entire program and replace it with printing a constant string, as long as that string is a number and line feed, repeated three times with the same number, which must be in range of the int type. It may also compile the program as written and, across multiple executions, print different numbers at different times, but always three of the same number. Printing nothing is not allowed, since the %d format specifier can never result in nothing; it always produces at least one digit.


Lots of us use languages that have no/almost no undefined behavior; it is really weird to see people trying to come up with reasons why this is hard. Perhaps there are still computing environments so slow that undefined behavior still makes sense in C, and C should remain that way. If so, use something with less of this for anything else.


If C is just a portable assembler, then what if the assembly itself has undefined behaviour? :)


This exists, but the effect of undefined behavior in CPU architectures is a little bit more forgiving than the interpretation of UB in C to mean "literally the entire program has no meaning". Instead, usually the program will execute correctly up to the invalid instruction, and then something happens, and then the CPU will continue executing from that state. It's actually fairly difficult to build an instruction with undefined behavior that contaminates unrelated parts of the program.

Though it HAS happened: notably, brucedawson explains here [1] that the Xbox 360 has an instruction so badly thought out that merely having it in an executable page is enough to make your program otherwise meaningless due to speculative execution.

[1] https://randomascii.wordpress.com/2018/01/07/finding-a-cpu-d...


> This exists, but the effect of undefined behavior in CPU architectures is a little bit more forgiving than the interpretation of UB in C to mean "literally the entire program has no meaning".

That is not quite what the interpretation of UB in C is, AFAIK. UB in C is generally interpreted as meaning that any path which would trigger UB is invalid: if execution will be invalid once the UB is reached, and we know for sure we're going to hit UB, then we can say we're already in UB at the instruction before it.

Whole-program invalidity can occur when the compiler manages to infer that no execution path is UB-free, in which case yes, the program is meaningless. More generally, programs will go off the rails as far ahead of the UB as the compiler managed to determine that the UB would unfailingly be reached.

And it's usually because the compiler works backwards: if a path would trigger UB, that path cannot legally be taken, therefore it can be deleted. That's why e.g. `if (a > a + 1)` gets deleted: that expression makes sense if you assume signed integers can overflow, but the compiler assumes signed integers can't overflow, therefore this expression can never be true, therefore it's dead code.
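
A small sketch of that backwards reasoning (hypothetical code; `saturate` and `process` stand in for whatever the branch did):

    void saturate(void);
    void process(int a);

    void handle(int a) {
        if (a > a + 1) {      /* only true if a + 1 overflowed, which is UB... */
            saturate();       /* ...so this path "cannot" be taken and the     */
            return;           /* whole branch is deleted as dead code          */
        }
        process(a);
    }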

This is important, because many such UBs get generated from macro expansion and optimisations (mainly inlining), so the assumption of UB-impossibility (and thus dead code) enables not just specific optimisations but a fair amount of DCE, which reduces function size, which triggers further inlining, and thus the optimisations build upon one another.


The situation is different, because a CPU is by definition an interpreter. It doesn't perform code transformations, at least not at as high a level as a compiler does. The CPU only looks at the next few instructions and performs them. A compiler, however, is responsible for taking a large compilation unit and producing a transformation that is efficient. That process requires reasoning about what is invalid code and operating on that.


Wow! Interesting to see hints that meltdown exists years before it was officially published.


Skimmed the article and didn't see a reference to it; you may be interested to know that our good friends and protectors at the NSA may have stumbled onto Meltdown-like issues in the mid-90s:

https://en.wikipedia.org/wiki/Meltdown_(security_vulnerabili...


I see the NSA strategy for 'securing' the nation against technology threats in their 'unique' way was going strong back in 1995.


They "secure" the country by exploiting vulnerabilities and leaving everyone else in the dark. They see the world as just a game between them and other foreign surveillance institutions.


There actually is a fair amount of truly undefined behavior for CPUs, but it's always at system/kernel mode rather than userspace for security reasons. You can search an ARM ISA for "UNPREDICTABLE" to see examples.


I seem to recall that Intel and AMD CPUs will behave in strange and unusual ways, particularly when it comes to things like bitshift op flags, if you shift by out of range values, or by 0 or 1. So I guess undefined behaviors in C are somewhat consistent with CPUs. But as other people mentioned Intel is much more forgiving than X3J11. If you ever wanted to find all the dirty corner cases that exist between ANSI and hardware, I swear, try writing C functions that emulate the hardware, and then fuzz the two lockstep style. It's harder than you'd think. [don't click here if you intend to try that: https://github.com/jart/cosmopolitan/blob/master/tool/build/...]


It does: reading uninitialized memory, simultaneous writes from multiple threads, using memory below the stack pointer with interrupts enabled, ...

Some of C's UB is due to this, some of it is due to the compiler.


It's not, though, is it?


I dislike how UB is used where unspecified would work.

For instance, overflowing arithmetic is well specified on every architecture; the behaviour may be different on, say, SPARC vs. an HC12 vs. x86, but on any one of those architectures it will always be the same. Yet compiler devs have instead decided that it is undefined and so can be treated however they like.

There are so many of these UBs that could be unspecified that it's absurd. UB should be reserved for cases where the outcome is not consistent: UaF, invalid pointers, out-of-range accesses, etc.


The C++ committee recently (for the C++20 standard) voted against making signed overflow defined, opting to keep it undefined. Primarily for performance reasons (and because it would usually be a bug anyway).

www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r4.html#r0r1

You're of course free to disagree with the reasoning. But you'd probably agree that a new standard revision that forces the average application to become, say, 5% slower, just so that broken programs would be just as broken, would not be very well-received.


I’m not willing to accept a “say” for performance here.

I’d also want the performance impact measured on a non-benchmark, perf-sensitive code base (browser, database, etc., rather than SPEC).

The problem with the claim that the code is incorrect is that it is only incorrect due to the imo bogus interpretation of what should be UB.

If I write

    if (a + b < a)
On any modern architecture, that has a well-defined and understood meaning. But compilers have repeatedly overridden a well-understood behavior in the name of perf benefits that only turn up in SPEC-style benchmarks.
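
For what it's worth, the spelling today's compilers do honour is the explicit one; `__builtin_add_overflow` is a GCC/Clang extension, shown here as a sketch:

    /* Computes a + b into *sum and reports overflow without ever
     * performing a wrapping (UB) addition; returns nonzero on overflow. */
    int add_checked(int a, int b, int *sum) {
        return __builtin_add_overflow(a, b, sum);
    }

Whether it's reasonable to expect existing code bases to migrate to that spelling is, of course, exactly what this sub-thread is arguing about.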


I think the problem is cultural in the C community. C programmers have Stockholm Syndrome around UB optimizations.

As TFA notes, "There is No Reliable Way to Determine if a Large Codebase Contains Undefined Behavior" https://blog.llvm.org/2011/05/what-every-c-programmer-should...

That's because UB is a bug that occurs as your program runs (e.g. dereferencing a null pointer). You'd have to prove that your program is free of bugs to prove that you won't encounter UB at runtime, which is not possible in general.

But C programmers believe they deserve to be abused this way.

"Sure, the C compiler is allowed to nuke your entire program whenever there's a bug at runtime, but all you have to do is prove that your program is bug free! Why, any real C programmer can write bug-free code, so if you have any UB bugs, you're probably not good enough to write C. Sure, I've been bruised by UB bugs in the past, we all have, but when I made those mistakes, I deserved to be bruised, and those bruises made me a better programmer."


I expect there's already been a lot of evaporative cooling in the C community and there will only be more over time. Increasingly the people who are going to be left in the C community are precisely those people who think C is generally OK. Anyone who has developed a high degree of fear of C is ever-increasingly doing what it takes to move away.

I have tried to phrase this neutrally, as to not be a comment about whether or not C "really is" generally OK. I am not sure I have 100% succeeded, as I have my opinions. But the point stands regardless; the people in the C community are generally going to be the ones who are OK with C.


> But C programmers believe they deserve to be abused this way.

I don't know of any C programmers who think that way. We accept the state of affairs because there is little alternative. We're not going to rewrite our code in another language, because we mostly need to write in C for various reasons. Also Rust inherits LLVM's undefined behaviour, so it's no panacea. We mostly keep adding flags to turn off the most excessive optimisations and pray.


Rust inheriting it is the equivalent of a codegen bug, and they get patched. The cause here isn’t LLVM, it’s language semantics.


Anecdotally, I find about as many bugs in my Java code at work as I do in my hobby C code, including "dereferencing null pointers" aka NullPointerExceptions.


"Undefined behavior" is a term of art relevant to C, meaning that the standard no longer has any comment about what happens. Thus the comments about launching nukes, or destroying your machine, etc., being standards complaint, even though obvious real compilers won't actually emit code that does that.

Dereferencing a nil pointer in Java is not undefined behavior. It is defined; it throws a NullPointerException, and "it throws a NullPointerException" is a very specific term of art in the Java community that is very precisely defined.

Many languages don't have an explicit concept of "undefined behavior" in their standard (though, of course, due to deficiencies in the standard they may have implicitly undefined behavior), and of those that do I doubt anyone matches C or C++ in the commonly-used category of languages.


In single-threaded programs, Java will neatly do what you expected; it lacks this excitement from C. When you try to dereference a null pointer, iterate past the end of an array, or whatever in Java, you'll just get some sort of Throwable...

But in concurrent programs things get exciting again. Java would like to be somewhat fast on actual hardware, and the actual hardware behaviour of memory is exciting, so, Java is also exciting, depending on how your hardware works.

You won't get C's full-blown "Undefined behaviour", but you can get some pretty astonishing outcomes, for example time travel can happen. If one thread changes A, then B, then C, it is permissible in Java that from the perspective of another concurrent thread the value A is unchanged, after B and/or C have changed.

It turns out humans are bad at coping with this even though it's much less scary than Undefined Behaviour, and so some more modern languages than Java offer ways to get memory behaviour that doesn't hurt your brain as much in exchange for reduced performance.


Point taken about NPEs being defined behavior, but from a practical point of view, a bug is a bug (and bugs is what the parent comment was referring to).

Whether the bug was due to a well-defined or undefined behavior seems like an exercise in assigning blame to another entity (the language, the committee, the compiler, etc).


Correct assignment of (technical) blame is an important engineering task, though. You sound to me as though you think that's somehow the wrong thing to do, but I would disagree. Identifying whether the bug is a defined (in C) behavior, or the result of the compiler making a certain choice because your code invoked undefined behavior, is an important element of fixing the bug; if you don't know which is which you're operating under a critical handicap.


NPE isn't really a relevant comparison point - it raises an exception of sorts in both languages. Accessing an array past its end, accessing freed memory, reading uninitialized memory seem more apt.


These are all issues a compiler could insert checks for, saving you the time to do it manually, while trading performance for security or correctness.

C doesn't trade performance away, so if you'd like to pay the price of these checks, it's on you, the programmer, to add them in. C programmers have to put in extra effort to make the executables safer and output correct results.

Other languages trade off performance for safety and/or correctness. The programmers using them have to put in extra effort to make the executables run as fast as they do in C.

Ultimately, programmers tend to make rational choices, and the Stockholm syndrome mentioned above is really just C programmers dealing with the consequences of the trade-off they made by using C.


C is designed to allow programmers to insert or remove checks as performance tuning. Compiler UB "optimizations" that remove that ability from the programmer make the language unusable.


Not knowing what is UB in C is equivalent to not knowing a core part of the C language. In that situation, yes, C would seem unusable for people who have superficial knowledge of it.


Nobody knows UB rules. The WG14 discussions are full of confusion.


Send Linus Torvalds a note and tell him he has only superficial knowledge of C.


The situation has gotten at least somewhat better since then. UBSan and friends are not a guarantee, but they make it much more realistic to find UB issues.


Good point. C programmers have come to assume that C is a shitty language with terrible semantics and it's their fault for not using Rust or something. They blame the language for the mess made by standards/compilers.


I don't really like Rust; I like C. But it could be improved: make one with less undefined behaviour, better macros, and less confusing syntax for types, etc. Many of the new programming languages that try to avoid the bad things from C are, I think, avoiding most of the good things from C, too.


You don't get it: if you are ready to give up top performance, you can choose among a multitude of more forgiving languages.

There is simply no point in having a C language with less than top level optimizations.


Name a feature of C that makes it inherently "faster" than another compiles-to-machine-code language. I can name plenty that make it slower.



