
Yes, apparently the first two examples in the article weren't obviously undefined enough for the authors of CIL who wrote the list.


Does the first example contain undefined behaviour? (As written it does, because x is not initialised, but the text suggests that a value is actually being provided for x.)

I think the second example may contain UB on a <=32-bit architecture (right shift by a value greater than or equal to the width of the type), or at least this is UB in C++. On a 64-bit architecture it would be fine (but the result would not be 0).


This list contains several invalid items mixed with the good ones. It starts:

     Why does the following code return 0 for most values of x? (This should be easy.)

      int x;
      return x == (1 && x);
The answer is that the code can return what it wants or make demons fly out of your nose, because using the automatic variable x without initializing it is essentially UB (plainly UB from the words of C89, unarguably UB in C11 because the address of x is not taken, and debatable in C99, but only because of poor choices of words). But I don't think this is the answer that the authors are thinking of.

“It is UB” also applies to “(1 - sizeof(int)) >> 32”, the next question, on ILP32 architectures that were still prevalent when this page was written (shifting an integer type of width 32 by 32), regardless of the discussion the authors want to have about the type of sizeof(t).
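To make the type reasoning concrete, here is a minimal snippet (my own, not from the page); on an LP64 target it prints 4294967295, and on an ILP32 target the shift itself is undefined:

    #include <stdio.h>

    int main(void)
    {
        /* sizeof(int) has type size_t, so 1 - sizeof(int) is a size_t
           with a huge value, not -3. If size_t is 32 bits wide (ILP32),
           shifting it by 32 is UB (C11 6.5.7p3); if size_t is 64 bits
           wide, the shift is defined but the result is 0xFFFFFFFF. */
        printf("%zu\n", (1 - sizeof(int)) >> 32);
        return 0;
    }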

“Functions and function pointers are implicitly converted to each other” is one way to describe what the C standard actually says, but that makes it look more complicated than it is. In reality, functions decay to pointers to function in the same way that arrays decay to pointers-to-first-element, and if you are familiar with the latter, it's a good way to understand the former. Only function pointers can be applied. When you write “f(x)”, f first decays to pointer to function, and then is applied. The reason you don't need to dereference a pointer-to-function p when you apply it as “p(x)” is NOT that p will be converted implicitly to a function, but that function application expects a pointer to function.
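A minimal sketch of the decay rule at work (names are mine):

    #include <stdio.h>

    static void greet(int n) { printf("hello %d\n", n); }

    int main(void)
    {
        /* greet decays to a pointer to function, the same way an
           array name decays to a pointer to its first element. */
        void (*p)(int) = greet;

        greet(1);  /* greet decays to a pointer, then the pointer is applied */
        p(2);      /* no dereference needed: application takes a pointer */
        (*p)(3);   /* *p immediately decays back to a pointer, so this works too */
        return 0;
    }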

The first example in 16.3 is also Undefined Behavior, regardless of the target architecture, because the type of “3” is always “int”, so it's a poor illustration of the VC compiler bug they are referring to.


I came here to say this-- for many of the questions, the answer is Undefined Behaviour.

The very 1st one-- I haven't done embedded C for a long time, but the 1st thing I was taught was *not* to assume uninitialised variables would be set to 0. The author probably tested on a known safe (read: lab-like) system.

>> can return what it wants or make demons fly out of your nose,

Indeed, this is the correct answer.

Most of the questions on the page seem to be "Let's do weird crap highly dependent on the architecture/compiler, using undefined behaviour, and LOL, we can then blame C"

And I'm not defending C here at all; I moved away from it years ago. But there are better criticisms than this.


It explicitly says "for most values of x", i.e. it assumes x to be initialized (but doesn't show so in the code).


Not a good way to give the example, then. Most C or C++ developers are going to start twitching uncontrollably the second they see a variable declared uninitialized and then immediately read from.


They're just doing a poor job of saying: assuming x is declared as an int somewhere else, why would this be true?


The sentence you quote is near these words:

“[The C++ standardization committee] WG21 has recently adopted the changes promoted in their document p1236. Generally, C++ goes much beyond what is presented here:”

I would be extremely surprised if the proposal to make signed arithmetic overflow defined behavior in C made it into C23. The window is narrowing and this would be a very big change to the language. Making it official that 2's complement is the only representation for signed integers is already a large change.

Later in the decade, maybe.


You need to look at the disassembly of the generated binary to make sense of this sort of performance variation (paying attention to cache line boundaries for code and data), and even so, it is highly non-trivial. The performance counters found in modern processors sometimes help (https://en.wikipedia.org/wiki/Hardware_performance_counter ).

https://www.agner.org/optimize/microarchitecture.pdf contains the sort of information you need to have absorbed before you even start investigating. In most cases, it's not worth acquiring the expertise for 5% one way or the other in micro-benchmarks. If you care about these 5%, you shouldn't be programming in C in the first place.

And then there is this anecdote:

My job is to make tools to detect subtle undefined behaviors in C programs. I once had the opportunity to report a signed arithmetic overflow in a library that its authors considered, rightly or wrongly, to be performance-critical. My suggestion was:

… this is not one of the subtle undefined behaviors that we are the only ones to detect; UBSan would also have told you that the library was doing something wrong with “x + y” where x and y are ints. The good news is that you can write “(int)((unsigned)x + y)”: this is defined, and it behaves exactly like you expected “x + y” to behave (but had no right to).

And the answer was “Ah, no, sorry, we can't apply this change, I ran the benchmarks and the library was 2% slower with it. It's a no, I'm afraid”.

The thing is, I am pretty sure that any modern optimizing C compiler (the interlocutor was using Clang) has been generating the exact same binary code for the two constructs for years (unless it applies an optimization that relies on the addition not overflowing in the “x + y” case, but then the authors would have noticed). I would bet a house that the binary that was 2% slower in benchmarks was byte-identical to the reference one.
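For the record, a minimal sketch of the suggested rewrite (the helper name is mine):

    /* Signed addition that wraps instead of being UB on overflow.
       The unsigned addition is fully defined; converting the
       out-of-range result back to int is implementation-defined
       rather than undefined, and wraps on all common compilers.
       Modern optimizing compilers typically emit the same single
       add instruction for this as for a plain x + y. */
    static int wrapping_add(int x, int y)
    {
        return (int)((unsigned)x + (unsigned)y);
    }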


If I may ask, what was the use case for this code that they cared so much about a 2% difference in benchmarks? Aerospace? Game engine? Packet routing?


I wouldn't expect aerospace, since I have been told embedded programmers in that field routinely disable compiler optimization, on the chance that a compiler bug or overzealous UB exploitation might introduce a bug into previously working code. Hard realtime requirements demand fast code, but not necessarily efficient code.


I am guessing your tool was source-based to even detect this, let alone to know that the code change would have produced identical code.


Performance counters are vital, and you don't need to grovel through the disassembly yourself in association with profiling, even if that's feasible. Get a tool to do it for you; MAQAO is one in that area (x86-specific, and unfortunately part-proprietary).

Anyway, yes, measurements are what you need, rather than guesses, along with correctness checks, indeed.


I've had this exact situation happen to me as well :/ It's frustrating.


Could you clarify which clause of the C standard you are referring to when you say “due to aliasing, not due to alignment”?

I make sense of the C standard for a living (this is literally my day job) and I do not see which clause you mean. It would be very useful to me to know, and I would be eternally thankful.



So the clause about assignment? Thanks, that was helpful.


I have heard this reaction to this article a lot, but sorry, there is nothing in the C standard that says that objects should not overlap, except a rule that only applies to “lvalue = lvalue;” assignments and is not relevant here.

Plus, on some 32-bit ISAs, a long long and a double only need to be aligned to 32-bit boundaries, so I note that in the made-up C rules you are referring to, “basic type” is not very well defined.

> I believe this behavior is actually specified in the standard, in the same section that defines the aliasing rules.

The strict aliasing rules are here: https://port70.net/~nsz/c/c11/n1570.html#6.5p7

Go ahead and point to the rule that says that “basic types” cannot overlap with themselves.


I genuinely can't follow you completely, but half suspect you're violently agreeing with me. Are you saying the optimization in the linked article is, or is not, in violation of the standard?

Edit: this is the text I was remembering, from 6.5.16.1 ("Simple Assignment"): "If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined.".

That matches pretty much exactly what I was saying: compilers are free to assume that basic types don't overlap, because if they do, then any generated code will be undefined behavior anyway.


6.5.16.1 is the “rule that only applies to “lvalue = lvalue;” assignments and is not relevant here”.

It does not apply to “lvalue = 1;” or to “lvalue = 2;”, which are the two relevant assignments in the example in the article.

For context, I think I made it clear in the article that the program being discussed is UB, and therefore that the compiler is not to blame. But since I wrote this article, I have had people telling me “The complaint isn't about alignment at all, it's that the optimizer assumes that two pointers to the same basic type cannot overlap in memory”.

My reply to this specific sentence is:

No. You are wrong. There are no words in the standard that say that “basic types cannot overlap in memory”. There is not even a notion of “basic type”. There are clauses about pointer alignment, which are explicitly cited in the article, and there are clauses about strict aliasing, which the article shows (by using -fno-strict-aliasing) are not the reason for GCC optimizing the program. There are no rules about “basic types not overlapping” in the C standard. You only think there are. Or please cite them. (6.5.16.1 is a rule about assignment; it only applies to the code pattern lvalue1 = lvalue2;)


I'm sorry, can you explain how that's not relevant here? You're being incongruously combative, but I still think you're mostly agreeing with me.

The section on "simple" assignments doesn't say that the rvalue must be an lvalue expression syntactically. I think it applies very well to "*p = 1;", which is the statement in the linked code. What am I missing?

> There are no words in the standard that say that “basic types cannot overlap in memory”.

I don't believe I said there were. I said the standard expressly allowed the optimization in the linked article. And as far as I can see, absent a clearer explanation for why that section doesn't apply, it does.


1 is not an object, but even if it were one, it would not be an object that overlaps with “*p”.

You are interpreting the C standard as if it were a philosophy text. It contains a rule that says that in very precise circumstances (for an assignment from one to the other) objects must not partially overlap, and you are claiming that it means that “two pointers to the same basic type cannot overlap in memory”. The clause does not say that, sorry. The clause applies to the objects that are on one side and the other of an assignment.

> I said the standard expressly allowed the optimization in the linked article.

I hope that the article makes it clear that the standard expressly allows the optimization. Specific, explicit rules, cited in the article, about pointer alignment, allow the optimization.

For this reason, I, “violently” as you say, disagree with the sentence “The complaint isn't about alignment at all, it's that the optimizer assumes that two pointers to the same basic type cannot overlap in memory”. This sentence gets it all wrong. It is about alignment; it is not about “pointers to basic types”, whatever that is, not being allowed to overlap in memory; they are allowed to overlap for large enough “basic types” because it is about alignment, not overlap:

https://gcc.godbolt.org/z/ZAMkeH

You could argue that GCC 9.3 only missed the optimization in the example in this Compiler Explorer link for some other reason, and that the absence of optimization doesn't prove that GCC thinks p and q can overlap. This would be correct; this aspect is one of the difficulties in studying the rules that these compilers implement. However, what I am saying is that if you reported this missed optimization to GCC developers, they would tell you that GCC can't optimize the function f because p and q can overlap. There is no clause in the C standard that prevents them from overlapping (apart from the strict aliasing rules, but I used the option to tell the compiler I didn't want it to take advantage of those).

(Please do not bother them with this, or if you do, at least leave me out of it; I have nothing better to do than to write this because it's the weekend, but they have better things to do.)


You're being unnecessarily combative.

> No. You are wrong. There are no words in the standard that say that “basic types cannot overlap in memory”.

§ J.2, Undefined Behavior

An object is assigned to an inexactly overlapping object or to an exactly overlapping object with incompatible type (6.5.16.1).

> There is not even a notion of “basic type”.

"Object."

You repeatedly (in this thread, and on your blog) express that you don't really understand "strict" (ISO standard) aliasing rules, and that seems to be the case.


This line of J.2 only refers to the already cited 6.5.16.1.

You keep quoting this clause as if it applied to any of the assignments in the program being discussed.

It doesn't.

That clause says that in an assignment of the form “lvalue1 = lvalue2;”, there must be either exact overlap or no overlap between lvalue1 and lvalue2. This does not apply to assignments of the form “lvalue = 1;” or “lvalue = 2;”, which are the interesting assignments in the program being discussed.
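To make the distinction concrete, here is a hedged illustration (my own code, not from the article): the clause constrains object-to-object assignments like the first function, and says nothing about stores of constants like the second.

    /* 6.5.16.1's overlap rule constrains assignments where both sides
       are objects, e.g. with p and q pointing into the same buffer:
       if *p and *q partially overlap, this assignment is UB. */
    void object_to_object(int *p, int *q)
    {
        *p = *q;
    }

    /* It says nothing about these: the right-hand sides are constants,
       not overlapping objects. */
    void constant_stores(int *p, int *q)
    {
        *p = 1;
        *q = 2;
    }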

Objects are not the “basic types” of the original sentence that claimed “basic types cannot overlap in memory”. Objects overlap in memory all the time.

> You repeatedly (in this thread, and on your blog) express that you don't really understand "strict" (ISO standard) aliasing rules, and that seems to be the case.

If you say so. I'm not the one who thinks that “*p” and “1” overlap.


> You keep quoting this clause as if it applied to any of the assignments in the program being discussed.

> It doesn't.

You keep asserting that 6.5.16.1 is not relevant, as if saying so makes it so; but it doesn't. It's your opinion; the assertions are not persuasive.

  #include <stdlib.h>

  int h(int *p, int *q){
    *p = 1;
    *q = 1;
    return *p;
  }

  void f(void) {
    char *t = malloc(1 + sizeof(int));
    if (!t) abort();
    int *fp = (int*)t;
    int *fq = (int*)(t+1);   /* overlaps fp by sizeof(int)-1 bytes */
    h(fp, fq);
  }
Please explain to me why you continue to believe that this is not an object being assigned to an inexactly overlapping object, or to an exactly overlapping object with incompatible type?


Please explain to me why you think it is.

The clause says:

“If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined.”

Under “6.5.16.1 Simple assignment”, so this describes a rule about assignment.

Which assignment in the program are you claiming stores in an object a value read from another object that overlaps in any way the storage of the first object?


Author here!

I am not saying or thinking that there is a problem with GCC. I do think that GCC and Clang would be more useful with an option to make them not assume that every pointer is aligned if the target architecture does not impose this, but that's not the same thing as saying there is something wrong with GCC.

The message of the post, rather than “something is wrong with GCC”, is, “Beware. You might think that this is okay to do in your C programs, but it is not and here is why.”

Also, before I post something like this, I need confirmation that the behavior is intended and not accidental. It has happened to me before that I was about to document that GCC had an aggressive behavior with respect to a kind of optimization (while remaining arguably in line with the intent of the standard, even if the wording of the standard was in this case ambiguous enough to be interpreted any which way), and my co-author and I had to file a “missed optimization” ticket on GCC's bugzilla in order to have them confirm that GCC was doing the thing in question on purpose. GCC's developers, seeing the bug report, changed the behavior to remove the optimization entirely instead: https://gcc.gnu.org/ml/gcc/2016-11/msg00111.html

Coming back to the example at hand, if I had phrased the ticket as “GCC shouldn't optimize this”, it would have been closed instantly as “well, it's UB”. By phrasing it the way I did, I hoped for a more interesting search for a trade-off that would satisfy everyone, from people who just want legacy C code to keep working with new compilers to people who want programs to run as fast as possible.

(And yes, you have to ask in the bugzilla if you need some sort of official answer for this kind of thing. If you ask on a mailing list, you'll get a “no that was UB from the start” answer from someone you have never heard of who is in fact a power user who subscribed to the mailing list, and whose opinion, while useful, should not be assumed to be that of the compiler developers.)


Do you know if there is a compiler switch that can insert run-time checks any time it makes assumptions which could be invalid (such as “this random pointer is aligned”) and abort with an error message (or something) when it is not true? I think this would be invaluable for tracking down odd bugs caused by things like this.


I have seen it said in another thread that UBSan detects this.

If you aren't already using all the sanitizers that come with your {Clang, GCC} compiler, you should! They are great!

UBSan detects everything that can be detected without metadata. It would be its job to find this, since this is a simple mask to apply and test at each pointer access.
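Conceptually, the inserted check looks something like this sketch (a hypothetical helper, not UBSan's actual implementation):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* The kind of check a sanitizer can insert before a dereference:
       test the low bits of the pointer against the alignment required
       by the pointed-to type. No metadata needed. */
    static int load_int_checked(const int *p)
    {
        if ((uintptr_t)p % _Alignof(int) != 0) {
            fprintf(stderr, "misaligned load of int at %p\n", (void *)p);
            abort();
        }
        return *p;
    }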

UBSan cannot detect if memory is initialized or if a pointer is valid, because these questions cannot be answered locally, looking only at the instruction doing the access. You need metadata for this. The sanitizers that maintain the metadata to answer these questions are respectively MSan and ASan. Their heavy instrumentations are incompatible, so you can only use one at a time.


UBSan does detect and report misaligned pointer accesses in the latest versions of GCC and Clang:

https://gcc.godbolt.org/z/xpSbXL


The problem in practice is that you do not write “hello” and “world” to the destination buffer. You write data that is computed more or less directly from user inputs. Often a malicious user.

So the user only needs to find a way to make the data longer than the developer expected. This may be very simple: the developer may have written a screensaver to accept 20 characters for a password, because who has a longer password than this? Everyone knows that only the first 8 characters matter anyway. (This may have been literally true a long time ago, I think, although it's terrible design. Anyway, only a hash of the first 8 characters was stored, so even where it was not literally true, characters after the first 8 did not buy you as much security as the first 8.)

And this is how there were screensavers that, when you input ~500 characters into the password field, would simply crash and leave the applications they were hiding visible and ready for user input. This is an actual security bug that has happened in actual Unix screensavers. The screensavers were written in C.

And long story short, we have been having the exact same problem approximately once a week for the last 25 years. Many people agree that it is urgent to finally fix this, especially as the consequences are getting worse and worse as computers are more connected.

One solution that some favor is functions that make it easier not to overflow buffers because you tell them the size of the buffer instead of trying to guess in advance how much is enough for all possible data that may be written in the buffer. This is the thing being discussed in this thread. The function sprintf is not a contender in this discussion. The function snprintf could be, if used wisely, but it is a bit unwieldy and the OP's proposal has a specific advantage: you compute the end pointer only once, because this is the invariant.


An analogous seprintf() would probably be a good thing to add too, where the buffer end is passed in instead of a buffer length. I would still have it return a pointer to the end of what was copied. Anyone can calculate the length if they need to, by subtracting the original pointer from the returned pointer.

    char *seprintf(char *str, char *end, const char *format, ...);
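A minimal sketch of how such a function could be implemented on top of vsnprintf, assuming the signature above (the truncation policy is one plausible choice among several):

    #include <stdarg.h>
    #include <stdio.h>

    char *seprintf(char *str, char *end, const char *format, ...)
    {
        va_list ap;
        int n;

        if (str >= end)            /* no room at all, not even for a NUL */
            return str;

        va_start(ap, format);
        n = vsnprintf(str, (size_t)(end - str), format, ap);
        va_end(ap);

        if (n < 0)                 /* encoding error: report nothing written */
            return str;
        if (n >= end - str)        /* output was truncated to fit */
            return end - 1;        /* points at the terminating NUL */
        return str + n;            /* points at the terminating NUL */
    }

Chained calls can then keep passing the same end pointer, which is the invariant mentioned upthread: p = seprintf(p, end, "hello "); p = seprintf(p, end, "world");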


I think sprintf and gets can be perfectly secure interfaces. The standard just needs to specify them in a way that causes overflows to raise signals. This is probably more for POSIX and UNIX, since I think it requires the concept of memory mappings. For example:

Start by specifying that memcpy goes by increasing address. This can be done by specifying that no pages to be written by memcpy can be written to until after all pages with lower addresses have been accessed by memcpy. (It is OK to read forward and then write backward; the first access must not skip pages.)

Next, specify sprintf and gets in terms of memcpy. The output is written as if by memcpy.

The user may then place a PROT_NONE page of memory after the buffer. Since the pages are accessed in address order, the PROT_NONE page will safely stop the buffer overflow. The user can have a signal handler deal with the problem. It can exit or map in more memory. If we require sprintf and gets to be async-signal-safe, then the signal handler can also siglongjmp out of the problem.
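A minimal sketch of the guard-page arrangement, assuming POSIX mmap/mprotect (MAP_ANONYMOUS is not in plain ISO C, but is ubiquitous on Unix):

    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);

        /* Two pages: the buffer will live at the end of the first,
           the second is an inaccessible tripwire. */
        char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return 1;
        if (mprotect(base + page, page, PROT_NONE) != 0)
            return 1;

        enum { BUFSZ = 64 };
        char *buf = base + page - BUFSZ;  /* buffer ends exactly at the guard */

        /* With the ordering guarantee proposed above, writing past
           BUFSZ bytes faults on the PROT_NONE page instead of
           silently corrupting whatever follows. */
        snprintf(buf, BUFSZ, "hello");
        puts(buf);

        munmap(base, 2 * page);
        return 0;
    }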


Surely you don’t expect every stack buffer to have a hard page placed after it to protect from overflows?


If you wrote down your proposal, which the C committee member Robert Seacord is encouraging you to do here: https://news.ycombinator.com/item?id=22870210 , you would have to think carefully about functions that are pure according to your definition (free from side effects and only using their inputs) but do not terminate for some inputs.

There is at least one incorrect optimization present in Clang because of this (a function with no side effects is detected as pure, and a call to that function is omitted from a caller on this basis, when in fact the function may not terminate).


I thought the compiler was free to pretend loops without side effects always terminate, and in that sense it is already a "correct" optimization? Or is it only for C++, I'm not sure?


That may be the case in C++, but in C infinite loops are allowed as long as the controlling condition is a constant expression (making it clear that the developer intends an infinite loop). These infinite loops without side-effects are even useful from time to time in embedded software, so it was natural for the committee to allow them: https://port70.net/~nsz/c/c11/n1570.html#6.8.5p6

And you now have all the details of the Clang bug, by the way: write an infinite loop without side-effects in a C function, then call the function from another C function, without using its result.
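A minimal sketch of that pattern (names are mine):

    /* In C11, a loop whose controlling expression is a constant
       expression may legitimately never terminate (6.8.5p6). */
    static int spin(void)
    {
        while (1) { }  /* no side effects, never returns */
    }

    int caller(void)
    {
        /* A compiler that wrongly marks spin() as pure may delete
           this call, so caller() returns in finite time even though
           the abstract machine says it must hang forever. */
        spin();
        return 42;
    }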


Thanks Dan, I missed this question in the heat of the moment.



