
#1. Tentative definitions are historical baggage from Fortran COMMON blocks (https://blogs.oracle.com/ali/entry/what_are_tentative_symbol...), which may have helped C's adoption among Fortran users. Although the feature remains in the C language specification, it can safely be ignored.

#2. Treating a NULL pointer dereference as undefined behavior means the compiler is not required to generate additional instructions, such as asserts or deliberate crashes, to guard against potentially dangerous side effects. The C compiler assumes the programmer is in control of the code: in this example, a careful C programmer is presumed to have already guaranteed that the pointer is not null when it is dereferenced. This C feature is an optimization, sparing the compiler from emitting redundant or unnecessary checks and handling code.
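A minimal sketch of what this permits (the function name is hypothetical): because the pointer is dereferenced before the check, a compiler may assume it is non-null and delete the check as dead code.

    /* Hypothetical example: the dereference lets the compiler assume
       p != NULL, so the later check is dead code. */
    int first_element(int *p) {
        int v = *p;          /* compiler infers p is non-null here */
        if (p == NULL)       /* may be optimized away entirely */
            return -1;
        return v;
    }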

3. C lets the programmer handle raw pointers, allowing low-level optimizations that may not be possible in higher-level languages. A careful C programmer may have taken steps outside the function to handle the situation where yp == zp, or may be unconcerned about that particular case for performance reasons. A sketch of the aliasing issue follows below.
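Assuming yp and zp are pointer parameters of the same function (the function name is hypothetical), the result depends on whether the two pointers refer to the same object:

    /* If yp == zp, the second read of *zp observes the first update,
       so the result differs from the non-aliased case. */
    void add_twice(int *yp, int *zp) {
        *yp += *zp;
        *yp += *zp;
    }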

4. A correct implementation of IEEE 754.

5. Since C is designed to compile efficiently to any computer architecture, it needs to be aware of the distinction between an instruction set's natural arithmetic width and the width of its addresses. int defaults to the natural arithmetic width of the instruction set (so that compilation doesn't produce unnecessary packing/unpacking instructions on every access) but is guaranteed to be at least 16 bits wide. However, since a data structure can be as large as addressable memory, it is necessary for size_t to be the width of addresses.
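For instance, on a typical 64-bit Linux/x86-64 (LP64) system the two widths differ:

    #include <stdio.h>
    #include <stddef.h>

    int main(void) {
        /* On a typical LP64 system: int is 32 bits (the natural arithmetic
           width), size_t is 64 bits (wide enough for any object size). */
        printf("sizeof(int)    = %zu\n", sizeof(int));
        printf("sizeof(size_t) = %zu\n", sizeof(size_t));
        return 0;
    }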

6. Making size_t unsigned allows all bits of a size_t variable to be used for expressing size.
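A small illustration of the extra range this buys (the exact value depends on the platform's size_t width):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* Every bit carries magnitude: with a 32-bit size_t, SIZE_MAX is
           4294967295, double what a signed 32-bit type could express. */
        printf("SIZE_MAX = %zu\n", (size_t)SIZE_MAX);
        return 0;
    }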

7. Undefined behavior is, again, an optimization feature, allowing each compiler to handle the situation however it sees fit.

8. The comma operator is useful when the first operand has desirable side effects, such as compactly expressing parallel assignment or side effects in for loops.
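For example, driving two loop variables at once (a minimal sketch):

    #include <stdio.h>

    int main(void) {
        int i, j;
        /* The comma operator sequences both updates in one expression. */
        for (i = 0, j = 10; i < j; i++, j--)
            printf("i=%d j=%d\n", i, j);
        return 0;
    }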

9. C allows unsigned integers to wrap around between 0 and UINT_MAX. This can be exploited as an optimization, for example as a free (no additional instructions) deliberate modulus operation. It is also usually how unsigned integers behave in assembly.
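A minimal demonstration of the defined wraparound:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        unsigned int u = UINT_MAX;
        u += 1;              /* well-defined: arithmetic modulo UINT_MAX + 1 */
        printf("%u\n", u);   /* prints 0 */
        return 0;
    }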

10 & 11 & 12. Some ISAs, like MIPS, raise an exception on signed-number overflow. Others simply treat the result as a valid two's-complement value. Since C is machine-independent, its official specification for signed overflow must be compatible with all ISAs. Simply leaving the behavior undefined does the trick, and means the compiler doesn't have to make guarantees or special-case each ISA.
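A sketch of how compilers exploit this latitude (the function name is hypothetical): since signed overflow cannot happen in a valid program, the test below may be folded to a constant instead of emitting ISA-specific overflow handling.

    /* Hypothetical example: because x + 1 is assumed not to overflow,
       a compiler may legally compile this to "return 1". */
    int always_true(int x) {
        return x + 1 > x;
    }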




5. size_t doesn't necessarily have to be the width of an address. An address (say, a value of type void* or char*) has to be able to refer to any byte of any object. A size_t only has to be able to represent the size of any *single* object. The limit on the size of a single object and the limit on the total size of memory are often the same on modern systems, but C allows them to be different (think segments).

6. size_t is required to be unsigned.

9. C requires wraparound behavior for unsigned integers.

10, 11, 12. It's not just the result of an overflowing signed integer arithmetic operation that's undefined, it's the behavior. `INT_MAX + 1` can yield `INT_MIN`, or it can yield 42, or it can crash your program and reformat your hard drive (at least in principle).


I forgot about segments... Thanks for the clarifications... I guess I don't know C. :)


Good comment, but I'd like to add that int is not really the "natural arithmetic width of an instruction set" any more. We will never see 64-bit ints. The sizes of int and long seem to be "whatever works, and is compatible with what it used to be".


I've seen 64-bit ints (on Cray systems).

One disadvantage of making int 64 bits, even if that's the natural size, is that if char is 8 bits, then short has to be either 16 or 32 bits (or 64) -- which means that you can't have predefined types covering all the common sizes (8, 16, 32, 64).

That's not quite true, since C99 introduced extended integer types -- but I don't know of any C compiler that has provided them.

(The intN_t and uintN_t types in <stdint.h> don't solve this; they still have to be defined in terms of existing types.)


Thanks for the clarification!



