GCC Non-bugs (gcc.gnu.org)
63 points by chris_wot on March 1, 2015 | 40 comments



> Casting does not work as expected when optimization is turned on.

> This is often caused by a violation of aliasing rules, which are part of the ISO C standard. These rules say that a program is invalid if you try to access a variable through a pointer of an incompatible type.
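
For concreteness, a minimal sketch (my own, not the FAQ's sample code) of the kind of access the rule forbids, assuming 32-bit int and float:

    #include <stdio.h>

    int main(void) {
        float f = 1.0f;
        int *p = (int *)&f;   /* accessing a float through an int* violates the aliasing rules */
        *p = 0;               /* with -O2, GCC is free to assume f is untouched */
        printf("%f\n", f);    /* may still print 1.000000 when optimized */
        return 0;
    }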

Regarding this, is this what reinterpret_cast was at least partially designed for?


Kind of. Casts in C++ were designed to be in your face, to call out that you are doing something unsafe.

reinterpret_cast is meant to be really unsafe: with this kind of cast the compiler will do it blindly, whereas with the other *_casts there is still some type validation involved.


Thanks for the clarification. So this doesn't signal to the compiler that it shouldn't optimize the aliased variables?

The reason I ask rather than test is that VC++ does not seem to take advantage of this undefined behavior, and the sample code always works 'as expected'.


This type of optimization is very compiler specific and can even change between versions of the same compiler.

Actually, this is the big difference between standard-defined languages and those that rely on a single implementation.

There is nothing to rely upon if it isn't defined in the standard as such.


I know this is a problem with many programming languages, but why is there not a standard way to precisely represent decimal numbers? I understand the difficulty representing them in binary (well, kind of -- it's been a while since I read about it), but it seems like knowing this, a better solution would be found. Why is this accepted and standard behavior?


Are you familiar with the IEEE 754-2008 specification of decimal floating point format?

From Wikipedia's entry on decimal64[1]: "It is intended for applications where it is necessary to emulate decimal rounding exactly, such as financial and tax computations."

[1] http://en.wikipedia.org/wiki/Decimal64_floating-point_format


No, I was not. I'm assuming that the standard double or float do not conform to this spec, though, right?


Decimal numbers are supported by many languages (Python has the Decimal library, for instance).

The problem is that decimal isn't supported by modern hardware, so it's not supported natively by C, which adds very little on top of hardware representations.


Sure there is. It's called binary-coded decimal (BCD), and it's been around since the '60s (at the latest).


BCD is orthogonal to this.


If by "decimal numbers" you mean the reals written in base 10, how do you propose we precisely represent the result of 1 / 3 . Or pi?


I guess I wasn't necessarily referring to solving the problem of non-terminating numbers, but more to things like 0.1 * 0.2 = 0.020000000000000004 in Javascript, or the very first bug listed on the linked page.

Perhaps what you intend with that statement is that issues like that of 1/3 are a cause for these sorts of issues, but interestingly (at least in JS), 1/3 is terminating, and results in an "expected" value.


Well, every base has issues with non-termination. In base 10, we have problems like 1/3 = 0.33333... . Base 2 has its own set of non-terminating decimal expansions, including 1 / 3 = 0.01010101... and 1 / 10 = 0.0001100110011... . Since we only have finite storage space, we have to cut the repeating string of digits off somewhere, no matter what base we're in. This can cause rounding issues.

You can't dodge all your rounding problems by changing base. There are binary-coded decimal (BCD) systems that store numbers as strings of decimal digits. You can generally find these in calculators (ever notice that a TI-84 overflows after 9.(9)e99?) or some financial software. However, you'll still have problems like the 0.1 * 0.2 issue in binary floating point. For example (assuming shorter-than-usual numbers):

2/3 = 0.66... ~ 0.66667

2/3 + 2/3 ~ 0.66667 + 0.66667 = 1.33334

However, 4/3 = 1.33333... ~ 1.33333, which isn't the same result.

Basically, you can't get away from this problem, you can only push it around to cases you care less about.

(You probably can't find an API for binary-coded decimal in $LANGUAGE unless $LANGUAGE is often used for tasks where you really, really don't want any float-related gotchas, since almost everyone else doesn't care too much.)
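
For instance, the 0.1 * 0.2 surprise above shows up in C doubles too, since JS numbers are the same IEEE 754 binary64 format; a quick sketch:

    #include <stdio.h>

    int main(void) {
        double a = 0.1, b = 0.2;
        printf("%.17g\n", a * b);   /* typically prints 0.020000000000000004 */
        return 0;
    }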


You could represent your numbers as fractions of integers and do all operations on them as fractions of integers: https://gmplib.org/manual/Rational-Number-Functions.html#Rat...

And only convert them to floating point for display purposes: http://www.mpfr.org/mpfr-current/mpfr.html#index-mpfr_005fse...
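
A rough sketch of what that looks like with GMP's rational (mpq) functions, assuming GMP is installed and the program is linked with -lgmp:

    #include <stdio.h>
    #include <gmp.h>

    int main(void) {
        mpq_t third, sum;
        mpq_inits(third, sum, NULL);

        mpq_set_ui(third, 1, 3);         /* exactly 1/3, no rounding */
        mpq_add(sum, third, third);      /* 2/3 */
        mpq_add(sum, sum, third);        /* 3/3, reduced to 1 */

        gmp_printf("%Qd\n", sum);        /* prints 1 */
        printf("%g\n", mpq_get_d(sum));  /* convert to double only for display */

        mpq_clears(third, sum, NULL);
        return 0;
    }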


Hmm, so BCD seems closer to what I was roughly imagining (though, I haven't really thought this issue through deeply or anything), but you mention it has shortcomings as well. I guess I'll just have to trust that other people have thought about this, and have determined this to be the best solution, even with its flaws.


The result of the computation 1/3 in JS is a number which is terminating, but that number is not 1/3. Open up your console and type 1/3+1/3+1/3: you get 1, as expected. Then type 1-1/3-1/3-1/3: I get about 1e-16, which is not 0.

The statement that "1/3 is terminating" is not true in binary, it is not true in decimal, and the only reason that you sometimes get the expected results is that sometimes the rounding errors will cancel out.
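
The same experiment with ordinary C doubles shows this isn't JS-specific; a quick sketch:

    #include <stdio.h>

    int main(void) {
        double t = 1.0 / 3.0;
        printf("%.17g\n", t + t + t);        /* 1: the rounding errors happen to cancel */
        printf("%.17g\n", 1.0 - t - t - t);  /* about 1.1e-16, not 0 */
        return 0;
    }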


I realize that 1/3 is not terminating in decimal, what I meant was that in JS, it seemed to end neatly after x decimal places, which I found interesting. I hadn't tried the operation you mentioned until you posted that, the results are even more bizarre to me. Seems I need to spend some time relearning floating point math to better understand what I'm seeing.


So you're telling me we should use base 3.


Well then 0.5 would be non-terminating...


Floating point is just one way of representing non-integral numbers; you can also use fixed point, rational types (integer numerator and denominator), and arbitrary-precision decimal types, like GMP or BigDecimal.
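
For example, a toy fixed-point sketch (assuming two decimal places and values small enough to fit in a long):

    #include <stdio.h>

    int main(void) {
        /* store values as integer hundredths instead of as doubles */
        long a = 10;                   /* 0.10 */
        long b = 20;                   /* 0.20 */
        long product = a * b / 100;    /* rescale after multiplying: 0.02 exactly */
        printf("%ld.%02ld\n", product / 100, product % 100);  /* prints 0.02 */
        return 0;
    }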


Something I found yesterday:

static char * s = * &"hello";

This should not compile, clang refuses it, and it is not part of the C standard. GCC accepts it, but is it worth even submitting a bug report?


Clang 3.5 compiles that for me without warnings even with -Weverything. String literals in C (but not C++) aren't const for legacy reasons despite not being modifiable.


Sorry, I made a mistake. I think it is:

static char s[] = * &"foo";

The C standard says char array initializers are either a string literal, or a string literal surrounded by optional '{}'.


-pedantic gives "warning: array initialized from parenthesized string constant" for that, so I'm guessing it's an unintended consequence of a nonstandard extension. Might be worth reporting since at the minimum the warning is wrong.


In my embedded development, I have trained myself to use unions instead of type-punned-pointer casts to access one data type as another.
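
Roughly like this, for anyone wondering what that looks like (a minimal sketch, assuming a 32-bit float and uint32_t):

    #include <stdint.h>
    #include <stdio.h>

    union pun {
        float f;
        uint32_t u;
    };

    int main(void) {
        union pun p;
        p.f = 1.0f;
        /* read the bytes of the float back as an integer */
        printf("0x%08x\n", (unsigned)p.u);  /* 0x3f800000 on IEEE 754 targets */
        return 0;
    }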

This document says this is a GCC-specific extension. Is this true? Or is it one of those things that's not standardized, but the compiler vendors all do it anyway?


Type punning by pointer is explicitly UB per the standard.

Type punning by union is implementation-defined behavior. Everyone fortunately defines it to Do The Right Thing (TM).

The truly standard-supported way to type pun is with memcpy:

    float f = ...; int x; memcpy(&x, &f, sizeof x);


Yes. Use memcpy if you need to do this! This is especially important when you're on a platform that requires aligned pointers. For example, the following code will crash on ARM:

    char bytes[5] = {0};
    float flt = *(float*)&(bytes[1]);
Using memcpy works on any platform with 4-byte floats:

    char bytes[5] = {0};
    float flt;
    memcpy(&flt, &(bytes[1]), 4);


But the union won't crash either (the compiler will keep it aligned), the conversion compiles to a no-op, and it doesn't waste memory.

Embedded targets are always memory-constrained. If they aren't, you are wasting money on hardware.


I was thinking about the case where you get unaligned data (e.g. you are reading from a file or from the network). Then you need to copy the data anyway.


It's legal C99. See DR283 / C99+TC3. But be careful: reading trap representations is UB.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf


Type-punning via unions is allowed in both C99 [0] and C11.

[0] see "ISO/IEC 9899:1999 Technical Corrigendum 3", http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n1235.pdf


"Most C++ compilers (G++ included) do not yet implement export". For me, non compliance with such an old spec is a bug.


There is a single C++ compiler that supports export; out of all the C++ compilers, only one. Mainly because it is such a HUGE pain in the ass to write support for.

See this paper as to why it was removed from the spec: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n142...


Those guys had serious dedication to essentially waste 3 years implementing it, just to say how useless it is.


"John Spicer notes: “Any program written using export can be rearranged trivially into a program that doesn’t use export.”"

This makes me wonder why they couldn't have implemented this rearrangement as export...


Probably trivially with some human judgment.


That is a great counterpoint to this comment I saw recently:

https://news.ycombinator.com/item?id=9126237


It's more like export was a bug in C++98. It was removed without a deprecation period in C++11.


I didn't know. My last professional development in C++ was in 2002. At that time http://www.comeaucomputing.com/ was my reference.


Turns out a lot has changed in the C++ world in the last 15 years. (Trigraphs are going away too.)



