
This is a bad answer too, IMO.

I think there is a solid case for the existence of undefined behavior; even Rust has it, so it's not an absurd concept, and you do describe some reasoning for why it should probably exist.

However, and here's the real kicker, it really does not need to exist for this case. The real reason it exists for this case is due to increasingly glaring deficiencies in the C++ language, namely, again, the lack of any form of pattern matching for control flow. Because of this, there's no way for a library author, including the STL itself, to actually handle this situation succinctly.

Undefined behavior indeed should exist, but not for common cases like "oops, I didn't check to see if there was actually a value here before accessing it." Armed with a moderately sufficient programming language, the compiler can handle that. Undefined behavior should be more like "I know you (the compiler) can't know this is safe, but I already know that this unsafe thing I'm doing is actually correct, so don't generate safeguards for me; let what happens, happen." This is what modern programming languages aim to do. C++ does that for shit like basic arithmetic, and that's why we get to have the same fucking CVEs for 20+ years, over and over in an endless loop. "Just get better at programming" is a nice platitude, but it doesn't work. Even if it were possible for me to become absolutely perfect and simply never make any mistakes ever (lol), it wouldn't matter, because there's no chance in hell you'll ever manage that across a meaningful segment of the industry, including the parts of the industry you depend on (like your OS, or cryptography libraries, and so on...)

And I don't think the issue is that the STL "doesn't care" about the possibility that you might accidentally do something that makes no sense. Seriously, take a look at the design of std::variant: it is pretty obvious that they wanted to design a "safe" union. In fact, what the hell would the point of designing another unsafe union be in the first place? So they go the other route. std::variant has getters that throw exceptions on bad accesses instead of undefined behavior. This is literally the exact same type of problem that std::expected has. std::expected is essentially just a special case of a type-safe union with exactly two possible values, an expected and unexpected value (though since std::variant is tagged off of types, there is the obvious caveat that std::expected isn't quite a subset of std::variant, since std::expected could have the same type for both the expected and unexpected values.)
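
To make the parallel concrete, here's a rough sketch of the accessor behavior as I understand the C++17/C++23 APIs (nothing novel, just the two types side by side):

    #include <expected>
    #include <string>
    #include <variant>

    int main() {
        std::variant<int, std::string> v = std::string("oops");
        // Wrong alternative: std::get throws std::bad_variant_access
        // rather than invoking undefined behavior.
        // int a = std::get<int>(v);

        std::expected<int, std::string> e = std::unexpected(std::string("oops"));
        // Same shape of mistake, but *e with no value present is undefined
        // behavior; e.value() is the accessor that throws.
        // int b = *e;
    }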

So, what's wrong? Here's what's wrong. C++ Modules were first proposed in 2004[1]. C++20 finally introduced a version of modules and lo and behold, they mostly suck[2] and mostly aren't used by anyone (Seriously: they're not even fully supported by CMake right now.) Andrei Alexandrescu has been talking about std::expected since at least 2018[3] and it just now finally managed to get into the standard in C++23, and god knows if anyone will ever actually use it. And finally, pattern matching was originally proposed by none other than Bjarne himself (and Gabriel Dos Reis) in 2019[4] and who knows when it will make it into the standard. (I hope soon enough so it can be adopted before the heat death of the Universe, but I think that's only if we get exceptionally lucky.)

Now I'm not saying that adding new and bold features to a language as old and complex as C++ could ever be easy or quick, but the pace at which C++ evolves is sometimes so slow that it's hard to come to any conclusion other than that the C++ standard and the process behind it are simply broken. It's just that simple. I don't care what changes it would take to get things moving more efficiently: it's not my job to figure that out. It doesn't matter why, either. The point is, at the end of the day, it can't take this long for features to land only for them to wind up not even being very good, and there are plenty of other programming languages that have done better with fewer resources.

I think it's obvious at this point that C++ will never get a handle on all of the undefined behavior; they've just introduced far too much undefined behavior all throughout the language and standard library in ways that are going to be hard to fix, especially while maintaining backwards compatibility. It should go without saying that a meaningful "safe" subset of C++ that can guarantee safety from memory errors, concurrency errors or most types of undefined behavior is simply never going to happen. Ever. It's not that it isn't possible to do, or that it's not worth doing, it's that C++ won't. (And yes, I'm aware of the attempts at this; they didn't work.)

The uncontrolled proliferation of undefined behavior is ultimately what is killing C++, and a lot of very trivial cases could be avoided if only the language were capable of it, but it's not.

[1]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n17...

[2]: https://vector-of-bool.github.io/2019/01/27/modules-doa.html

[3]: https://www.youtube.com/watch?v=PH4WBuE1BHI

[4]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p13...



I cannot follow your rant... I'll do my best to respond, but I'm probably not understanding something.

Divide by zero must be undefined behavior in any performant language. On x86 you either have an if before running the divide (which of course in some cases the compiler can optimize out, but only if it can determine the value is not zero), or the CPU will trap into the OS - different OSes handle this in different ways, but most not in a way that makes it possible to figure out where you were and thus do something about it. This just came up on the C++ std-proposals mailing list in the past couple of weeks.

AFAIK all common CPUs have the same behavior on integer overflow (two's complement wraparound). However, in almost all cases (again, some encryption code is an exception) that behavior is useless to real code, so if it happens your code has a bug either way. Thus we may as well let compilers optimize assuming it cannot happen, since if it does happen you have a bug no matter what we define it as. (C++ is also used on CPUs that are not two's complement; we could call this implementation-defined or unspecified, but that doesn't change the fact that you have a bug if you invoke it.)
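
The classic illustration of the optimization this enables (a sketch, not taken from any real codebase):

    // With signed overflow being UB, the compiler may assume x + 1 never
    // wraps, so this whole function can be folded to 'return true'.
    bool still_increasing(int x) {
        return x + 1 > x;
    }

    // Unsigned overflow is defined to wrap, so this one cannot be folded
    // to 'true': when x == UINT_MAX, x + 1 is 0.
    bool still_increasing_unsigned(unsigned x) {
        return x + 1 > x;
    }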

For std::expected - new real-world benchmarks, with optimized exception handlers, are showing that exceptions are faster in practice than systems that use things like expected. Microbenchmarks that show exceptions are slower are easy to create, but real-world exceptions that unwind more than a couple of function calls show different results.

As for modules, support is finally here and early adopters are using it. The road was long, but it is finally proving it worked.

Long roads are a good thing. C++ has avoided a lot of bad designs by spending a lot of time thinking about problems. Details often matter, and move-fast languages tend to run into trouble when something doesn't work as well as they hoped. I'm glad C++ standardization is slow - the language is already a mess without adding more half-baked features.


The problem is 'undefined behaviour' is far too powerful.

Why not make division by zero implementation-defined? I'm happy with my compiler telling me my program will get terminated if I divide by zero, no problem. Let's even say it "may" be terminated (because maybe the division is optimised out if we don't actually need to calculate it, fine).

My problem is that UB lets compilers do all kinds of weird things. For example, if I write:

    int dividebyzero = 0;
    if (y == 0) { dividebyzero = 1; }
    z = x / y;
then the compiler can decide that dividebyzero is always 0, because 'obviously' y can't be 0 - if it were, I would be invoking undefined behaviour.
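
Concretely, the compiler is allowed to treat that snippet as if I had written something like this (a sketch of the transformation, not any particular compiler's actual output):

    // Because x / y is UB when y == 0, the compiler may assume y != 0,
    // delete the branch entirely, and constant-fold the flag.
    int dividebyzero = 0;   // never becomes 1 as far as the optimizer cares
    z = x / y;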

Also, for two's complement, it's fairly common that people want the wrapping behaviour. And I don't think basically anyone is using C++ on non-two's-complement CPUs (neither gcc nor clang supports them), and even if C++ does run on such CPUs, why not still require a well-defined behaviour? C++ runs on 32-bit and 64-bit systems too, but we don't say asking for the size of a pointer is undefined behaviour -- everyone just defines what it is on their system!


> Divide by zero must be undefined behavior in any performant language. On x86 you either have an if before running the divide (which of course in some cases the compiler can optimize out, but only if it can determine the value is not zero), or the CPU will trap into the OS - different OSes handle this in different ways, but most not in a way that makes it possible to figure out where you were and thus do something about it. This just came up on the C++ std-proposals mailing list in the past couple of weeks.

I mean look, I already agree that it's not necessarily unreasonable to have undefined behavior, but this statement is purely false. You absolutely can eat your cake and have it too. Here's how:

- Split the operation in two: safe, checked division, and fast, unchecked division.

- Or, stronger typing: a "not-zero" type that represents a numeric type where you can guarantee the value isn't zero. If you can't eat the cost of runtime checks, you can unsafely cast to this.

I think the former is a good fit for C++.
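
Something like this is all I'm asking for (just a sketch; checked_div and unchecked_div are made-up names, not anything in the standard library):

    #include <optional>

    // Checked division: the caller is forced to handle the zero case.
    std::optional<int> checked_div(int x, int y) {
        if (y == 0) return std::nullopt;
        return x / y;
    }

    // Unchecked division: the caller promises y != 0 and keeps today's
    // codegen. Division by zero here is still UB, but now it's opt-in
    // and easy to grep for.
    int unchecked_div(int x, int y) {
        return x / y;
    }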

C++ does not have to do what Rust does, but for the sake of argument, let's talk about it. What Rust does here is simple: it just defines divide-by-zero to panic. How? Multiple ways:

- If it knows statically that the division will panic, that's a compilation error.

- If it knows statically that the divisor cannot be zero, it generates unchecked division.

- If it does not know statically, it generates a branch. (Though it is free to implement this however it wants; it could be done using CPU exceptions/traps.)

What if you really do need "unsafe" division? Well, that is possible, with unchecked_div. Most people do not need unchecked_div. If you think you do but you haven't benchmarked yet, you do not. It doesn't get any simpler than that. This is especially the case if you're working on modern CPUs with massive pipelines and branch predictors; a lot of these checks wind up having a cost very close to zero.

> AFAIK all common CPUs have the same behavior on integer overflow (two's complement wraparound). However, in almost all cases (again, some encryption code is an exception) that behavior is useless to real code, so if it happens your code has a bug either way. Thus we may as well let compilers optimize assuming it cannot happen, since if it does happen you have a bug no matter what we define it as. (C++ is also used on CPUs that are not two's complement; we could call this implementation-defined or unspecified, but that doesn't change the fact that you have a bug if you invoke it.)

It would be better to just do checked arithmetic by default; the compiler can often statically eliminate the checks, you can opt out of them if you need performance and know what you're doing, and the cost of checks is unlikely to be noticed on modern processors.
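
The building blocks for this already exist. Here's a sketch of what opting into checked arithmetic looks like today with the GCC/Clang __builtin_add_overflow intrinsic (checked_add is a made-up helper name):

    #include <cstdio>

    // __builtin_add_overflow does the addition, stores the (wrapped)
    // result, and returns true if the addition overflowed.
    int checked_add(int a, int b) {
        int result;
        if (__builtin_add_overflow(a, b, &result)) {
            // Real code would trap, abort, or return an error instead.
            std::fprintf(stderr, "overflow in checked_add(%d, %d)\n", a, b);
        }
        return result;
    }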

It doesn't matter that this usually isn't a problem. It only has to be a problem once to cause a serious CVE. (Spoiler alert: it has happened more than once.)

> For std::expected - new real-world benchmarks, with optimized exception handlers, are showing that exceptions are faster in practice than systems that use things like expected. Microbenchmarks that show exceptions are slower are easy to create, but real-world exceptions that unwind more than a couple of function calls show different results.

You can always use stack unwinding or exceptions if you want to; that's present in Rust too, in the form of panic. The nice thing about something like std::expected is that it can theoretically bridge the gap between code that uses exceptions and code that doesn't: you can catch an exception and stuff it into the `e` of an std::expected value, or you can take the `e` value of an std::expected and throw it. In theory this should not cost much more than simply throwing.
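
Roughly what I mean by bridging, as a sketch against the C++23 API (parse_config and ParseError are names I'm making up for illustration):

    #include <expected>
    #include <stdexcept>
    #include <string>

    struct ParseError { std::string message; };      // made-up error type

    // Catch an exception at the boundary and stuff it into the error channel...
    std::expected<int, ParseError> parse_config(const std::string& text) {
        try {
            return std::stoi(text);                  // throws on bad input
        } catch (const std::exception& ex) {
            return std::unexpected(ParseError{ex.what()});
        }
    }

    // ...or pull the error back out and rethrow it for exception-based callers.
    int parse_config_or_throw(const std::string& text) {
        auto result = parse_config(text);
        if (!result) throw std::runtime_error(result.error().message);
        return *result;
    }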

> As for modules, support is finally here and early adopters are using it. The road was long, but it is finally proving it worked.

Last I was at Google, they seemed to have ruled out C++ modules because as-designed they are basically guaranteed to make compilation times worse.

For CMake, you can't really rely on C++ Modules. Firstly, the Makefile generator, which is the default on most platforms, does not and as far as I know will not support C++ Modules. Secondly, CMake doesn't support header units or importing the STL as modules. For all intents and purposes, it would be difficult to use this for anything.

For Bazel, there is no C++ Modules support to my knowledge.

While fact-checking myself, I found this handy website:

https://arewemodulesyet.org/tools/

...which shows CMake as supporting modules, green check mark, no notes! So that really makes me wonder what value you can place on the other green checkmarks.

> Long roads are a good thing. C++ has avoided a lot of bad designs by spending a lot of time thinking about problems. Details often matter, and move-fast languages tend to run into trouble when something doesn't work as well as they hoped. I'm glad C++ standardization is slow - the language is already a mess without adding more half-baked features.

I'm glad you are happy with the C++ standardization process. I'm not. Not only do things take many years, they're also half-baked at the end of the process. You're right that C++ still winds up with a huge mess of half-baked features even with how slow the development process is, and modules are a great example of that.

The true answer is that the C++ committee is a fucking mess. I won't sit here and try to make that argument; plenty of people have already made it far more damningly than I ever could. What I will say is that C faces a lot of the same problems as C++ and somehow still manages to make better progress anyway. The failure of the C++ standard committee could be told in many different ways. A good, relatively recent example is the success of the #embed directive[1]. Of course, the reason it was successful is that it was added to C instead of C++.

Why can't C++ do that? I dunno. Ask Bjarne and friends.

[1]: https://thephd.dev/finally-embed-in-c23


> What if you really do need "unsafe" division? Well, that is possible, with unchecked_div. Most people do not need unchecked_div. If you think you do but you haven't benchmarked yet, you do not. It doesn't get any simpler than that.

This attitude is why modern software is dogshit slow. People make this "if you haven't benchmarked, it doesn't matter" argument thousands of times and the result is that every program I run is slower than molasses in Siberia. I don't care about "safety" at the expense of performance.


> This attitude is why modern software is dogshit slow.

Bullshit. Here's my proof: We don't even do this. There's a ton of software that isn't safe from undefined behavior and it's still slow as shit.

> People make this "if you haven't benchmarked, it doesn't matter" argument thousands of times and the result is that every program I run is slower than molasses in Siberia. I don't care about "safety" at the expense of performance.

If you can't imagine a world where there's nuance between "we should occasionally eat 0.5-1ns on checking an unsafe division" and "we should ship an entire web browser with every text editor and chat app", the problem is with you. If you want your software to be fast, it has to be benchmarked, just like if you want it to be stable, it has to be tested. There are really no exceptions here; you can't just guess these things.


Modern software is dogshit slow because it's a pile of JS dependencies twenty layers deep.

Checked arithmetic is plenty fast. And as for safety vs performance, quickly computing the incorrect result is strictly less useful than slowly computing the correct one.


Google has an odd C++ style guide that rules out a lot of useful things for their own reasons.

There is no reason why make could not work with modules if someone wanted to go through the effort. The CMake people have even outlined what needs to be done. Ninja is so much nicer that you should switch anyway - I did more than 10 years ago.


I do use Ninja when I use CMake, but honestly that mostly comes down to the fact that the Makefiles generated by CMake are horrifically slow. I don't particularly love CMake; I only use it because the C++ ecosystem really has nothing better to offer. (And there's no chance I'm going to redistribute a project that can't build with the Makefile generator, at least unless and until Ninja is the default.)

Anyway, the Google C++ style guide has nothing to do with why C++ modules aren't and won't be used at Google; it's because, as implemented, modules are not an obvious win. They can theoretically improve compilation performance, but they can and do also make some cases worse than before.

I don't think most organizations will adopt modules at this rate. I suspect the early adopters will wind up being the only adopters for this one.


I agree very much with what you wrote.

> the lack of any form of pattern matching for control flow

Growing features after the fact is hard. Look at the monumental effort to get generics into Go. Look at how even though Python 3.10 introduced the match statement, it is a statement and not an expression - you can't write `x = match ...`, unlike Rust and Java 14. So it doesn't surprise me that C++ struggles with this.

> Undefined behavior indeed should exist

Agreed. Rust throws up its hands in narrow cases ( https://doc.rust-lang.org/reference/behavior-considered-unde... ), and even Java says that calling Thread.stop() and forcing monitor unlocks can lead to corrupted data and UB.

> but not for common cases like

Yes, C/C++ have far, far too many UB cases. Even down to idiotically simple things like "failing to end a source file with newline". C and C++ have liberally sprinkled UB as a cop-out like no other language.

> C++ does that for shit like basic arithmetic

I spent an unhealthy amount of time understanding the rules of integer types and arithmetic in C/C++. Other languages like Rust are as capable without the extreme mental complexity. https://www.nayuki.io/page/summary-of-c-cpp-integer-rules

Oh and, `(uint16_t)0xFFFF * (uint16_t)0xFFFF` will cause a signed 32-bit integer overflow on most platforms, and that is UB and will eat your baby. Scared yet? C/C++ rules are batshit insane.
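
Spelled out, since the integer promotion is the non-obvious part (a sketch assuming a platform where int is 32 bits):

    #include <cstdint>

    std::uint32_t scary(std::uint16_t a, std::uint16_t b) {
        // Both operands are promoted to (signed) int before the multiply,
        // because int can represent every uint16_t value. With a 32-bit int,
        // 65535 * 65535 = 4294836225 doesn't fit, so the multiplication
        // overflows as *signed* arithmetic: undefined behavior, despite the
        // unsigned-looking operand types.
        return a * b;
    }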

> "Just get better at programming" is a nice platitude, but it doesn't work.

Correct. Far too often, I hear a conversation like "C/C++ have too many UB, why can't we make it safer?" "Just learn to write better code, dumbass". No, literal decades of watching the industry tells us that the same mistakes keep happening over and over again. The evidence is overwhelming that the languages need to change, not the programmers.

> it's obvious at this point that C++ will never get a handle on all of the undefined behavior; they've just introduced far too much undefined behavior all throughout the language and standard library

True.

> in ways that are going to be hard to fix, especially while maintaining backwards compatibility

Technically not true. Specifying undefined behavior is easy, and this has already been done in many ways. For example, -fwrapv makes signed overflow defined to wrap around. Or you could zero-initialize every local variable and change malloc() to behave like calloc(), so that reading uninitialized memory always returns zero. And because the previous behavior was undefined anyway, literally any substitute behavior is valid.
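
Concretely, a sketch assuming GCC or Clang (-fwrapv is a real flag; -ftrivial-auto-var-init=zero is the flag that zero-fills locals on recent versions of both compilers):

    int wraps(int x) {
        // Normally signed overflow is UB. Compiled with -fwrapv, it is
        // defined to wrap: wraps(INT_MAX) returns INT_MIN.
        return x + 1;
    }

    int used_to_be_garbage() {
        int x;  // with -ftrivial-auto-var-init=zero, x is zero-filled, so
                // this reliably returns 0 instead of stack garbage
        return x;
    }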

The problem isn't maintaining backward compatibility, it's maintaining performance compatibility. Allegedly, undefined behavior allows the compiler to optimize out redundant arithmetic, redundant null checks, etc. I believe this is what stops the standards committees from simply defining some kind of reasonable behavior for what is currently considered UB.

> a meaningful "safe" subset of C++ that can guarantee safety from memory errors, concurrency errors or most types of undefined behavior is simply never going to happen

I think it has already happened. Fil-C seems like a capable approach to transpile C/C++ and add a managed runtime - and without much overhead. https://github.com/pizlonator/llvm-project-deluge/blob/delug...

> The uncontrolled proliferation of undefined behavior is ultimately what is killing C++

It's death by a thousand cuts, and it hurts language learners the most. I can write C and C++ code without UB, but it took me a long time to get there - with a lot of education and practice. And UB-free code can be awkward to write. The worst part of it is that the knowledge is very C/C++-specific and is useless in other languages because they don't have those classes of UB to begin with.

I dabbled in C++ programming for about 10 years before I discovered Rust. Once I wrote my first few Rust programs, I was hooked. Suddenly, I stopped worrying about all the stupid complexities and language minutiae of C++. Rust just made sense out of the box. It provided far fewer ways to do things ( https://www.nayuki.io/page/near-duplicate-features-of-cplusp... ), and the easy way is usually the safe and correct way.

To me, Rust is C++ done right. It has the expressive power and compactness of C++ but almost none of the downsides. It is the true intellectual successor to C++. C++ needs to hurry up and die already.



