The copy and swap idiom in C++ (sandordargo.com)
45 points by todsacerdoti on Aug 12, 2022 | 53 comments



I used to teach C++ courses as a consultant. This kind of thing was part of the gospel I was trying to spread: educating my fellow software developers on the merits of good C++ memory management. It all felt so powerful and cool.

But after some time you realise that nobody is smart enough to keep all of these idioms and arbitrary C++ rules in their head. The only reason I could do it was because I put so much time into preparing my courses. So all of your colleagues will keep writing terrible C++ code. And you will be left frustrated that nobody is "doing C++ right".

I guess one could chill down and accept the chaos. I chose to leave C++ behind.


I've been using C++ for around 20 years now. I've spent a lot of time learning the language, although I don't know any of C++20 yet.

The issue with C++ is that the 'proper' way to do something almost always requires an encyclopedic knowledge of the language; any other solution you come up with will likely be wrong or at the very least suboptimal for esoteric reasons. It's not that people aren't smart enough, it's that there's not enough time to really learn the language. I know I've sacrificed sleep and social life just doing deep dives into the language to really understand it. The reason I don't know C++20 yet is because I'm not willing to make those personal sacrifices anymore, and even if we could use C++20 where I work we don't have infinite time to learn it before being productive.


This. That encyclopedic knowledge of gotchas and workarounds and such is why people hate C++ so much. It’s far beyond tolerable. If you were to restrict everything to just C++20 (force shared_ptr, etc.) it would still be something only battle-hardened adventurers would pursue. Languages like Go and Nim and the almighty Rust try to take away complexity in exchange for a bit of convention and provide safety for us mortals who just want to get an idea across.

For those that are weathering the storm in C++, I commend you, but you’re going to need therapy. (joke but the effects of stress are real).


C++17 and especially C++20 are reasonable programming languages that you rarely have to fight or work around. Things have become much simpler. Anything prior to C++11 was sufficiently awful to work in that I used other languages. They really resurrected the language with C++11.

It can be messy, because of backward compatibility, but a lot of newer code bases start with idiomatic C++17. The standard library is the primary place you'll still run into old style C++.


I would love to see one such code base, that isn't my own hobby coding.

The libraries we get to plug into our infrastructures show another reality.

Or in a better way, see the usual talks about the state of teaching C++, and the quixotic endeavours to improve it.


> For those that are weathering the storm in C++, I commend you, but you’re going to need therapy. (joke but the effects of stress are real).

Every new release of C++ tends to reduce, not increase, my stress. Many of these patterns, including this one, just largely stop being things entirely. Rule of zero and all that.


I somehow feel Rust is even more complex than C++

For C++ I've settled on C++17/C++20; they're still complex, but not that bad.

Go might be simpler but C++ can do whatever Go does and I tried to master both, it did not work, so I had to pick one and that one has to be c++ for my use cases.


Getting around Rust’s syntax is pretty much the whole battle.


Is Rust syntax really that different? I've only written a little C++, but it seems pretty similar to me. They mostly just switched from `type x = ...` to `let x: type = ...` and let you return stuff by putting an expression at the end of a block, right?


Wait until you need a heap-allocated, reference-counted mutex. Or have a struct that outlives its creator.

&mut Arc<Mutex<Option<Box<MyStruct>>>>


Right, but while that's a bunch of stuff the syntax here isn't intimidating.

It's a mutable reference to an Arc of a Mutex of an Option of a Box of MyStruct.

You presumably could build yourself one of these in C++ although since the standard mutex in C++ can't actually protect anything (you're supposed to just remember to lock it anywhere you need to, you know, because C++ is a language for people who never make mistakes) you would need to build that part yourself.

And if you did build that yourself in C++ the syntax for the type would look rather similar, although I guess C++ allows you to just say auto meaning, "I dunno, guess" for the return type in a function signature, whereas Rust insists you actually write return types rather than leave them for the compiler to figure out.
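A minimal C++ sketch of the "build that part yourself" idea, assuming C++14 or later. `Guarded` and `with_lock` are hypothetical names, not standard library types; the only point is that the value is private to the wrapper, so ordinary code cannot reach it without taking the lock.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical wrapper (not a standard type): the value is private, so the
// only way to reach it is through a callback that runs while the lock is held.
template <typename T>
class Guarded {
public:
  explicit Guarded(T initial) : value_(std::move(initial)) {}

  // Locks the mutex, calls f with a reference to the value, and unlocks when
  // the lock_guard leaves scope (including when f throws).
  template <typename F>
  decltype(auto) with_lock(F&& f) {
    std::lock_guard<std::mutex> lock(mutex_);
    return std::forward<F>(f)(value_);
  }

private:
  std::mutex mutex_;
  T value_;
};

// Example use: a counter that cannot be incremented without the lock.
inline int bump_twice(Guarded<int>& counter) {
  const int before = counter.with_lock([](int& n) { return n; });
  counter.with_lock([](int& n) { ++n; });
  const int after = counter.with_lock([](int& n) { return ++n; });
  assert(after == before + 2);  // holds when no other thread uses counter
  return after;
}
```

Unlike the Rust version, nothing here stops a callback from stashing the reference and using it after the lock is released, so this narrows the mistake rather than ruling it out.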


And in Go, it's inferred and you don't have to bother repeating yourself.


Rust deliberately chooses not to infer the return type in a function signature.

Inferring return types is convenient for the person writing the function, but adds cognitive overhead for people using the function and makes it easier to introduce incompatible changes without realising (the signature you wrote is unchanged but a different signature is now inferred)


Not arguing that. Indeed forcing type signatures helps the reader at the expense of the writer (who has the burden of documentation? S/He who writes the types).

I’m saying, to a person with less skills than I, it’s an almost insurmountable undertaking to learn Rust while Go at least seems somewhat familiar. They aren’t teaching these in universities I visit so new engineers might have a tough time grokking the syntax.

I prefer Go for two reasons. 1) It’s faster for me to get my idea across (this is my personal hurdle) and 2) Go’s concurrency model.

What I would like to try in Rust: rewriting my game engine. Building a hobby OS. Building a hobby programming language. These are areas where Go isn’t the best tool.


The return type might be inferred, but you're definitely supposed to remember locks. Nothing in the language will prevent you from accessing a protected resource without locking the lock.

In Rust that can be made impossible.


Making a locking thread safe slice or map is as easy as creating a struct with two fields and three member functions… then you can access it and it will lock itself and unlock itself.


From a Rust point of view, if you have to write these signatures, something may be off with your architecture. I.e. this is a thing that needs to be mutated from multiple threads, abstracted by a virtual table, and be nullable. And for some reason, this is an exclusive borrow of such a pointer. It’s likely that removing any of these constraints would make a positive impact on the architecture.


Indeed. It was just an example I’ve seen in the wild. People tend to try to make the new toy work like the old toy.


And the borrow checker


Hopefully, if you are transitioning from writing C++ to Rust the basic ideas the borrow checker is enforcing feel pretty familiar. Shared mutable references and references which outlive the thing borrowed are bad ideas in C++ too. You should avoid them, and be careful to use a safe pattern to handle them when/if they become necessary, Rust just has the compiler looking over your shoulder instead of an implausibly good code reviewer.


Rc<RefCell<>> and .clone() calls everywhere for most GUI applications.


When I was "seriously" using C++ I created myself a bunch of flashcards (using the Quizlet app) to regularly test myself (spaced repetition) on various C++ idioms and rules; such as when to implement which copy/assignment/move constructors and the correct patterns, how to use std::forward correctly, the footguns in various types of casting or initialization, etc.

I recently had cause to start using C++ again and went over the cards after about a year break. It was depressing how few I could answer correctly.

C++ is great if you're great with it, but that takes a __lot__ of effort, and it can be a minefield otherwise.


I have a notebook I filled with Q&As for every point in The C++ Programming Language that surprised me. It's about 40 pages I occasionally read through, though I've been meaning to convert it to spaced repetition.


My opinion is similar to others in this thread. The complexity you are required to manage in order to push the limits of C++ makes doing so not worth it. There are far too many ways it can bite you, and the benefits you gain by moving beyond a small subset of C++'s capabilities were not worth the hassle for me.

I really wanted them to be worth it, and spent years trying to make it worth it. I guess in the long run it was worth the time spent, because it caused me to explore other possibilities that I might not otherwise have considered (in particular, Lisp and Scheme-like languages, which I doubt I would have otherwise given a second look -- they were just "weird languages that we spent a couple of weeks looking at in college - and oh, yeah what's up with all those freaking parentheses?").


> The complexity you are required to manage in order to push the limits of C++ makes doing so not worth it.

Sometimes the point is just having it possible. Sometimes I need to get that last bit of performance out of some code.

> (in particular, Lisp and Scheme-like languages

If your performance requirements are more in the scripted range then C++ will seem like overkill. I often use python for quick and dirty tools, however every other time I end up rewriting things in C++ because something that should finish almost instantly takes half an hour.


> Sometimes the point is just having it possible. Sometimes I need to get that last bit of performance out of some code

I'm talking about complexity beyond using a small subset of C++ features, things like template metaprogramming (but not limited to that).

> If your performance requirements are more in the scripted range then C++ will seem like overkill. I often use python for quick and dirty tools, however every other time I end up rewriting things in C++ because something that should finish almost instantly takes half an hour.

I have a wide range of requirements depending on what I am working on. I can still drop down to a C or C++ level if I need to, but if I do use C++ it will only be a limited subset that minimizes complexity.

Also, you imply that Lisp and Scheme-like languages are limited to performance on par with Python or other so-called "scripting languages". There are languages in that vein that can perform very close to C's performance (Gambit Scheme being one such example). Often instead of rewriting in a lower-level language, you can use optimizations in the language to improve performance. Of course you can also just rewrite smaller portions of code in C or C++ and call that code from another language.

I'm not someone who just dabbled in C++ for a little while and gave up. I have decades of experience with it. I wanted to use it beyond "C plus a little extra", but I found that the further from a C-like subset I went the more incidental complexity I had to deal with, and in the end the benefits I was looking for just never materialized.


I'm in a similar boat to others here. I used to read all of Bjarne's books back in the day. Now I barely keep up with the language, and only update my knowledge of it enough to keep up with work -- and companies usually lag several standards behind. For my side projects, I've switched to C, which has a much smaller cognitive load and allows me to focus more on the actual problem, or garbage-collected languages when I just need something quick.


I like C a _lot_, I just wish it had a more well-established data structure and algorithm collection like C++'s STL to speed up coding, and a better way to do RAII-like resource management. Someone please invent a C+ that can reuse all existing C code and even some C++ code, but is more capable than C and much simpler than C++.


The ideas in the STL were novel enough that when it was originally proposed the C++ committee didn't really "get" it. Generic programming existed since the 1970s, but it wasn't popular until maybe this century.

So there's just no way K&R C gets anything like the STL. Maybe you get stuff like a char* (these days you would write void * but that's not a thing yet) linked list? That does not sound like a better world.


One pattern (not necessarily a good one, but it's definitely used out there) is to define your "generic" code in macros, and then expose a macro such as MAKE_LIST(int) that in this case instantiates a list of int. Then you get a 'type-safe' interface, or at least more type-safe than void*, and also less annoying.

And then you just abuse arrays as much as possible. For a small number of items, e.g. <100 on a modern laptop, that linear and cache-friendly search is going to beat std::map and std::unordered_map and friends.
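A rough sketch of the two ideas in this comment: a macro that stamps out a typed list, backed by a plain array with linear search. `MAKE_LIST`, the generated names, and the capacity of 64 are all hypothetical choices for illustration. It is written so that it compiles as C++; in C you would swap `std::size_t` and `static_cast` for their C equivalents.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: MAKE_LIST(T) defines a fixed-capacity list of T plus
// push and find functions whose names embed the element type.
#define MAKE_LIST(T) \
  struct T##_list { \
    T items[64]; \
    std::size_t count; \
  }; \
  inline bool T##_list_push(T##_list* list, T value) { \
    if (list->count >= 64) return false; /* list is full */ \
    list->items[list->count++] = value; \
    return true; \
  } \
  /* Linear scan: returns the index of value, or -1 if absent. */ \
  inline long T##_list_find(const T##_list* list, T value) { \
    for (std::size_t i = 0; i < list->count; ++i) { \
      if (list->items[i] == value) return static_cast<long>(i); \
    } \
    return -1; \
  }

MAKE_LIST(int)     // defines int_list, int_list_push, int_list_find
MAKE_LIST(double)  // defines double_list, double_list_push, double_list_find
```

Passing a `double_list*` to `int_list_push` is a compile error, which is the "more type-safe than void*" part. Whether the linear scan actually beats a map for a given element size and count is something to measure rather than assume.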


What if the items in the array are bigger than a cache line?


Yes, of course, my statement has many caveats, you get the point.


C++ is one of my favourite languages, yet I second the feeling: sometimes preaching good practices feels like fighting windmills, so I spend most of my time on languages where I don't need to argue about the virtues of bounds checking or exceptions every single day.


Nothing about this problem seems unique to C++ or even made more difficult by C++. In most languages it's a challenge to ensure objects stay in a valid state if an exception is thrown in the middle of a series of mutations. But that's also the type of problem that you very rarely need to tackle. It's not like you're making a transactional system with rollbacks every time you want to go poke a few swing gui elements, after all.


Right. But an object being in an inconsistent/invalid state can have more serious consequences in C++ than in other languages, due to the weak typing and manual-by-default memory management. In other languages, the programs may be just as incorrect, but the consequences, when the fault arises, tend to be less fatal. Of course, such a fault can in principle still lead to lost or corrupted data, but that tends to be less visible than a segfault.


That'd only be the case if you're not using things like unique_ptr / shared_ptr. Or if you have a really funky entangled ownership model such that a split-copy ends up leaving you in a confused ownership model such that the destructor doesn't do the right thing. And you didn't catch the exception to patch that up before propagating it. That seems like a bit of a stretch to come up with hitting such a scenario, and if you do it doesn't seem like a copy & swap idiom is really the solution.


My sane way to use (non-modern) C++:

- Use simple POD-like structs as much as possible.

- Don’t be afraid of creating simple functions. Not everything has to be a method.

- Only use virtual functions to create pure interfaces, don’t use them for anything more complex

- Don’t use constructors for things more complex than simply initializing fields. For that, use factory functions or separate Init() methods

- Make use of RAII as minimal as possible. Opt for batch construction/deletion instead of RAII-like “single” construction/deletion.

- Turn off exceptions, exit early on error or return an error code/error object if recovery needs to happen. (Related to why you shouldn’t use constructors)

- As a bonus, turn off RTTI. Don’t rely on dynamic_cast, you’re doing something wrong.

- Allocate arrays of objects instead of unique_ptrs/shared_ptrs. Use indices instead of pointers as object handles. When objects are frequently created/deleted and indices frequently get invalidated, use generational indices.

- Ownership should be centralized instead of distributed. A central store should contain arrays of various kinds of objects, and it should take care of most of the memory management (as opposed to the “modern C++” way of scattering objects on heap memory via unique_ptr/shared_ptr as much as possible).
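For the "generational indices" bullet, here is one possible shape of the technique, as a sketch only. `Handle` and `Pool` are hypothetical names, and the element type is assumed to be default-constructible so that slots can sit in a plain `std::vector`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// A handle is an index plus the generation the slot had when it was issued.
struct Handle {
  std::uint32_t index;
  std::uint32_t generation;
};

// Hypothetical central store: objects live in one array and are referred to
// by Handle instead of by pointer.
template <typename T>
class Pool {
public:
  // Reuses a free slot if one exists, otherwise appends a new one.
  Handle create(T value) {
    std::uint32_t index;
    if (!free_.empty()) {
      index = free_.back();
      free_.pop_back();
    } else {
      index = static_cast<std::uint32_t>(slots_.size());
      slots_.push_back(Slot{});
    }
    slots_[index].value = std::move(value);
    slots_[index].alive = true;
    return Handle{index, slots_[index].generation};
  }

  // Bumps the generation so every outstanding handle to this slot goes stale.
  void destroy(Handle h) {
    if (get(h) == nullptr) return;
    slots_[h.index].alive = false;
    ++slots_[h.index].generation;
    free_.push_back(h.index);
  }

  // Returns nullptr for stale or out-of-range handles.
  T* get(Handle h) {
    if (h.index >= slots_.size()) return nullptr;
    Slot& s = slots_[h.index];
    if (!s.alive || s.generation != h.generation) return nullptr;
    return &s.value;
  }

private:
  struct Slot {
    T value{};
    std::uint32_t generation = 0;
    bool alive = false;
  };
  std::vector<Slot> slots_;
  std::vector<std::uint32_t> free_;
};
```

A stale handle is detected at lookup time instead of dereferencing freed memory. Pointers returned by `get` are still invalidated when the vector grows, so callers should hold handles, not pointers.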


I loved C++, for a while. I chose to leave its grammar's complexity behind too.


Idioms are language warts. They're the result of the language not being expressive enough to solve a certain problem. An idiom basically means "memorize this thing and use it next time you see this problem". The "overload" idiom might be the latest example: when using std::visit(), every time I have to look up and paste in some variadic using thingy with a deduction guide. I'm with your fellow software developers, because I don't memorize any of them and I don't think memorizing them is a mark of seniority. I hope C++ doesn't need any more idioms in the future.
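For readers who have not met it, the "variadic using thingy with a deduction guide" usually looks like the sketch below (C++17). `overloaded` is the conventional name for the helper; `Value` and `describe` are hypothetical names added for the example.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Inherit from every lambda and pull all their operator() into one overload set.
template <typename... Fs>
struct overloaded : Fs... {
  using Fs::operator()...;
};
// Deduction guide: required in C++17, no longer needed for aggregates in C++20.
template <typename... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

using Value = std::variant<int, double, std::string>;  // hypothetical alias

// std::visit picks the lambda whose parameter matches the active alternative.
inline std::string describe(const Value& v) {
  return std::visit(overloaded{
      [](int i) { return "int " + std::to_string(i); },
      [](double) { return std::string("double"); },
      [](const std::string& s) { return "string " + s; }
  }, v);
}
```

If an alternative has no matching lambda the call fails to compile, which is most of the reason people reach for this helper over a chain of `std::get_if` checks.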


It's absolutely true that C++ is too big to keep in one's head, that doesn't mean people can't or don't care to do excellent programming in it. A good team can create quality products in any language. It's possible that your experience has the slight bias that teams that are already effective in C++ don't hire consultants to teach them C++ courses.


Less is more is what most code should strive for these days. Hand-written copy/move ctors and assignment ops should be rare, and are often a single-responsibility violation otherwise. Code like optional or smart pointers, sure, they are fair game for that, but business logic generally shouldn't bother.


This idiom is largely retired in modern C++.

It's generally better to simply not implement operator= and take the compiler generated default implementation for both copy and move assignment. Then design your class to have semantics such that member-wise assignment does the Right Thing (tm). C style. If you want an idiom for this it's generally called the 'Rule of Zero'

You get swap() for free since std::swap will do the optimal thing.
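A small sketch of what this looks like in practice. `Document` is a hypothetical class; its members already know how to copy, move and clean up after themselves, so the class declares no destructor, copy/move constructor, or `operator=`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Rule of Zero: no special member functions are written here.
struct Document {
  std::string title;
  std::vector<std::string> lines;
};

// The compiler generates copy and move operations from the members.
static_assert(std::is_copy_assignable<Document>::value, "copyable");
static_assert(std::is_nothrow_move_constructible<Document>::value, "movable");

inline Document example() {
  Document a{"draft", {"one", "two", "three"}};
  Document b;
  b = a;                      // generated member-wise copy assignment
  assert(b.lines == a.lines);
  Document c = std::move(a);  // generated member-wise move construction
  std::swap(b, c);            // generic std::swap, built on the generated moves
  return b;
}
```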


indeed 'rule of 0' made many idioms go away. just use the standard containers along with smart-pointers for 99% use cases and your c++ life suddenly becomes 99% easier, those come with RAII for free.


Most of the time. In container like code, copy/swap gives easy strong exception guarantee, but that is rarely needed
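For reference, a minimal sketch of the idiom under discussion. `Buffer` is a hypothetical hand-written owning class used only to show the shape; in real code `std::vector<int>` would replace all of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {
public:
  explicit Buffer(std::size_t n) : size_(n), data_(n ? new int[n]() : nullptr) {}
  Buffer(const Buffer& other)
      : size_(other.size_), data_(other.size_ ? new int[other.size_] : nullptr) {
    std::copy(other.data_, other.data_ + size_, data_);
  }
  // Move construction: start empty, then steal the other buffer's storage.
  Buffer(Buffer&& other) noexcept : size_(0), data_(nullptr) { swap(*this, other); }
  ~Buffer() { delete[] data_; }

  // The by-value parameter is the copy (or move). If making it throws, that
  // happens before this function runs, so *this is untouched: the strong
  // exception guarantee. Self-assignment needs no special case either.
  Buffer& operator=(Buffer other) noexcept {
    swap(*this, other);
    return *this;  // other's destructor releases the old storage
  }

  friend void swap(Buffer& a, Buffer& b) noexcept {
    using std::swap;
    swap(a.size_, b.size_);
    swap(a.data_, b.data_);
  }

  std::size_t size() const { return size_; }
  int& operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }

private:
  std::size_t size_;
  int* data_;
};
```

The price is that assignment always allocates a fresh copy, even when the existing storage could have been reused.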


Achieving exception safety while copying containers, specifically, is generally simple because their states tend to be very easy to reason about. For example, making sure you destroy elements 0 thru [n-1] in a vector if inserting element [n] into it fails.

Move, and thus swap (since it's 3 moves), should essentially always be noexcept (never throw)


Rule of Zero depends on other classes using rule of five.


I have managed to convince myself to think C++ is actually a DIY Tool Center that offers me an abundant set of tools I can choose from, upon need (project-wise).

This way I know I don't need to learn the whole language at this particular time; only the necessary parts I can use to produce the desired output, which really helps me control my stress.


Post C++11, it is probably better to use the "copy and move" idiom.

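    MyClass& operator=(const MyClass& other) {
      // Copy into a temporary, then hand off to the move assignment operator.
      // MyClass must declare one, otherwise this line calls itself forever.
      return (*this) = MyClass(other);
    }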


In any version of the language it's better to decompose it so that you never implement any of these. You either have:

1. A class that's responsible for managing exactly one instance of a resource (e.g. FileDescriptor). This class defines whether or not a resource is copyable and/or movable.

2. A dynamically sized container that needs to offer certain kinds of guarantees.

3. A composite class that may contain zero or more instances of 1, 2, and/or 3.

The vast majority of classes you write are always going to be 3. Option 3 should NEVER specify move/copy constructors/assignment operators and instead should inherit that from whatever is being stored.

Don't do #2 unless you really know what you're doing. Prefer to use existing well-behaved containers from STL. If you need 3p ones, Folly and Boost are probably good ones. I don't know how ABSEIL fares around exception safety when exceptions are enabled so YMMV there if you're doing things in an exception context. It can be hard to properly make sure that you inherit the copyability / movability from the underlying type (& also noexcept inheritance).

For #1, KISS is a very good principle. Just make sure you always leave resources in a consistent state. Move helps here a lot but can be emulated with copy + swap pre-C++11. Here's an example of what a copyable and movable resource might look like:

    // Terrible idea - don't actually do this in practice.
    #include <unistd.h>  // dup, close
    #include <utility>   // std::exchange, std::move, std::swap

    class AutoDupOnCopyFd {
    public:
      AutoDupOnCopyFd(const AutoDupOnCopyFd& copy): _fd(dup(copy._fd)) {
        if (_fd == -1) { throw ... }
      }
      // Take ownership and leave the source holding -1 (no descriptor).
      AutoDupOnCopyFd(AutoDupOnCopyFd&& owned) noexcept: _fd(std::exchange(owned._fd, -1)) {}
      ~AutoDupOnCopyFd() noexcept { if (_fd != -1) close(_fd); }

      AutoDupOnCopyFd& operator=(const AutoDupOnCopyFd& other) {
        // No need to check for assignment to self but you can if you think you're likely to have this happen
        // to avoid the syscall (copy to self is typically rare)
        AutoDupOnCopyFd copy(other);
        *this = std::move(copy);  // hand off to the noexcept move assignment below
        return *this;
      }

      AutoDupOnCopyFd& operator=(AutoDupOnCopyFd&& other) noexcept {
        using std::swap;
        swap(_fd, other._fd);
        return *this;
      }
    private:
      int _fd;
    };
Now `AutoDupOnCopyFd` can be nested in any composite class & the composite class will automatically inherit the right set of copy / move semantics with correct exception safety. The rule of 0 really is a powerful concept. This breaks down if you start trying to make 1 class responsible for multiple resources. Don't do that. Use resource classes + containers/composite classes to do that.
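The claim that a composite inherits the copy/move semantics of its members can be checked at compile time. The sketch below uses a hypothetical move-only `UniqueId` as a stand-in for a category 1 resource class (the "resource" is just an int, so the example stays portable) and a hypothetical `Connection` as the category 3 composite.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Stand-in for a resource class: movable, not copyable.
class UniqueId {
public:
  explicit UniqueId(int id) : id_(id) {}
  UniqueId(const UniqueId&) = delete;
  UniqueId& operator=(const UniqueId&) = delete;
  UniqueId(UniqueId&& other) noexcept : id_(std::exchange(other.id_, -1)) {}
  UniqueId& operator=(UniqueId&& other) noexcept {
    std::swap(id_, other.id_);
    return *this;
  }
  int get() const { return id_; }

private:
  int id_;
};

// Composite: declares no special member functions of its own.
struct Connection {
  std::string name;
  UniqueId id;
};

// The composite picks up "movable but not copyable" from its members,
// including the noexcept on the move constructor.
static_assert(!std::is_copy_constructible<Connection>::value, "");
static_assert(std::is_nothrow_move_constructible<Connection>::value, "");
```

Swapping `UniqueId` for a copyable resource class would flip the first assertion without touching `Connection` at all.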

You can also (ab)use unique_ptr for unique ownership although I still prefer explicitly named classes (no confusion with `->` vs `.`, easier to understand for the vast majority of coders).

It's a powerful concept to internalize to level up your C++ game but it's only applicable to C++'s ownership model. Rust went a more teachable path that doesn't have these foot guns.


> I don't know how ABSEIL fares around exception safety when exceptions are enabled so YMMV there if you're doing things in an exception context

OMG this is my time to shine. I wrote the library for testing Abseil containers for exception-safety. I don't work there anymore but I know they've implemented exception safety tests for a bunch of containers in there, and I remember when I was writing the library we even found a bug in GNU `std::optional`.

https://www.youtube.com/watch?v=XPzHNSUnTc4 is the talk I gave at cppCon about this.

As a side story, that talk was the first one in the morning and I showed up just before it started on account of my Uber driver missing the highway exit. I had planned to grab some food at the conference but didn't have time so that whole talk I was overstimulated on coffee on an empty stomach trying to keep my composure early in the morning.


I'm kind of surprised that ABSEIL even bothers with exception safety since Google builds everything without exceptions. Can you shed some light on why the team decided to invest in making it exception safe?


Because many of the OSS projects using Abseil do use exceptions and that code, as you pointed out, was completely untested with exceptions on.


Copy-and-swap/move is fun and all, but what they don't point out to you about it is that it prevents the base class operator= from being called when there's inheritance (because it's bypassing operator= entirely).

i.e. it might be fine at the app-level since you might know all your callers in that case, but you shouldn't do it in a library that others might want to use, or your code will behave erratically.



