> it’s completely bonkers to expect the average user to build an overloaded callable object with recursive templates just to see if the thing they’re looking at holds an int or a string.
And now you get neither exhaustive checking nor type-safe unwrapping. At this point, is there really a point to variants? You may as well be using the old enum+union.
(get_if at least nets you type-safe unwrapping similar to `if let` in Swift or Rust, though it returns a pointer rather than a reference)
You either want exhaustive checking or "just [want] to see if the thing they’re looking at holds an int or a string". OP's comment was about the latter, and holds_alternative and get accomplish that.
FWIW, std::get is type-safe in that you cannot specify a type outside of the variant types. It's safe at runtime in that it will throw std::bad_variant_access if the active object doesn't match the type.
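A minimal sketch of the non-throwing route, for comparison. The `Value` alias and the `describe` function are names made up for illustration; `std::get_if`, `std::holds_alternative` and `std::get` are the standard C++17 calls discussed above.

```cpp
#include <string>
#include <variant>

using Value = std::variant<int, std::string>;  // hypothetical alias for illustration

// Returns a short description without ever reading the wrong alternative.
std::string describe(const Value& v) {
    // std::get_if yields a pointer to the int, or nullptr if v holds something else.
    if (const int* n = std::get_if<int>(&v)) {
        return "int: " + std::to_string(*n);
    }
    // std::holds_alternative is a plain boolean check, so std::get cannot throw here.
    if (std::holds_alternative<std::string>(v)) {
        return "string: " + std::get<std::string>(v);
    }
    return "unknown";  // only reachable if the variant is valueless_by_exception
}
```

Neither branch is checked for exhaustiveness: adding a third alternative to `Value` still compiles and silently falls through to "unknown".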
And because of this, you think we may as well use enum+union? Even if you only plan on manually type switching on a variant, std::variant saves you from a lot of boilerplate.
Not the same as what the article is trying to accomplish, which is an exhaustive match (i.e., any unmatched value is guaranteed to fail at compile time).
For example, consider a parser that matches on tokens. If you add a new token, the match should fail, because you want to guarantee, at compile-time, that every possible case is handled.
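To make the token example concrete, here is a rough sketch using std::visit with a hand-written visitor struct. The token types (`Number`, `Plus`, `Ident`) and the `name_of` function are invented for illustration and are not taken from the article.

```cpp
#include <string>
#include <variant>

// Hypothetical token types for a tiny parser.
struct Number { int value; };
struct Plus {};
struct Ident { std::string name; };

using Token = std::variant<Number, Plus, Ident>;

// One overload per alternative. std::visit has to be able to call the visitor
// with every alternative, so a missing overload is a compile error.
struct TokenName {
    std::string operator()(const Number& n) const { return "number " + std::to_string(n.value); }
    std::string operator()(const Plus&) const { return "plus"; }
    std::string operator()(const Ident& i) const { return "identifier " + i.name; }
};

std::string name_of(const Token& t) { return std::visit(TokenName{}, t); }
```

If a `Minus` struct is added to `Token` without a matching `operator()`, `name_of` stops compiling, which is the guarantee being asked for here.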
This is one reason that the lack of sum types in Go is so painful, to the point that someone wrote a special library for it [1].
I'm familiar with the approach from other languages that the article tries to import into C++ before concluding that the designers should watch out for language envy (incidentally, Stroustrup wrote a paper on efficient type matches: http://stroustrup.com/OpenPatternMatching.pdf ). The author picks a particular approach and then shows that the approach isn't satisfactory. But the author did not go back to try another approach.
If the visitor approach is acceptable but holds_alternative is not, get<> also accepts index values (as numeric template parameters, e.g., get<0>(v), get<1>(v), etc.) and variant has a method called index() to give a numeric value saying what the current type is. This is easy enough to use in a switch statement:
std::variant<int, double, std::string> v;
...
switch (v.index()) {
case 0:
std::printf("%d\n", std::get<0>(v));
break;
case 1:
std::printf("%f\n", std::get<1>(v));
break;
case 2:
std::puts(std::get<2>(v).c_str());
break;
}
In this case you can add a default: branch and throw an error for unhandled types, or you can hope that the compiler issues a warning about a missing case (index is constexpr, so it's possible for the compiler to know that you've missed something). It's not as good as a compile error, but it might be good enough.
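One way to get a little closer to a compile error with the switch approach is a static_assert on the number of alternatives. This is only a sketch: the `Value` alias and `to_text` are hypothetical names, and the assert catches a changed alternative count, not a reordering.

```cpp
#include <stdexcept>
#include <string>
#include <variant>

using Value = std::variant<int, double, std::string>;

std::string to_text(const Value& v) {
    // Stops compiling when an alternative is added or removed,
    // as a reminder to revisit the switch below.
    static_assert(std::variant_size_v<Value> == 3, "update the switch in to_text");
    switch (v.index()) {
    case 0: return std::to_string(std::get<0>(v));
    case 1: return std::to_string(std::get<1>(v));
    case 2: return std::get<2>(v);
    default: throw std::logic_error("unhandled alternative");  // e.g. valueless_by_exception
    }
}
```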
While I think it's just an honest oversight by the author (C++17 is just really new), I still think it's funny that a huge portion of the entire article, and by extension the argument against variants, is rendered moot by RTFM.
I think the author's goal was quite obviously to have strong sum types with compile-time safety and branch resolution. In other words, to make it impossible to use the value stored in the variant in a manner inconsistent with its type because doing otherwise is a compile-time error.
The above parts of the standard library don't help in achieving this goal.
> The fact that we still handle dependencies in 2017 by literally copy-pasting files into each other with #include macros is obscene.
Historically, C++ used includes so that it could be compatible with C. In the future, modules can be used which avoid many of the problems with includes [0].
Is this implemented yet? The paper you linked is a draft. I'm glad C++ is doing this.
> Historically, C++ used includes so that it could be compatible with C.
I think it is more the case that C++ slowly splintered off C and never broke free completely of #include. Rust is also quite compatible with C without supporting anything like #include. It even has a module system!
You are talking about limited binary compatibility instead of source compatibility. These are two different things and can't be compared to one another.
I agree completely with everything in this article. But in addition, I think std::variant also misses the point of sum types in a big way.
Sum types don't just store values of different types. They store different states, with associated data. So, for instance, consider the following simplistic expression AST; how would you store it in a std::variant?
The C++17 equivalent would be something like the following (not tested):
using NumberExpr = int;
using VarExpr = std::string;
struct AddExpr;
using Expr = std::variant<NumberExpr, AddExpr, VarExpr>;
struct AddExpr {
std::unique_ptr<Expr> a;
std::unique_ptr<Expr> b;
};
Of course, this being C++, you need forward declarations and a firm grasp of the rules of incomplete types to be confident about declaring a simple AST type.
To completely address JoshTriplett's point, yes, you can just define another struct for the SubExpr variant to disambiguate it from the AddExpr case.
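A sketch of what that disambiguation could look like, with an evaluator on top. Everything here is an assumption for illustration: the variable case is dropped to keep it short, and `add`, `sub`, `eval` and `Evaluator` are made-up helper names.

```cpp
#include <memory>
#include <utility>
#include <variant>

struct AddExpr;
struct SubExpr;
using Expr = std::variant<int, AddExpr, SubExpr>;

// Same payload, different meaning: the struct name acts as the discriminant.
struct AddExpr { std::unique_ptr<Expr> lhs, rhs; };
struct SubExpr { std::unique_ptr<Expr> lhs, rhs; };

// Small hypothetical helpers for building nodes.
Expr add(Expr a, Expr b) {
    return AddExpr{std::make_unique<Expr>(std::move(a)), std::make_unique<Expr>(std::move(b))};
}
Expr sub(Expr a, Expr b) {
    return SubExpr{std::make_unique<Expr>(std::move(a)), std::make_unique<Expr>(std::move(b))};
}

int eval(const Expr& e);

// One overload per alternative; a missing one fails to compile inside std::visit.
struct Evaluator {
    int operator()(int n) const { return n; }
    int operator()(const AddExpr& a) const { return eval(*a.lhs) + eval(*a.rhs); }
    int operator()(const SubExpr& s) const { return eval(*s.lhs) - eval(*s.rhs); }
};

int eval(const Expr& e) { return std::visit(Evaluator{}, e); }
```

With these helpers, `eval(sub(add(1, 2), 5))` evaluates (1 + 2) - 5.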
Requiring this kind of wrapping is awkward compared to e.g. Rust or Haskell's treatment of sum types, which unlike C++17 and std::visit both have powerful pattern matching features built into the language. Saying this as someone who writes C++ all day: std::visit and std::variant are weaksauce.
On the other hand, the C++ way gives you an actual type for each element of the sum; you can write a function which only takes AddExpr. The Rust way doesn't (yet).
Given that you have to define those types manually, I don't see why you couldn't do the same in Rust; it just doesn't force you to if you don't need it
That's rather disingenuous, because Haskell forces you to use `newtype` wrappers for lots of things you shouldn't need them for and don't need them for in C++.
Can you name an instance in Haskell where newtype is conceptually unnecessary but required by the language? In the sense that you may be able to derive the same set of logical guarantees that newtype gets you without using it, in principle.
> Sum types don't just store values of different types.
That's exactly what a tagged union does. std::variant is a tagged union. I was not aware of the name 'sum type' but it's supposed to be a synonym for a tagged union. Guess not, but std::variant is not meant as what you describe[1].
> That's exactly what a tagged union does. std::variant is a tagged union.
Not exactly. std::variant is one particular type of a tagged union, where the only discriminant is the type. You can also have a tagged union where the discriminant determines some semantic state, and multiple such states may store the same type of value. That's still a sum type, still a tagged union, and not something std::variant can do.
In its terminology, std::variant is a 'discriminated union' instead of a 'sum type'. Making std::variant the latter would also have had disadvantages, it seems.
Reading the article, it seems the committee just doesn't want to change the actual syntax of the language since C++11. The only reason I can think of is that it's easier for compiler vendors to update the STL and minor syntax changes rather than adding something like described in parent.
The reason why we have std::visit the way it is, is because that's the way it worked in Boost Variant, which is the basis for the proposal.
The reason why it's the basis, is because it's a time-tested, proven and stable solution that has been around since 2002.
The reason why it's so ugly, is because that's the best you could do in C++ back in 2002.
So, there's a perfectly rational explanation for all this - it's not "insane". It is unfortunate that they didn't come up with a better API that would make use of new language features, but it's not like someone deliberately sat down to design the more convoluted older API just to confuse people.
Sum types can actually be implemented quite effectively using X-macros in both C and C++. In fact, I feel like they're simpler and more intuitive than this variant stuff.
Edit: Let's use the same example.
#define SETTINGS \
X(string, str) \
X(int, num) \
X(bool, b)
struct Setting {
union {
#define X(type, name) type name;
SETTINGS
#undef X
};
enum Type {
#define X(type, ...) t_ ## type,
SETTINGS
#undef X
};
Type tag;
};
Printing settings like in the example becomes this:
void printSettings(const Setting& s) {
switch(s.tag) {
case t_string: printf("A string: %s\n", s.str.c_str()); break;
case t_int: printf("An integer: %d\n", s.num); break;
case t_bool: printf("A boolean: %d\n", s.b); break;
}
}
We can also load more things into the x-macro, so it's possible to define the switch cases above just like in the structure definition. We could add a third parameter called full_name:
void printSettings(const Setting& s) {
switch(s.tag) {
#define X(type, name, full_name) \
case t_ ## type: std::cout << "A " full_name ":" << s.name << "\n"; break;
SETTINGS
#undef X
}
}
Disclaimer: I have not compiled or run any of this code.
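For what it's worth, here is a variation that should compile as C++. It is a sketch under assumptions that differ from the snippets above: the string member is a `const char*` so the union stays trivial (a std::string member would need manual construction and destruction), the tag names are derived from the member name rather than the type, and the third `label` parameter plus the `describe` function are made up for illustration.

```cpp
#include <sstream>
#include <string>

// Each entry: member type, member name, human-readable label (label is hypothetical).
#define SETTINGS \
    X(const char*, str, "a string") \
    X(int, num, "an integer") \
    X(bool, flag, "a boolean")

struct Setting {
    enum Tag {
#define X(type, name, label) t_##name,
        SETTINGS
#undef X
    };
    Tag tag;
    union {
#define X(type, name, label) type name;
        SETTINGS
#undef X
    };
};

// The switch cases are generated from the same list, so they cannot drift apart.
std::string describe(const Setting& s) {
    std::ostringstream os;
    switch (s.tag) {
#define X(type, name, label) \
    case Setting::t_##name: os << "Holds " << label << ": " << s.name; break;
        SETTINGS
#undef X
    }
    return os.str();
}
```

Nothing here stops code from writing `s.num` while `s.tag` says `t_str`; keeping the two in sync is still the caller's job.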
Yes, which is why I would always use the second example for the print function.
Edit: In fact it is possible to handle the cases explicitly while enforcing types by having the X-macros call functions in the switch/case and declaring prototypes via the macros:
Halfway into the article, I felt like crying. Is it just me or is the standards committee actively trying to reduce the number of existing C++ programmers? I think we are better off with boost than learning this new stuff. I hope the people in the standards committee will lose their C#/Java/<insert cool language> envy and be more selective in what they want to add to standards.
Which is why no one uses it. The good thing about boost is you get to evaluate features based on merit and be choosy. Putting these in the standard creates an expectation that developers be aware of how to use them.
I used it together with type() to read the variant's current type and act on that.
The std::variant has a bit nicer API compared to boost with e.g holds_alternative. I expect I would create a make_visitor wrapper myself or use an open source one if needed, but it wasn't needed.
Straight C is ridiculously hard to write at any scale with any sanity. Not to mention it requires re-inventing stuff that you're basically guaranteed to get wrong (such as ref counting), which of course none of the libraries you want to use will support, so you have to wrap that up in something else.
C++ is a massive improvement over straight C in pretty much every practical way. And best part is if you don't want to use awkward, heavyweight abstractions like std::variant then you can just not use them and be no worse off for it.
I used to think this until I worked on a large C project at scale.
C++ definitely has some improvements over C, but it definitely does not have a "massive improvement over straight C in pretty much every practical way". It's pretty bad that you have to avoid many parts of the language.
I've also noticed from various projects that C projects tend to have less code to grok than C++ projects while achieving the same thing. Who would've thought.
Because different languages have different goals, and most languages don't need the flexibility and power of many of C++'s features, and favor an easier language while sacrificing some of these extra language tools.
Or you could just skip the part after the author went "ugh, I have to write a visitor, that's like, so many characters to type." He's whining over a fairly minor point and jumping through way more hoops than necessary. The proposed "default" way to visit std::variant is honestly pretty clear and simple in comparison to most C++ code.
Why would you teach someone C++, a language that has a lot of problems because it started as C with some sugar on top and then against all odds went ahead and added a ton of things that are very un-C-ish while still supporting all the C-ishness it had (has)?
Teach new languages to new programmers. Then when they have firm grasp on these concepts, and for some cruel trick of fate they have to do C++ development, then they can look this up.
There is about the same amount of runtime checks. In the auto-lambda+constexpr case the switch/if-cascade is inside visit and it is required to dispatch the correct type to the visitor.
I don't know why the author is so disparaging of if constexpr; I think it's going to push c++ metaprogramming in a direction that makes the more powerful aspects of the language easier to understand for beginners.
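For reference, the auto-lambda plus if constexpr style being discussed looks roughly like this. It is a sketch, not the article's code: `Value`, `label` and the `always_false_v` helper are names chosen here for illustration.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Helper so the static_assert below only fires when that branch is instantiated.
template <class>
inline constexpr bool always_false_v = false;

using Value = std::variant<int, std::string>;

std::string label(const Value& v) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, int>) {
            return "int " + std::to_string(x);
        } else if constexpr (std::is_same_v<T, std::string>) {
            return "string " + x;
        } else {
            // Reached only if Value gains an alternative that is not handled above.
            static_assert(always_false_v<T>, "non-exhaustive visitor");
        }
    }, v);
}
```

The branch selection inside the lambda happens at compile time; the runtime dispatch on the active alternative is done by std::visit itself.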
The struct solution is a design pattern for creating a sum type. The lambda solution is a factory for creating any sum type as long as the parameter in each lambda is unique.
Among other things, it's a design with weaker coupling.
Seems like a design pattern from Java which often are badly converted functional programming snippets into OOP. I once loved C++, now I can't force myself to read the code anymore... Please add some monads there to destroy my hope for humanity once for all!
I think maintainers of C++ have a case of "functional programming" envy.
They should have FP envy if their language doesn't have a simple and clean looking way of defining that a value is either a T or a U. Inheritance in the normal sense doesn't cut it because that defines an open (for extension) set of subtypes. You can't e.g compile time check that a switch/match has tested all cases.
If you want to make a closed set of subtypes that's usually possible only by writing a class hierarchy with private subtypes and an abstract outer type with factory methods for the inner types. That takes hundreds of lines for even just a couple of variants, and then there is still no compiler help for exhaustive switch/match.
Why should C++ as an imperative/OOP language ape everything possible in FP? It feels like when FP languages slap on OOP for a change. C++ wasn't designed with what we call FP these days in mind, so unless it's going to change drastically, there will always be "smell".
FP doesn't have many things C++/D etc. have. Should we insist on every FP to have assembly-level access, custom memory managers etc. because they are cool as well?
It doesn't need to ape everything possible in every FP language. Just sum types. To me, "closed type hierarchies" are a pretty fundamental thing in programming regardless of how they are represented (OO style subclasses or otherwise), perhaps more important and fundamental than other FP things such as closures/lambdas (which Java, C# and C++ all adopted).
It's already possible to make the class hierarchy in most/all OO languages but it takes 150 lines for what can be expressed in 5 with some help from the language. Most importantly, without lang support the compiler can't validate exhaustive matching.
Sum types are not "everything possible in FP". I'm part of the group that views sums as an extremely basic language feature and I give an extreme side-eye to any language that still doesn't have them in the current year.
> I'm part of the group that views sums as an extremely basic language feature
More importantly it's the second category of type relations. There are product types and there are sum types, if you only have product types, you're missing an entire half of expressible type relations/compositions.
It's not a matter of functional versus imperative, there's nothing inherently functional about sum types.
All OO languages have sum types in a way, but it's just not a very good way. If there is no data or only integer data it's usually called "enum" and if it's a more complex data structure then it's a class hierarchy. It's just really clumsy in most OO languages to type out "type PaymentMethod = Cash | CreditCard(CardDetails d) | Invoice(Address a)"
In OO it's "abstract class PaymentMethod" and then another 100 lines, plus probably a horrible visitor pattern (because you don't want the payment method handling the payment). This to just define the type, no actual payment logic.
Huge amounts of boilerplate, and very little power if the compiler can't ensure exhaustive matching.
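For comparison, the C++17 library version of that PaymentMethod type might look like the sketch below. The field names inside `CardDetails` and `Address` and the `describe` function are invented for illustration.

```cpp
#include <string>
#include <variant>

// Hypothetical payload types, named after the pseudo-declaration above.
struct CardDetails { std::string number; };
struct Address { std::string street; };

struct Cash {};
struct CreditCard { CardDetails d; };
struct Invoice { Address a; };

// The closed set of alternatives is spelled out in one place.
using PaymentMethod = std::variant<Cash, CreditCard, Invoice>;

// The handling lives outside the payment types, one overload per alternative.
struct Describe {
    std::string operator()(const Cash&) const { return "cash"; }
    std::string operator()(const CreditCard& c) const { return "card " + c.d.number; }
    std::string operator()(const Invoice& i) const { return "invoice to " + i.a.street; }
};

std::string describe(const PaymentMethod& p) { return std::visit(Describe{}, p); }
```

The type definition itself is short; the verbosity people complain about is in the visitor, not in the declaration.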
A discriminated union is a low-level pattern from non-OO languages (it is a common pattern in C; Pascal has syntax for it, and IIRC unions are always discriminated in "standard" Pascal).
In object-oriented languages this pattern is mostly unnecessary, because inheritance is usually a better solution for the same problem. The only reason to use something like this in C++ is the (maybe even only perceived) efficiency gained by removing a level of pointer indirection.
The sum type, which the author is talking about, is not the same as "discriminated union" pattern, the latter being the crude implementation of the former.
It really comes from functional languages like ML, and is extremely useful and convenient when combined with pattern matching. In fact, attempts to emulate it in OOP using interfaces and visitor pattern tend to bring a lot of boilerplate, and obscure the actual logic, which is exactly what the author is complaining about.
There's definitely more reason to use it than just "efficiency gain", which is why many languages introduced in the last decade have it built in (e.g. Rust, Scala or Swift).
> The sum type, which the author is talking about, is not the same as "discriminated union" pattern, the latter being the crude implementation of the former.
Could you say something about the difference between the two? I'm not aware what it is.
IIRC discriminated unions are plain C(++) unions with an added type tag, like at the beginning of the article. Brittle, prone to breakage when refactoring, etc. when compared to real sum types.
I'm really curious, I write a lot of python, and find myself wishing for sum types all the time. How do you see inheritance filling the same space? I'd love to be able to do something like:
data Choices = Good | Bad | Ugly
like I can do in haskell, where a value of type `Choices` can only ever be one of those options. But there doesn't seem to be any reasonable approximation in python.
The OOP pattern dfox is referring to would look something like this:
class AbstractChoice:
# defines all common operations on choices
# plus potentially some useful stuff on top of those
pass
class Good(AbstractChoice):
# contains the code specific to good things
pass
class Bad(AbstractChoice):
pass
class Ugly(AbstractChoice):
pass
This isn't a straight-on replacement though. It's especially bad when you want to separate your concerns not along the good-bad-ugly-axis but something different (which is when you'd e.g. go on to use mixins, or that ugly visitor pattern we've seen in the OP).
(Since I'm asking a question, you know there is a gotcha).
Answer: "abc" is const char*, which can be converted to bool and is picked because of the way variant is designed.
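A sketch of how to sidestep the gotcha by naming the alternative explicitly. The `Value` alias and `make_text` are hypothetical names. As far as I know, a later change to the converting-constructor rules (P0608) makes newer toolchains pick std::string for a string literal, so being explicit also avoids depending on which rule your standard library implements.

```cpp
#include <string>
#include <variant>

using Value = std::variant<std::string, bool>;

// Construct the std::string alternative in place instead of relying on
// overload resolution in the converting constructor.
Value make_text(const char* s) {
    return Value{std::in_place_type<std::string>, s};
}
```

Writing `Value v = std::string("abc");` has the same effect, since the argument is then already a std::string.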
So, if you really hate making a struct each time, you drop make_visitor into the project's util.h and never worry about it again. I don't see the big deal.
Besides, actually making a struct every time is not that bad in practice.
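For anyone wondering what would go into util.h, this is roughly it. make_visitor is the helper name from the article; the body below is just one possible C++17 implementation, and `overloaded` and `describe` are names picked for this sketch.

```cpp
#include <string>
#include <utility>
#include <variant>

// Inherits the call operator of every callable passed in.
template <class... Fs>
struct overloaded : Fs... {
    using Fs::operator()...;
};

// One possible make_visitor: bundle the lambdas into a single overload set.
template <class... Fs>
auto make_visitor(Fs... fs) {
    return overloaded<Fs...>{std::move(fs)...};
}

std::string describe(const std::variant<int, std::string>& v) {
    return std::visit(make_visitor(
        [](int n) { return "int " + std::to_string(n); },
        [](const std::string& s) { return "string " + s; }
    ), v);
}
```

Leaving out one of the lambdas makes the std::visit call fail to compile, so exhaustiveness is still checked.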
I also understand (and strongly support) the committee's desire not to introduce yet another language construct when a library solution can be worked out, given the horrible beast c++[11,14,17,20] has already become.
Funny enough, std::visit was one of the features of C++17 that I was looking forward to most. Something that I don't entirely understand, though, and that the author brings up, is the lack of a function like make_visitor. Does anyone understand why, or was there a late addition of a function that I'm not aware of? And, yes, std::visit is nonideal, but I do think it's a much better option than using double dispatch and the visitor pattern, which is what we had to do before.
More generally, I write numerical software and I do wish there were a better option for writing this software outside of C++, but I don't see one right now. Specifically, C++ gives us direct access to the C API of other languages and some pretty powerful tools to handle that. As such, if we want our software to work across multiple languages like Python, MATLAB/Octave, or a variety of other languages, C++ appears to be the best fit. Yes, it's possible to hook something like Python code to MATLAB/Octave, but it's hard because Python and MATLAB/Octave handle memory in different ways. For example, they differ on how and when objects are collected by the garbage collector, which makes it hard to use a Python object directly in MATLAB/Octave. In C++, we have enough tools to handle hooking C++ objects and items to other languages. Certainly, it's a pain, but I contend it's easier than hooking two other languages together through the C API, which I find more difficult to manage and which leaves us with a bunch of additional code to maintain as well. As such, as many disadvantages as the language has, I appreciate new features like std::visit because it means that I can write easier code for my algorithms and still be able to hook to others.
> Something that I don't entirely understand, though, and the author brings up is the lack of a function like make_visitor.
Finite amount of committee time. It is better to ship working pieces and add missing pieces later than wait forever for the perfect solution. std::variant had been in bike-shed mode for a very long time due to the never-empty-guarantee saga. As soon as some sort of consensus was reached, it was decided to ship what was ready.
I've been programming C++ for 20 years, and since C++11 the only feature I've seen that's worth the cognitive load is auto. Everything else just seems overly complicated. When I compare it to how easy things are in Python, I cry.
static_assert, nullptr, constexpr, initialization lists, for-each loops, default & delete for class methods...
and then there's things that are wonderful to use even if it's terrifying to look at how it's implemented like std::forward which is used with vector.emplace_back.
there's also simple things like vector<unique_ptr<X>> being legal C++11 syntax instead of an illegal right shift operator.
What is 'jobs' in your pseudo-code? Can it be a user-defined type? Does the author of that class need to explicitly state that their type meets some trait? Is 'Any' part of the type, a trait, or the language?
What humanrebar wrote is an algorithm that will work with any range of any type that has a 'failed()' member function.
Then, lambdas are verbose.
I would like to have a simpler syntax like this for simple lambdas (no capture, single expression in the lambda body). job could be automatically typed with const auto&, or you could write it yourself if you wish.
job => return job.failed();
And the current one, which is more verbose, for more complex lambdas (capture, several expressions in the body).
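For reference, the current spelling of the same check is sketched below. `Job` and `any_failed` are hypothetical names; the only thing the algorithm needs from the element type is a `failed()` member function.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical job type; only the failed() member matters here.
struct Job {
    bool ok;
    bool failed() const { return !ok; }
};

// Works for any range whose elements have a failed() member function.
template <class Range>
bool any_failed(const Range& jobs) {
    return std::any_of(std::begin(jobs), std::end(jobs),
                       [](const auto& job) { return job.failed(); });
}
```

The begin/end pair and the full lambda introducer are the verbosity being complained about; the proposed `job => ...` form would only shorten the last argument.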
This comment is nonresponsive to the parent comment. The complaint (which I happen to agree with) is that both iterators/ranges and lambda syntax are too verbose in C++.
Sorry, I misremembered the timeline. I thought they were part of C++03, but they're actually C++11. I was using shared_ptr when it was still part of Boost.
The new C++ is great in that it is evolving fast, but I feel it is extremely hard to read; the readability of the language is diminishing with each newer version.
Javascript and other dynamic languages are leaning towards type safety and other static checking tools. C++ and other compiled languages have strong types, but need std::variant (and similar constructs) to add flexibility and boost productivity. The Yin and the Yang, looking to balance themselves into perfect programmer bliss.
Here be dragons, though, since we must always remember to:
* Update tag whenever assigning a new value.
* Only retrieve the correct type from the union (according to tag).
* Call constructors and destructors at appropriate times for all non-trivial types. (string is the only one here, but you could imagine similar scenarios with others.)
"""
...the last paragraph is the part that will almost surely trip a C++ newcomer up if they tried this out. The first two are (valid in my experience) complaints about maintenance mistakes that result in compiling programs but runtime errors.
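To show what following all three rules by hand involves, here is a sketch of a two-member tagged union with a std::string in it. The class layout and member names are made up for illustration; the article's own struct is not reproduced here.

```cpp
#include <new>
#include <string>
#include <utility>

class Setting {
public:
    enum class Tag { Str, Int };

    explicit Setting(int n) : tag_(Tag::Int), num_(n) {}
    explicit Setting(std::string s) : tag_(Tag::Str), str_(std::move(s)) {}
    Setting(const Setting&) = delete;             // copying would need the same care
    Setting& operator=(const Setting&) = delete;
    ~Setting() { destroy(); }                     // rule 3: run the destructor if needed

    // Rule 1: every assignment updates the tag along with the value.
    void set(int n) { destroy(); num_ = n; tag_ = Tag::Int; }
    void set(std::string s) {
        destroy();
        new (&str_) std::string(std::move(s));    // rule 3: construct in place
        tag_ = Tag::Str;
    }

    Tag tag() const { return tag_; }
    // Rule 2: callers must check tag() before picking an accessor.
    int num() const { return num_; }
    const std::string& str() const { return str_; }

private:
    void destroy() {
        if (tag_ == Tag::Str) str_.~basic_string();
    }

    Tag tag_;
    union {
        std::string str_;
        int num_;
    };
};
```

std::variant does this bookkeeping for you, which is the boilerplate it saves even when you only ever switch on the type manually.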
No doubt that std::visit appears to be a mess but why not call std::get<a_type>(a_variant) and trap bad_variant_access exceptions? It would still be a bit annoying to exception-wrap each attempt but it seems cleaner than any of the std::visit alternatives.
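A sketch of that exception-wrapping idea, folded into a helper so the try/catch is written once. `Value` and `try_get_int` are hypothetical names.

```cpp
#include <optional>
#include <string>
#include <variant>

using Value = std::variant<int, std::string>;

// Converts the exception-based std::get into an optional result.
std::optional<int> try_get_int(const Value& v) {
    try {
        return std::get<int>(v);
    } catch (const std::bad_variant_access&) {
        return std::nullopt;  // v currently holds something other than int
    }
}
```

In practice std::get_if gives the same yes-or-no answer without going through an exception, so the wrapper mostly illustrates the cost of this route.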
Is the author complaining that the process is too complicated? This is typical of the c++ world, other languages make it easy but c++ isn't other languages.
"Sum types". Hey, let's give an old concept a new name. Those are called discriminated variant types, and they first appeared in Pascal. They're a straightforward concept, and can be implemented easily at the language level.
The big problem with C++ is that the template fanatics took over. Templates are a crappy programming language - bad syntax, confusing semantics, and tough debugging.
But there's no way to stop people from extending the language via templates. Hence Boost, and "you are not supposed to understand this" templates.
(I fear that Rust is going down the same rathole.)
This blog post is not to be taken seriously in my opinion. For one thing, `printf` is obsolete in C++. This is how you should output values in a tagged union:
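The snippet that belongs here is not preserved, so the following is only a guess at its shape: a generic lambda passed to std::visit that streams whatever alternative is active. `Value`, `print` and `to_text` are names chosen for this sketch.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <variant>

using Value = std::variant<int, double, std::string>;

// Streams the active alternative; works because operator<< is overloaded for each one.
void print(std::ostream& out, const Value& v) {
    std::visit([&out](const auto& x) { out << x; }, v);
}

// Convenience wrapper that captures the output in a string.
std::string to_text(const Value& v) {
    std::ostringstream os;
    print(os, v);
    return os.str();
}
```

Calling `print(std::cout, v)` after including `<iostream>` gives the cout version.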
If you also want to print the name of the type of the variable, along with the value itself, well, C++ doesn't have reflection (yet). So you're going to have to write a function that takes a value and returns its type as a string, which you can already do using the typeid operator. This is basically implementing reflection yourself; it is cumbersome but doesn't require advanced template programming.
There's no need for any of the madness with explicit types inside the visitor lambda, which is deemed necessary by the author.
Edit: here's another way to map types to strings. So no templates necessary at all for this entire problem.
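The code for this edit is also missing, so here is one plausible reading of it: ordinary overloaded functions map each alternative to a name, and a generic lambda calls them. `Value`, `type_name` and `describe` are names invented for this sketch, and the generic lambda is still a template under the hood even though none is written out by hand.

```cpp
#include <sstream>
#include <string>
#include <variant>

using Value = std::variant<int, bool, std::string>;

// Plain overloads map each alternative to a printable name.
const char* type_name(int) { return "int"; }
const char* type_name(bool) { return "bool"; }
const char* type_name(const std::string&) { return "string"; }

std::string describe(const Value& v) {
    std::ostringstream os;
    // Overload resolution picks the matching type_name for the active alternative.
    std::visit([&os](const auto& x) { os << type_name(x) << ": " << x; }, v);
    return os.str();
}
```

A missing `type_name` overload for a class-type alternative is a compile error, but an arithmetic alternative such as `double` could silently convert to one of the existing overloads, so this is weaker than an explicit visitor.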
>This blog post is not to be taken seriously in my opinion. For one thing, `printf` is obsolete in C++. This is how you should output values in a tagged union:
This is so irrelevant as to the point of the article that it is funny.
The printing was only a simple example. A less contrived example would be to evaluate an AST. E.g. you would want the '+' operator to do different things for strings and numbers.
Also, your visit function does not do exactly the same thing as the author's example. The author's example also prints the type's name. How would you do that without making the visitor cases explicit?
Yes, this would work. I can see that variant may prove useful for the case that you present, but it is much less elegant for the usecase which the author tries to present.
This works only so long as you can pass the problem on to another function that handles arbitrary types (like `std::cout`). If you want to handle it yourself then you have to use object overloads, templates, etc, etc.
So basically you're saying that if you want different behavior based on the type of a variable passed to a function, you're going to have to write that behavior...
Then the answer is yes. We haven't advanced to the point where the compiler can guess what you want. If you can give me an actual example of when it would be an insurmountable task to do so, please let me know.
>So basically you're saying that if you want different behavior based the type of a variable passed to a function, you're going to have to write that behavior... Then the answer is yes.
That, and that it shouldn't be that hard, and that the provided features for doing so should be better, is the entire point of the article.
>We haven't advanced to the point where the compiler can guess that you want.
That's not some case of magic compilers. This is just bad design.
>of when it would be an insurmountable task to do so, please let me know.
Whoooosh. The whole point is not that it is insurmountable, but that it's much worse than it should be.
The blog post is saying that writing that behaviour is more complex than it needs to be. The author gave an example of pattern matching syntax that would greatly simplify things.
Why would it allocate? Creating objects does not trigger allocations most of the time. The only stack space used is the internal state of the functor used, which is often empty (and won't even exist anymore by the time the compiler has optimized your code).
Now I know that there's a stack allocation for the functor. Not all platforms have Sufficiently Smart Compilers; I've got code relying on SDCC and TCC, for instance. :)
That's what I'm getting at, there's this whole new suite of functionality that is sort of opaque to me, whereas I'm used to having a grasp of what's going on at-a-glance when working with C and C++98. I'll just have to spend a weekend working with it.
> Not all platforms have Sufficiently Smart Compilers; I've got code relying on SDCC and TCC, for instance. :)
I'm pretty sure that it's not a matter of compiler brightness; the C and C++ standards are both pretty explicit about when the automatic and dynamic storages are used. If anything, I guess that it would be harder for compilers to allocate such things on the heap.
I checked and tcc doesn't allocate anything for instance for this code; neither does gcc 4.4 in c++98 mode :
int main()
{
struct foo {
int a, b, c;
char x[3000];
} f;
}
printf is a good, easy example of a case where you need to have different logic depending on which variant you are. It's not an endorsement of printf qua printf over cout qua cout. cout only works here because the standard library has already overloaded cout for each variant, as the author ends up doing. If you were doing anything else, you'd need either the explicit overloading or the if-constexpr thing.
The example from the Rust book (which, full disclosure, I wrote because I <3 tagged enums and pattern-matching and the previous examples weren't that great) is probably a better one: https://doc.rust-lang.org/book/first-edition/enums.html
which (modulo any errors from me not actually testing this) is perfectly valid Rust code, that's entirely readable even if you only know C++ and not Rust. As far as I know, you can't write anything anywhere as straightforward as this in C++. You'd need to define at least two new classes, probably four, for the four variants, plus another function that's overloaded on the four classes to handle the behaviors; you can't have the behavior be inline in your existing function, as above. (The way I wrote this, it's just using the global print function, but it'd rapidly get messy if you needed to pass a reference to a console object or whatever.) Alternatively, you would in fact need the constexpr trickery the article suggests, which would let you write it inline with some lambdas. None of this is needed in a language with language-level support for tagged unions and pattern matching (of which Rust is hardly the only one - please don't take this as advocacy of Rust in particular); you can just write normal control structures as above.
Alternatively, none of this is needed in a language with dynamic typing; you'd just do
while True:
msg = get_msg()
if msg.type == "Quit":
return
elif msg.type == "ChangeColor":
print "\033[38;2;{};{};{}m".format(msg.r, msg.g, msg.b)
...
but presumably you're using C++ because you want to be able to do this without that level of dynamism (which puts some unenviable lower bounds on efficiency).
> it’s completely bonkers to expect the average user to build an overloaded callable object with recursive templates just to see if the thing they’re looking at holds an int or a string.
You don't have to: http://en.cppreference.com/w/cpp/utility/variant/holds_alter... (and http://en.cppreference.com/w/cpp/utility/variant/get to access the value).