
> When I don't exercise (for me that means also practicing Ashtanga Yoga) for like a week because of external reasons, I experienced a deep slash of depression

Yep, that's the flip side. It helps fight depression, but it also has an addictive side to it. Fortunately it's really not as time-consuming as you'd think, but taking a whole week off is almost out of the question.


Are you currently coding in C++11/14? It doesn't feel "bandaid-y" at all.


That sounds very similar to LabView (aka. G)... Have you ever tried programming something complicated in LabView? I'd almost rather write assembly.

Professional programming is about managing complexity. It's about putting thousands of man-hours into one single project with a dozen different people without anyone's head exploding.

Light Table could have potentially been a huge help in the constant battle with managing complexity. It's an area that needs some serious development (virtually nothing substantial has been added to IDEs in the past decade). But instead you're going to create a beginner-friendly monster.

This is the opposite of progress


I was suggesting that it's actually not like labview at all :p

The model we have is significantly better than what we ever showed with Light Table. For example, being able to organize "code" by the actual functionality it's related to instead of files is something that happens by default in our world. You can actually create queries to show any view of the rules that exist in our world that you want (show me everything related to this button that has been changed in the past month...)

Our model is entirely about preventing people's heads from exploding, taming complexity through a very simple set of primitives that seamlessly work together.


I don't know; I've only used Labview a couple of times. It had really nice interfaces to temperature sensors and some other physical things: you were able to throw together a quick UI monitoring a number of sensors, with a start and stop button and tools for data export. We were able to pick it up in a few hours. Programming the experiment took a couple of hours too; obviously it was rather trivial, but I doubt we would have had much luck interfacing with all those different sensors in a C program without major headache.


This really seems like something that should be solved by the IDE.

In VS13 the R-Click -> Go to Definition interface is okay... but it definitely could be better.


Light Table's code bubbles look pretty nice.

http://www.chris-granger.com/images/lightable/light-full.png


and C++11 btw ...


I might be alone on this, but whenever I read things by John Carmack I get a vague sense that he doesn't really get object oriented programming. He always has a lot of interesting things to say, but it also kinda reads like a C guy trying to code in C++. I'm glad his thinking keeps evolving and he's not dogmatic about anything. I'd honestly love to hear his thoughts on C++11

"The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it."

That's the equivalent of saying "the faster you drive the safer you are b/c you're spending less time in danger"

You'll just end up with larger monster functions that are harder to manage. "Method C" will always be a disaster for code organization b/c your commented-off "MinorFunctions" will start to bleed into each other when the interface isn't well defined.

" For instance, having one check in the player think code for health <= 0 && !killed is almost certain to spawn less bugs than having KillPlayer() called in 20 different places"

I don't completely get his example, but I see what he's saying about state and bugs that arise from that. You call a method 20 times and it has a non-obvious assumption about state that can crop up at a later point - and it can be hard to track down. However the flip side is that when you do track it down, you will fix several bugs you didn't even know about.

The alternative of rewriting or reengineering the same solution each time is simply awful and you'll screw up way more often


>> I might be alone on this, but whenever I read things by John Carmack I get a vague sense that he doesn't really get object oriented programming.

I'm starting to think object oriented programming is a bit overrated. It's hard to express why exactly, but I'm finding plain functions that operate on data can be clearer, less complicated, and more efficient than methods. Blasphemous as it may seem, a switch statement does the equivalent of simple polymorphism and can be kept inline.
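To make that concrete, here's a toy sketch of what I mean (all names invented); the dispatch stays inline and visible at the call site:

  #include <cstdio>

  // A tagged struct plus a switch: simple polymorphism, no class hierarchy.
  enum ShapeKind { Circle, Square };

  struct Shape {
      ShapeKind kind;
      double size; // radius or side length
  };

  double area(const Shape& s) {
      switch (s.kind) {
          case Circle: return 3.14159265 * s.size * s.size;
          case Square: return s.size * s.size;
      }
      return 0.0; // unreachable if every kind is handled
  }

  int main() {
      Shape shapes[] = { {Circle, 1.0}, {Square, 2.0} };
      for (const Shape& s : shapes)
          std::printf("%f\n", area(s));
  }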


Many game programmers decided long ago that object-orientedism is snake oil, because that kind of abstraction does not cleanly factor a lot of the big problems we have.

There isn't anything close to unanimous agreement, but the dominant view is that something like single inheritance is a useful tool to have in your language. But all the high-end OO philosophy stuff is flat-out held in distaste by the majority of high-end engine programmers. (In many cases because they bought into it and tried it and it made a big mess.)


As a fellow game developer, I have to agree. I find that inheritance is a form of abstraction which sounds nice on paper and may work well within some domains, but in large and diverse code bases (like in games), it makes code harder to reason about and harder to safely modify (e.g. changing a base class can have a lot of subtle effects on subclasses that are hard to detect). The same goes for operator overloading, implicit constructors... Basically almost anything implicit that is done by the compiler for you and isn't immediately obvious from reading the code.


That suspicion of snake-oil paradigms is why it's interesting that game developers seem to be much more open to functional programming. Compare e.g. Tim Sweeney's "The next mainstream programming language" (https://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced...)


>Blasphemous as it may seem, a switch statement does the equivalent of simple polymorphism and can be kept inline.

In the statically compiled languages that most people think of when they hear "OO" (C++ and Java), yeah, switch statements vs. virtual methods (performance differences aside) is basically a matter of code style (do you want to organize by function/method, or by type/object?)

However, the original proponents of OO intended it to be used in dynamically compiled languages where it could be used as a way to implement open/extensible systems. For instance, if a game entity has true update, animate, etc. methods, then anyone can implement new entities at run time; level designers can create one-off entities for certain levels, modders can pervasively modify behaviors without needing the full source code, programmers trying to debug some code can easily update methods while the game is still running, etc. You can get a similar effect in C or C++ with dynamic linking (Quake 2 did this for some large subsystems), but it's a pain and kinda breaks down at fine (entity-level) granularity.

The other, "dual" (I think I'm using that word correctly?) approach famously used by emacs is to create hooks that users can register their own functions with, and extend the program that way. Like switch statements, it basically amounts to storing functions at the call site instead of inside the object, except with an extensible data structure rather than burning everything directly into the machine code.
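In C++ terms a hook list is just an extensible container of callables; a minimal sketch of the idea (names made up):

  #include <functional>
  #include <iostream>
  #include <vector>

  // Emacs-style hooks: the functions live in a data structure users can
  // extend, instead of being burned into the machine code at the call site.
  std::vector<std::function<void(int)>> on_entity_update;

  void run_update_hooks(int entity_id) {
      for (auto& hook : on_entity_update)
          hook(entity_id);
  }

  int main() {
      // "User" code extends behavior without touching the call site.
      on_entity_update.push_back([](int id) {
          std::cout << "custom behavior for entity " << id << "\n";
      });
      run_update_hooks(42);
  }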

Obviously you can't really take advantage of any of this if you're writing some state of the art hyper-optimized rendering code or whatever like Carmack, I'm just saying that OO's defining characteristics make a lot more sense when you drift away from C++ back to its early days at Xerox PARC.


As Wirth puts it, Algorithms + Data Structures = Programs

What OOP nicely brings to the table is polymorphism and type extension. Two things not doable just with modules.

Although generics help with static polymorphism.

The problem was that the IT world went overboard with Java and C#, influenced by Smalltalk, Eiffel and other pure OO languages.

Along the way, the world forgot about the other programming languages that offered both modules and objects.

> Blasphemous as it may seem, a switch statement does the equivalent of simple polymorphism and can be kept inline.

Except it is not extendable.


Switch statements are extensible in that you can add extra switch statements to your program without needing to go back and add a method to every class you coded, spread over a dozen different files. It's the old ExpressionProblem tradeoff.


All code is extensible if you have the source code and can recompile the whole thing then restart the program. I think pjmlp meant extensible in the "extensible at runtime" sense.


switch statements and method calls are kind of duals of each other. Method calls make it easy to add new classes but fix the set of methods; switch statements make it easy to add new methods but fix the set of classes. It doesn't have to do with runtime.

http://c2.com/cgi/wiki?ExpressionProblem
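To make the trade concrete, a toy expression language (invented purely for illustration):

  #include <iostream>

  enum Tag { Lit, Add };
  struct Expr { Tag tag; int value; const Expr* lhs; const Expr* rhs; };

  int eval(const Expr& e) {
      switch (e.tag) {
          case Lit: return e.value;
          case Add: return eval(*e.lhs) + eval(*e.rhs);
      }
      return 0;
  }

  // New operation: just another function; no existing file is touched.
  void show(const Expr& e) {
      switch (e.tag) {
          case Lit: std::cout << e.value; break;
          case Add: show(*e.lhs); std::cout << " + "; show(*e.rhs); break;
      }
  }

  // But adding a new tag (say, Mul) means revisiting eval AND show.
  // With virtual eval()/show() methods it's exactly reversed.

  int main() {
      Expr one{Lit, 1, nullptr, nullptr}, two{Lit, 2, nullptr, nullptr};
      Expr sum{Add, 0, &one, &two};
      show(sum); std::cout << " = " << eval(sum) << "\n";
  }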


Yeah, I said as much (even calling them duals) in a sibling comment.

https://news.ycombinator.com/item?id=8375910

Method calls don't need to be fixed either. Just because C++ stores virtual methods in a fixed-sized table doesn't mean Lua/Javascript/etc can't store them in hash tables. And a list of hooks is sort of like an extensible switch statement, but bare switch statements like you were describing obviously don't have that kind of runtime flexibility.


If you get tired of the switch statement, pattern matching is the functional dual of polymorphic dispatch.


Even so, pattern matching (and algebraic datatypes) would work just as well in an imperative language as in a functional setting.

(Not sure whether you'd need garbage collection to make pattern matching really useful, though.)


Pattern matching is welcomed everywhere, it saves conditionals and keeps the code clean.


If you control the code.


No, you can add new functions that switch over the different types without having control over the code. That way you don't have to add the same method to each of the classes.

It's a different dimension of extensibility.


> Except it is not extendable.

Function pointers are a good way to achieve extensibility in such cases.
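A minimal sketch of the idea (names invented):

  #include <cstdio>

  // Extensibility via a function pointer carried in the data;
  // no class hierarchy required.
  struct Entity {
      int health;
      void (*on_death)(Entity&);
  };

  void explode(Entity&) { std::printf("boom\n"); }
  void respawn(Entity& e) { e.health = 100; }

  int main() {
      Entity e{0, explode};
      e.on_death(e);        // "boom"
      e.on_death = respawn; // behavior swapped at runtime
      e.on_death(e);
  }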


You are basically doing a VMT implementation by hand.

I'd rather let the compiler do the work for me.


No, using function pointers does not necessarily mean implementing a vtable.


No, but it feels like it.

Oberon and Modula-3 only provide record extensions for OOP, with methods being done via function pointers.

In case you aren't familiar with these languages, here is some info.

http://www.inf.ethz.ch/personal/wirth/ProgInOberon.pdf (Chapter 23)

http://en.wikipedia.org/wiki/Modula-3#Object_Oriented

In Oberon's case, which was lucky to have survived longer at ETHZ than Modula-3 did at DEC/Olivetti, all successors (Oberon-2, Active Oberon, Component Pascal, Zonnon) ended up adding support for method declarations.


For quite a while, there was significant buzz around OOP, and some people did, in fact, overrate it. But I get the impression that over the last couple of years, more and more people have begun to realize that OOP is not the solution to every problem and started looking in new directions (such as functional programming).

I think this why Go has become so popular. It deliberately is not object-oriented in the traditional sense, yet it gives you most of the advantages of OOP (except for people who are into deep and intricate inheritance hierarchies, I guess). (I don't know how many people are actually using it, but now that I think of it, the same could be said of Erlang - the language itself does not offer any facilities for OOP, but in a way Erlang is way OOP, if you think of processes as objects that send and respond to messages.)

So I think there is nothing blasphemous about your statement (in fact, Go allows you to switch on the type of a value).

(I am not saying that OOP is bad - there are plenty of problems for which OOP-based solutions are pretty natural, and I will use them happily in those cases; but I get the feeling that at some point people got a bit carried away by the feeling that OOP is the solution to every problem and then got burnt when reality asserted itself. The best example is probably Java.)


I've found the rise and fall of OOP to line up well with various trends in ontologies and semantic-like KM tools (after all, OOP is basically designing an ontological model you code inside of). Outside of various performance issues, the idea that you can have one master model of your problem to solve everything is an idea that hopefully seems to be running out of steam.

In the area I've worked in, I've seen numerous semantic systems come and go, all built around essentially one giant model of the slice of the world it's supposed to represent, and the projects all end up dead after a couple years because:

a) the model does a poor job representing the world

b) nobody seems to have a use-case that lines up perfectly with the model (everybody needs some slightly different version of the model to work with)

c) attempts to rectify b by adding more and more to the model just leave you with an unusable, messy model.

More recent systems seem to be working at a meta-model level, with fewer abstract concepts and links, rather than getting down into the weeds with such specificity, and letting people muck around in the details as they see fit. But lots of the aggregate knowledge that large-scale semantic systems are supposed to offer gets lost in this kind of approach.

I think OOP at its heart is just another case of this -- it's managed to turn a programming problem into an ontology problem. You can define great models for your software that mostly make sense, but then the promise of code-reuse means you're suddenly trying to shoehorn a model meant for a different purpose into this new project. The result is code that either tries to ignore "good" OOP practices to just get the damn job done, or over-specified models that end up so complicated nobody can really work with them and introduce unbelievable organizational and software overhead.

It's not to say that OOP and other semantic modeling approaches don't have merit. They're very useful tools. I think the answer might be separate models on the same problem/data, each representing a facet of the problem. But I haven't quite gotten the impression that anybody else has arrived at this in industry; instead they're going for higher levels of abstraction or dumping the models altogether.

Again, OOP manages to turn programming problems into ontology problems -- which is hardly a field that's well understood, while the goal is and always has been to turn programming problems into engineering problems -- which are much more readily understood.


You clearly don't work on a large code base =)

Most programming concepts are really about code organization and not expressiveness or the ability to express an algorithm clearly.

Object oriented programming only really starts to make sense when you are working on something that will take thousands of man-hours. If you are working alone, or on a small project, it can be completely irrelevant.

The workflow you are describing is what MATLAB guys do. It's an absolute nightmare once the project gets too large. It is, however, very fast and flexible for prototyping.


> You clearly don't work on a large code base =)

Whatever its other pros and cons, I find OO style tends to result in significantly larger code bases.

> Most programming concepts are really about code organization and not expressiveness or the ability to express an algorithm clearly.

You imply a dichotomy where none exists. For non-trivial algorithms, the ability to express them clearly and the ability to organise code at larger scales are very much related.


Even on large projects, I think there are often better ways of managing complexity. A lot of the reasons for encapsulating internal state disappear if that state is immutable.


I would say that OOP makes a lot of sense on large code bases, but that it's also very easy (and dangerous) to get overzealous with object inheritance, interfaces, abstract base classes, etc.


I honestly find that a good module system (like OCaml's) is much better for organizing code than objects.


You have clearly never worked on a large, non-OO code base designed by competent engineers.


> Most programming concepts are really about code organization and not expressiveness or the ability to express an algorithm clearly.

You could also separate your functions into different scripts with closely related naming. You've pretty much achieved code organization without the hidden scaffolding that comes with an OO codebase: scaffolding that only a chosen few with ridiculously large salaries know about, while newer developers largely have to pretend to praise it with terminology straight from a CS textbook because their monthly check comes from it.

Why do we need a thousand different ways to write a simple CRUD web application in a language? Obviously OOP hasn't really done what it advertised, which is bring code organization to its fully efficient state any better than functional coding has.

If Java was supposedly so great with its OOP as a core feature, where as humans we are supposed to model the extremely complex systems of reality into some fictional objects in our brain, where is Sun Microsystems now? It ended up as snake oil for a lucrative proprietary enterprise software company aimed at selling simple CRUD apps pleasing the business/government crowd. Have you seen the range of Oracle's suite of crap? It's literally insane: you have to pay to basically learn how to reinvent the wheel in their own terms and pay a percentage back to them for speaking their language, as if landing business deals isn't hard enough already. Absolute garbage Java turned out to be. Even Android pisses me off. I used to hate on Objective C but I applaud Swift; there's no such innovation taking place because the JVM and Java are built entirely on a failing founding software philosophy: that the whole world is some simple interaction of objects interacting with each other. No, it is not; there's quantum mechanics in play, with myriads of atoms that end up interacting with each other in a chaotic fashion that gives rise to some pattern our brain is supposedly good at finding.

Throw sand on a piece of blank paper, people will claim they see Jesus, and sell it on ebay.


Big things like, say, the Linux kernel? Which is not OO and is better off for it?


Ultimately our biases towards where the paradigms belong are a result of how the history has developed so far.

But hopefully we've learned that the guy selling OOP as the answer to everything is full of shit


"But hopefully we've learned that the guy selling OOP as the answer to everything is full of shit"

Replace OOP in your statement with "anything" and I'd say you're spot-on.


I agree. The simplicity comes from the fact that you are focusing on different aspects at different times. I find that I will start off with defining my data structure and only focusing on the data structure. What information do I need, what is the best way to organize the data. Those sorts of issues. Once I have the data structure then I focus on what I want to do with it. This may result in some functions attached to the data structure using the object oriented features and sometimes the functions live apart from the data structure. The benefit comes from mentally decoupling the data from the functions.


>> I don't completely get his example

To use Minecraft as an example, a player may die from falling from too high, drowning, or getting attacked by a monster. If killPlayer() is called separately for each of those cases, he asserts that it may cause bugs due to differing context or sequencing relative to other parts of the code. If OTOH you just decrement player health in each of those places and then check for health<=0 at only one place, you eliminate that class of bugs.
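Roughly like this (an invented sketch, not actual Minecraft code):

  struct Player { int health = 20; bool killed = false; };

  void killPlayer(Player&) { /* drop inventory, play animation, ... */ }

  // Damage sites only adjust state; none of them calls killPlayer():
  void applyFallDamage(Player& p, int dmg) { p.health -= dmg; }
  void applyDrowning(Player& p)            { p.health -= 2; }
  void applyMonsterHit(Player& p, int dmg) { p.health -= dmg; }

  // The single place death is detected, once per tick:
  void playerThink(Player& p) {
      if (p.health <= 0 && !p.killed) {
          p.killed = true;
          killPlayer(p);
      }
  }

  int main() {
      Player p;
      applyFallDamage(p, 25);
      playerThink(p); // death handled here, and only here
  }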


I am working on the most close-to-finished computer game I have written yet, and I have a bug caused by this exact thing! (I am learning a lot about what not to do as I go.) I plan on refactoring it to the check-once-per-loop style this weekend.


That makes sense - though most of the time, if a method call only makes sense given a particular state, it's generally set to be protected. You can still call it with the wrong state from within the same class, but I can't honestly think of a case of that happening in my work. You generally are familiar with the workings of the class you are currently touching. If you aren't able to do that practically, then that generally means your class is simply too large.


A "multiple kill calls" bug is your typical unexpected violation of ever-mounting implicit ordering constraints. For example, you might write:

    while true:
      ...
      if (!wasDead && dead) startFadeOut()
      ...
    
      wasDead = dead
      doPhysics()
and then months later someone adds fall damage to the physics engine, and suddenly there's a way to die where the screen doesn't fade out.


Wait, like, Power Towers Strilanc? I love your work :) In fact I think I did a competition with some friends to get a high score in the map credits for a while...


Story checks out [1].

Yeah, that's me. Wc3 mapping was fun times for sure.

1: https://github.com/Strilanc/Wc3PowerTowers/blob/3b93a83cf63f...


Pretty crazy, all the things the WC3 scene produced. Like, multimillion dollar gaming sub-genres (tower defense, MOBAs, etc). Glad to see you're still making games!


I have found myself writing "Method C" code in cases where factoring the code into methods obscures rather than simplifies. I think Carmack sums up the reason pretty well: "The whole point of modularity is to hide details, while I am advocating increased awareness of details." I ask myself, can I factor methods out of this code (a la Method A or Method B) in such a way that the code can be understood without reading the implementations of the smaller methods? If not, then the complexity is in a sense irreducible, and splitting the code into chunks just forces other people (and eventually myself) to jump around and try to knit the pieces together mentally, when it would actually be easier to read the code in one piece.

Another way to put it is that Method C is the least bad solution when factoring fails. I had a conflict with a coworker several months ago over a difficult piece of functionality that I had implemented Method C style. It was giving him headaches and he complained incessantly about the fact that the code was written in a linear "non-factored" style. I tried to explain to him that the problem was simply that hard, and the code organization wasn't making it worse, but was rather making the best of a bad situation. (Basically, if he thought the code was hard to understand, then he obviously hadn't tried to understand the problem it was solving!) He ignored me and refactored the code Method B style. A month later he was still struggling (because it was a truly complex problem) and he called me over to help him out. The code was now unfamiliar to me, so I'd point at a method call and say, "What does this method do?" "Uh... let's see. <click>" "What does that method it's calling do?" "Hold on. <click>" And so on, all the way down the call chain.

The refactored code had become "easy to read" in the sense that the methods were short, but it also became impossible to read in the sense that reading a method didn't give you any useful information unless you went on to read all the code in all the methods it called. We ended up reading the code exactly as we would have read Method C code, except with a lot of clicking and no visual continuity or context. Abstraction didn't protect us from the details; it just made it harder to see how they fit together into the whole.


> The alternative of rewriting or reengineering the same solution each time is simply awful and you'll screw up way more often

He might not have communicated it completely correctly, but I believe he wasn't advocating getting rid of functions that exist to reduce redundancy.

He was instead advocating getting rid of functions that exist simply to document the process, and finding a way to inline those functions clearly.

> However the flip side is that when you do track it down, you will fix several bugs you didn't even know about.

I think he is saying a class of bugs is avoided. For instance, if I do X, Y and Z, all of which run only when the player is alive, and Y might kill the player, leaving the player alive until a single later check avoids a bug in Z if Z assumes the player is alive.


> whenever I read things by John Carmack I get a vague sense that he doesn't really get object oriented programming

Can you share a bit about your background here? In the absence of more context, to me, this reads sort of like a guy who plays football on weekends saying that Lionel Messi "doesn't really get" football.


    > "The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it."
    > That's the equivalent of saying "the faster you drive the safer you are b/c you're spending less time in danger"
What I believe he means is that function calls at different places can be a source of trouble when you're not side-effect free.


For some reason, a lot of older game/graphics programmers seem to feel the same way about OO programming. I don't know if it's force of habit or experience on their part, but I try to keep it in the back of my head nowadays.


>I might be alone on this, but whenever I read things by John Carmack I get a vague sense that he doesn't really get object oriented programming.

I get the impression that he understands it quite well, which is why he avoids it.


Both statements ring true... what's your point?


I can't think of anything from Microsoft Research that has translated into an actual product (other than maybe the Kinect... though I think a lot of that IP was bought). They do some neat stuff I guess, but it seems like a giant money sink. Can anyone prove me wrong?

As an aside, I had a professor from the MS Research lab in Santa Barbara who was a complete moron. Seemed like he was some kind of middle manager at MS and then went to Research for an early semi-retirement


Tons of stuff has been productized in various forms. .NET generics, for instance, were initially developed in MSR Cambridge. MSR research has contributed to speech recognition, search technologies in Bing, Excel's Flash Fill, SQL Server's Hekaton in-memory architecture, and many other products.


Those and thousands more listed here:

https://research.microsoft.com/en-US/about/techtransfer/defa...

Just click the years on the sidebar.


There was a bunch of great Algorithmics research coming from Andrew Goldberg's group at MSR SV. I expect the fruits of their labour are implemented in Bing Maps.


I'd definitely pick C++11 unless you need to use Rust.

Rust is inherently memory safe - however in practical terms this isn't important for most applications. If you are writing security-critical applications, Rust will provide you with some very important guarantees (i.e. there are certain mistakes which are inherently not possible in the language). C++ doesn't really guarantee anything and if you're an idiot you can shoot yourself in the face. However in practical terms memory management in C++11 is very straightforward, and C++11-compliant code (i.e. using the STL and not writing it like C) is very safe and clean. You're not mucking with raw pointers anymore.
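For example, day-to-day C++11 ownership looks roughly like this (a toy sketch, types made up), with no manual delete anywhere:

  #include <memory>
  #include <string>
  #include <vector>

  struct Connection { std::string host; };

  int main() {
      // Single owner, freed automatically at scope exit.
      // (std::make_unique only arrives in C++14.)
      std::unique_ptr<Connection> conn(new Connection{"example.com"});

      // Containers own their elements; no manual cleanup.
      std::vector<std::string> lines;
      lines.push_back("hello");

      // Shared ownership when you genuinely need it.
      auto backup = std::make_shared<Connection>(*conn);
  }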

The main issue I see is that Rust is still in early development. It may or may not get "big" in the coming years. And library support is ... lacking

In contrast C++ has the STL and boost and every library under the sun. I haven't worked with a lot of other languages extensively, but I've never seen anything as clean, robust and thorough as the STL and boost. C++ will remain relevant for a long, long time. If Rust takes off in a big way, you'll be well positioned to jump ship.


I definitely understand where you are coming from, and I agree that C++11's ecosystem maturity is a great reason to choose it at the moment.

However, I cannot agree with you that Rust's safety guarantees are not useful for most C++ programs, or that you have to be an "idiot" to do memory-unsafe things in C++11. Someone at Yandex recently did a presentation about Rust [1] in which they pointed to a bit of (completely idiomatic!) C++11 code that caused undefined behavior. The audience, full of seasoned C++ and Java developers, was asked to identify the problem. Not one of them could point to what was causing it (the compiler certainly didn't). The next slide demonstrated how Rust's compiler statically prevented this issue. The issue could have taken weeks to surface and days to track down, and the C++ compiler simply didn't have enough information to determine that it was a problem. This is something that happens over and over to anyone using C++, in any application, not just security-critical ones.

I'm not saying C++11 doesn't improve the situation, because it does--it would be disingenuous to say otherwise. But it's equally disingenuous to imply that C++11 makes memory management straightforward or safe. It does not.

[1] http://habrahabr.ru/company/yandex/blog/235789/ (note: the presentation and site are in Russian).


> Someone at Yandex recently did a presentation about Rust[1] in which they pointed to a bit of (completely idiomatic!) C++11 code that caused undefined behavior.

It would be great to have at least that segment of the talk translated. Sounds like a good example.


The example:

  std::string get_url() { 
      return "http://yandex.ru";
  }

  string_view get_scheme_from_url(string_view url) {
      unsigned colon = url.find(':');
      return url.substr(0, colon);
  }

  int main() {
      auto scheme = get_scheme_from_url(get_url());
      std::cout << scheme << "\n";
      return 0;
  }


Can you say what is the problem here? What is a string_view?


A non-owning pointer into string memory owned by someone else (effectively a reference into some string). AIUI, the problem is the temporary string returned by get_url() is deallocated immediately after the get_scheme_from_url call, meaning that the string_view reference `scheme` is left dangling.


The string_view pattern is a pretty bad idea and useless with a decent compiler.


What do you mean by useless?

If I have a string like "foo bar baz" and I want the second word, should I copy out that data into a whole new string? That seems rather inefficient.

(How is a compiler going to optimise that away?)


For small strings, a copy is not only faster but more multithreading friendly.

Keep in mind that on a 64-bit architecture a view is at least 16 bytes large and that small strings can be copied to the stack resulting in better locality and reduced memory usage.

Last but not least, with copy elision, your temporaries might not even exist in the first place.

Example:

    std::string data;
    // ...
    auto str = data.substr(2, 3);
    // pretty sure str will be optimized away
    if (str[0] == 'a')


I don't think copy elision[1,2] means what you think it means; it simply allows the compiler to avoid e.g. allocating a new string when returning a string, or avoid allocating a new string to store the result of a temporary. That is, copy elision allows

   std::string str = data.substr(2, 3);
   return str;
to only allocate one new string (for the return value of substr), instead of two. There's no way the compiler can get out of constructing at least one std::string for the return value, especially if there's any form of dynamic substr'ing (e.g. parsing a CSV file with columns that aren't all the same width).

Sharing is only multithreading unfriendly if there's modification happening, and modification of textual (i.e. Unicode) data is bad practice and hard to get right, since all Unicode encodings are variable width (yes, even UTF-32, it is a variable width encoding of visible characters).

Furthermore, a string_view is strictly better than a string for many applications, since a string_view can always be copied into a string by the caller if necessary (i.e. each function can choose to return the most sensible/most performant thing, which is a string_view if it's just a substring of one of the arguments).

The only sensible argument against string_view in C++ I know is: it's easy to get dangling references. Which is correct, but that's a general problem with C++ itself, not with the idea of string views (Rust has a perfectly safe version in the form of &str, which cannot become dangling like in C++).

> Keep in mind that on a 64-bit architecture a view is at least 16 bytes large and that small strings can be copied to the stack resulting in better locality and reduced memory usage.

No, a string_view points into memory that already exists, there's no increased memory usage; a small string copied on to the stack will be part of the string struct, which is at least 3 * 8 = 24 bytes: a pointer, the length and the capacity. Also, a memcpy out of the original string is always going to be more expensive than just getting the pointer/length (or pair of pointers) for a string_view, since the memcpy has to do this anyway.

[1]: http://en.wikipedia.org/wiki/Copy_elision

[2]: http://definedbehavior.blogspot.com/2011/08/value-semantics-...


Yeah, my example for copy elision sucked, but that doesn't mean it can't play in your favor when you work by value.

> Sharing is only multithreading unfriendly if there's modification happening, modification of textual (i.e. Unicode) data is bad practice and hard to get right

Read-only access to data indeed scales "infinitely" on modern architectures.

> No, a string_view points into memory that already exists,

Yes. Right. How do you store that? You need at least one pointer and an int, or two pointers. That's 16 bytes. Memcpy for a couple of bytes is very quick when it's stack to stack thanks to page locality.

Also, if you are using pointers you will have aliasing issues which will have an impact on performance. If you work by values you allow the compiler to optimize things better.

For small strings, string views are just dumb, and "most of the time" strings are very small.

To give a better example of why working a string view is both a bad idea and dangerous, it's as if you said "I don't want to copy this vector, therefore I will work on iterators". That's obviously a bad idea.


> Yeah my example for copy elision sucked, but that doesn't mean it cannot play in favor when you work by value.

Not just sucked; it was entirely wrong. Copy elision is not related to std::string vs. string_view. Even with copy elision turned up to 11, returning a std::string will be more expensive than a string_view.

> How do you store that? You need at least one pointer and and an int or two pointers. That 16 bytes. Memcpy for a couple of bytes is very quick when it's stack to stack thanks to page locality.

I was very careful to cover exactly this in my comment.

Computing the memcpy is strictly more work than creating a string_view, since you need the information that is stored in a string view (i.e. pointer and length) to call memcpy.

Furthermore, the 'stack string' is actually stored contained inside a std::string value, which is larger than 16 bytes. There is no way that returning a string_view causes higher memory use at the call site than returning a std::string. (If you're complaining that it forces old strings to be kept around, well, you can always copy a string_view to a new std::string if you need to, i.e. a string_view can do the expensive 'upgrade' option on demand.)

Here's the quote from my comment above:

> a small string copied on to the stack will be part of the string struct, which is at least 3 * 8 = 24 bytes: a pointer, the length and the capacity. Also, a memcpy out of the original string is always going to be more expensive than just getting the pointer/length (or pair of pointers) for a string_view, since the memcpy has to do this anyway.

> Also, if you are using pointers you will have aliasing issues which will have an impact on performance. If you work by values you allow the compiler to optimize things better.

You do realise that a std::string contains pointers and so on inside it? Furthermore, the small string optimisation (copying to the stack) means every data access to a std::string includes an extra branch.

> For small strings string view are just dumb and "most of the time" strings are very small.

So instead of just having a cheap reference into a string you're happy with the overhead of a function call (memcpy) and a pile of dynamic branches? I wouldn't be surprised if the branches are the major performance burden for std::string-based code that is processing a pile of substrings of some parent string. In this case, the data from the string_views will normally be in cache anyway (i.e. it will've been recently read by the function that decides where to slice the string_view).

> To give a better example of why working a string view is both a bad idea and dangerous, it's as if you said "I don't want to copy this vector, therefore I will work on iterators". That's obviously a bad idea.

It's not obviously bad to me. In fact, it seems very reasonable to work with iterators rather than copying vectors (isn't that exactly what the algorithm header does?).

If your problem is that it is unsafe and hard to avoid dangling pointers etc, that's just a fundamental problem of C++ and is unavoidable in that language. One fix would be to use Rust; it handles iterators and string_views safely.


The slides on Slideshare are surprisingly easy to follow (and were a good introduction for me): http://www.slideshare.net/yandex/rust-c


See slide 42 of STL's recent talk for what I guess will be a similar example: https://github.com/CppCon/CppCon2014/tree/master/Presentatio...


Reproduced here:

  const regex r(R"(meow(\d+)\.txt)");
  smatch m;
  if (regex_match(dir_iter->path().filename().string(), m, r)) {
      DoSomethingWith(m[1]);
  }
- What's wrong with this code?

  - Haqrsvarq orunivbe va P++11
  - Pbzcvyre reebe va P++14
  - .fgevat() ergheaf n grzcbenel fgq::fgevat
  - z[1] pbagnvaf vgrengbef gb n qrfgeblrq grzcbenel
(http://rot13.com/ 'd if you want to guess.)


Seems like this was fixed in C++14 by adding a deleted std::string&& overload, so the call on a temporary no longer compiles.

http://en.cppreference.com/w/cpp/regex/regex_match


The underlying problem is still there; fixing a few of the worst cases in the standard library is helpful but only up to a point. (E.g. anyone with a custom function that does something in a similar vein needs to remember to do the same.)


There's more to memory-unsafety than raw pointers; all of these are problems even in the most modern C++ versions:

  - iterator invalidation
  - dangling references
  - buffer overruns
  - use after move (and somewhat, use after free)
  - general undefined behaviour (e.g. overlong shifts, signed integer overflow)
And there's more to memory safety than security critical applications. Rust means you spend a little more time fighting the compiler, but a lot less time fighting the debugger and a lot less time trying to reproduce heisenbugs caused by data races/undefined behaviour.
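Two of those in action, in toy snippets that compile without complaint:

  #include <iostream>
  #include <string>
  #include <vector>

  int main() {
      // Iterator invalidation: push_back may reallocate, leaving
      // `it` dangling; no compiler error, just undefined behaviour.
      std::vector<int> v{1, 2, 3};
      auto it = v.begin();
      v.push_back(4);
      std::cout << *it << "\n"; // UB

      // Use after move: `s` is left in a valid but unspecified
      // state, and nothing stops you from reading it.
      std::string s = "hello";
      std::string t = std::move(s);
      std::cout << s << "\n"; // compiles fine, says nothing useful
  }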

Of course, the library/tool support is indisputably in C++'s favour.

> if you're an idiot you can shoot yourself in the face

If you're a human you will shoot yourself in the face. It just takes far too much brain power to always write correct C++ (a single mistake leads to brokenness), especially in a team where there can be implicit knowledge about how an API works/should work that may not be correctly transferred between people.


Iterator invalidation, dangling references, and use-after-move are all essentially the same thing---references outlasting their owner---no need to multiply the issues. Buffer overflows are an issue, yes, unavoidable due to the C legacy.

On the other hand, it's somewhat ironic that you point to overlong shifts as a C++ problem when Rust has the exact same behavior. What does this function return?

    pub fn f(x: uint) -> uint { x >> 32 }
Honestly, I loved the idea of Rust. I was sold a memory-safe C++, and that sounded awesome. But what I got instead was an ML with better low-level support; it felt like an enormous bait-and-switch, as nobody is interested in yet-another-functional-language.


Overlong shifts are currently not handled correctly, yes, but they will not be undefined behaviour; they will possibly be implementation-defined but will not lead to memory unsafety.

> use-after-move [...] ---references outlasting their owner---

Not really, e.g.

  std::unique_ptr<int> x(new int(1));
  foo(std::move(x));
  std::cout << *x; // undefined behaviour
Unless you mean something other than `&` references.

> Honestly, I loved the idea of Rust. I was sold a memory-safe C++, and that sounded awesome. But what I got instead was an ML with some low-level extensions; it felt like an enormous bait-and-switch, as nobody is interested in yet-another-functional-language.

Something in this sentence has to be wrong, since people are clearly interested in Rust: either people are interested in YAFL or Rust isn't what you seem to think it is.

Anyway, that just sounds like a 'problem' with your background/expectations and/or whoever sold it to you. Rust is a C++ competitor (i.e. targets the similar low-level space) but it is definitely not trying to just be a C++ rewrite that fixes the holes. I don't think there's any official marketing implying the latter.


My comment appears to have been more polarizing than I ever expected. Let me clear some things up.

In your unique_ptr example, you're right: the reference doesn't outlast its owner, but it becomes a dangerous zombie after getting its guts removed. It is worth mentioning that the behavior may or may not be UB depending on how `foo` takes its parameters: std::move is really just a cast.

Maybe interest is the wrong word to use; many functional languages have generated a lot of interest, but this interest has historically not translated into actual mass usage. Instead, popular languages have adopted certain functional features over time (lambdas, comprehensions, type classes, etc), but have remained fundamentally Algolian for the most part. Rust seems to go in the opposite direction: start with ML (or something ML-like, anyway), and strip it down until it fits into the C++ space.

I am definitely interested in C++ replacements, to be clear. I have explored things that stray from it much more than Rust, such as Haskell and ATS, but I went into those fully expecting to see something different. But look at documents such as [1], and tell me that it doesn't create the expectation that Rust is trying to fit C++'s shoes a little too tightly. Additionally, trawling through mailing list discussions, familiarity with C-like languages seems to have been a design principle since the start (see for example the <> vs [] for generics debate).

Finally, I wasn't (and am not) passing judgment on Rust for being what it is. I was conveying my experience from being excited about it, to being less excited about it after actually learning it. I don't expect a productive discussion to come out of it; I've also seen how defensive the Rust community can be [2].

[1] https://github.com/rust-lang/rust/wiki/Rust-for-CXX-programm...

[2] https://pay.reddit.com/r/rust/comments/2bbeqe/it_started_out...


I think the situation of your [1] is that Rust has ended up close enough to C++ that it's useful and meaningful to provide a translation guide between concepts, to help C++ programmers get up to speed more easily; it's certainly not a design document or anything like that. Maybe I'm missing your point. (As I said elsewhere, C++ has had a lot of experience in this space, and so has a lot of good ideas, Rust is not ashamed to borrow them.)

On that note, would you interpret [a] as meaning Rust is trying to be a functional language? The reality is more plagiaristic: functional languages have nice features and so Rust borrows some of them. (In my mind the correct interpretation of both documents would be: Rust is a mesh of various languages with enough similarity to many for translation guides to be helpful.)

There have been syntactic decisions tilting towards C++/Java/C# programmers (like the <> for generics), but as far as I can remember those sorts of decisions are all minor in terms of semantics. For the most part the actual semantic behaviours are considered in terms of "does Rust need this" rather than "will this move us more towards C++" (even if the feature was inspired by C++).

[a]: http://science.raphael.poss.name/rust-for-functional-program...


You're right, arriving at that kind of conclusion from the existence of a tutorial is bad reasoning. I had wrong expectations, I suppose.

I must thank you for pointing that link out to me, though: it said what I was trying to say much better than I could in its prologue. Namely, how hard it is to sell a functional language to old-school C people, and how Rust may have a hard time with that (even if it's not a pure functional language).


> But what I got instead was an ML with some low-level extensions; it felt like an enormous bait-and-switch, as nobody is interested in yet-another-functional-language.

Rust is not functional. It may draw heavy inspiration from statically typed FP, and closures, ADTs, pattern matching, and an expression-heavy programming style might give that impression, but it is at its heart a procedural systems language. As stated in the blog post, most of Rust's core features map directly to underlying machine instructions, and there is always the opportunity to cede control from the type system if you absolutely have to. Indeed, core library types like `Box<T>` and `Vec<T>` are at their core built on `unsafe` code.


What do you mean by 'low-level extensions'? There's nothing in the language proper that can't run on bare metal, how much lower can you get?

If anything, it's the functional parts that feel bolted on: closures are crippled (though getting better soonish), the types that get spit out of iterator chains are hideous, no currying, no HKTs, functional data structures are much harder to write w/o a gc, etc.


> But what I got instead was an ML with better low-level support; it felt like an enormous bait-and-switch, as nobody is interested in yet-another-functional-language.

Leaving aside whether I think that's a fair description of Rust, I think plenty of people are interested in a functional programming language without the overhead of GC that is suitable for use as a low-level systems language.

You probably shouldn't write "Nobody is interested..." when what you really mean is just "I am not interested..."


Well, people tried that with D. Didn't catch on.


It caught on a little.


I've had a vastly happier Rust experience than C++ experience (I've written in both, well beyond the "zomgz 50line starter program").

The Rust compiler is vastly smarter and gets you type checking you have to pay out the nose for in C & C++. I'm a fanboy of Rust, but I would suggest looking hard at Rust for any C or C++ production code going forward. (My default going forward for this space will be Rust unless overriding considerations say otherwise).


I'm not sure in what world writing C++11 is "very straight forward ... safe and clean". You can easily write unsafe code without thinking about it. For example one can write a function that lends out a reference or pointer to memory that may or may not exist after the call exits, and this is impossible in Rust.
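e.g. this made-up example compiles (often with at most a warning), while the equivalent is a hard compile error in Rust:

  #include <iostream>
  #include <string>

  // Returns a reference to a local that is destroyed on return.
  const std::string& greeting() {
      std::string s = "hello";
      return s; // dangling the moment the function exits
  }

  int main() {
      std::cout << greeting() << "\n"; // undefined behaviour
  }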


To be precise, impossible outside defined `unsafe` blocks of code. It's still important to be able to drop down once in a while if you absolutely need to... you just don't want that ability all the time.


Wow, that was a great comment and exactly the type of info I was after.

I think (coming from a dynamic language world) the memory safeness is what pulls me towards Rust. But from what you say and what I've read elsewhere, that was old-style C++ and not C++1[17].

Thanks!


Read what the other commenters are pointing out. Maybe the situation is better in C++ now than it was before, but it doesn't mean you can't shoot yourself in the foot, especially as a beginner. Rust was built with safety in mind from the start; there are errors you can make in C++ that the Rust compiler simply won't let you make.

My advice is, if you're learning it for work, then go with C++. Even if Rust succeeds, it will take some years for it to become mainstream, and as pointed out, C++'s library support is great.

If you're learning it for fun or for the sake of learning something new, then Rust is a very nice and promising language, bringing things from functional languages that C++ lacks and offering very interesting tooling around it.

Whatever you choose, after you feel confident with one, go learn the other, as it will probably give you a better perspective on the strengths and weaknesses of both.


I would just point out that the impetus for the Rust language was Mozilla looking for a better language to implement a browser in than C++. They obviously have a lot of experience writing C++ and how to do it as good as possible, but found it coming up short, especially as multi-core starts becoming the bottleneck.


right? B/c if there is one thing I wouldn't call Elon Musk, it would be "charismatic". Maybe in private he comes off differently

