Another pattern I've found helpful is to have branches "exit early"; if you have code like:
if !cache.contains(key) {
    v = db.query("..")
    expiry = calcexpiry(v)
    cache.put(key, v, expiry)
    return v
} else {
    return cache.get(key)
}
by the time you get to the "else", your mental stack is filled up with the cache loading logic - you have to work to recall what this else branch is triggered by.
instead, make the branch the least complex option, exit early, and skip the else. That way, it can be read as "if this condition, do this tiny bit of logic and exit", and that branch can then be mentally pruned.
if cache.contains(key) {
    return cache.get(key)
}
v = db.query("..")
expiry = calcexpiry(v)
cache.put(key, v, expiry)
return v
I have heard this called early returns. I like to write code this way too. Proponents of one-return-per-function probably write code in languages like C, which has neither GC nor RAII, so having only one exit point makes it easier to ensure that all resources allocated by the function have been released.
There are ways to do early returns in C without duplicating resource releasing code though, using gotos (it's one of the few cases where gotos make sense, IMHO). Code looks like this:
int aFunctionWhichMightFail() {
    int result = FAIL;
    Foo* foo = allocateSomething();
    if (!someFunction(foo)) {
        goto fail;
    }
    moreWork(foo);
    result = SUCCESS;
fail:
    releaseFoo(foo);
    return result;
}
> There are ways to do early returns in C without duplicating resource releasing code though, using gotos (it's one of the few cases where I think gotos make sense).
Gotos like that are fine. Despite the title of the paper, I don't think Dijkstra's meaning was that gotos should be removed entirely.
Yeah. Dijkstra was talking about not building your whole program out of if and goto, and using functions and for/while/etc. instead. In particular, he was talking about Dartmouth BASIC, which didn't have anything else.
Like, imagine the spaghetti lovecraftian horror of a non-trivial program that's just if and goto. At the time, there was a particularly vocal camp that essentially argued it shouldn't matter, since if/goto is equivalent to what exists in the machine code anyway, and you can technically do all the same things.
Keep in mind that adopting any strategy beyond the if/goto that exists in machine code comes with reduced efficiency. So we trade efficiency for readability. Some talented genius might make a super-efficient piece of art by restricting themselves to if/goto only, that no other approach could beat performance-wise. I'd guess anyone who does low-level machine-code execution optimization is in that camp most of the time.
Not really. You want structured constructs, even in asm, because they play nicer with the speculation and prefetching hardware. Going nuts and throwing away all of the conventions of higher level languages typically kills your perf.
Well, it depends; if you can use locality of code, cache lines, etc. to your advantage, it might pay off to do crazy things to optimize the innermost loop and the like. Though as you said, you'd still structure whole blocks of functionality in a more high-level way. On some embedded systems it still pays off to modify code during execution, e.g. replacing a constant in the innermost instruction with a value pre-computed at a higher level to avoid a memory access penalty.
It doesn't make sense to me. The absence of "advanced" control-flow constructs is one of the things that define assembly language - only branches and conditional branches, plus sometimes conditional execution of single instructions. So I don't see how you can claim that a non-existent thing plays nicer with speculation and prefetching.
There's a lot of optimization at the microarch level towards the kind of code a compiler emits. Modern chips are designed to run C very well, not arbitrary, technically allowed assembly. You'll hit a lot of perf bottlenecks if you throw weird code/data flow graphs at them.
Well, I'd like to see that. Instruction sets designed with high level languages in mind, sure; but speculation and caches disturbed by code that is not shaped like what a compiler does (which compiler, anyway?), that's doubtful to me.
Dijkstra's meaning was that gotos should be removed entirely - by replacing them with the high-level control-flow constructs they were being used to implement. If your language doesn't have a high-level equivalent of `goto fail`... well, Dijkstra would probably say you shouldn't use that language in the first place, but given that you are, it's no worse than using `jlt .forloop` in assembly.
You are going to wind up with some kind of 'goto' regardless, whether it's having to navigate through independent files or various other constructs one takes for granted as immutable. The point being: when you notice them, automate them out of the reasoning space so you don't drown in the complexity of one mundane detail being used to represent the overall state of the program (all the stuff you have to shove in your head to understand what your program is doing).
These things are hard to notice, and it's even harder to find elegant and effective solutions for them!
The point is to reduce the problem space so all you are left to look at, when you do come back to your code, is what's important to reason about without removing important details or adding trivialities. These problems haven't gone away just because the literal word goto isn't used anymore.
> If your language doesn't have a high-level equivalent of `goto fail`
Such as?
I am aware of writing this in different ways, of architecting the logic quite differently. But the only control-flow constructs I've seen as alternatives are abusing exceptions or labelled loops with named breaks.
What were you thinking of when you wrote this? I'm genuinely interested.
Deferred statements (in Go) are really useful for delaying the execution of blocks of code to the end of a context, while leaving them in the logical context where you're setting up a resource.
I'm actually not sure offhand how multiple deferred statements are handled. I could envision that the specification might make no explicit guarantee about the execution order, or it might push onto and pop off of a stack (at least in behavior).
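For what it's worth, Go's spec takes the stack option: deferred calls run in last-in, first-out order. Python's `contextlib.ExitStack` behaves the same way, so a quick sketch of the ordering (recording when each callback fires):

```python
# LIFO cleanup ordering, analogous to Go's defer: the callback
# registered last runs first when the context exits.
from contextlib import ExitStack

order = []
with ExitStack() as stack:
    stack.callback(order.append, "registered first")
    stack.callback(order.append, "registered second")
    order.append("body")

print(order)  # ['body', 'registered second', 'registered first']
```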
Common Lisp has `with-open-file`, which guarantees files get closed regardless of what happens in the body, even if an exception is thrown (the macro closes the file and then propagates the exception in that case):
> If a new output file is being written, and control leaves abnormally, the file is aborted and the file system is left, so far as possible, as if the file had never been opened.
> Python doesn't give you the macros needed to write your own for other kinds of resource.
Python doesn't use macros, true. But it absolutely gives you the ability to do 'with' for other kinds of resource (and anything else). Look for the with-statement and documentation on defining your own context manager.
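A minimal sketch of such a user-defined context manager (the `ManagedResource` name and the fake resource are made up for illustration):

```python
# A user-defined context manager: __exit__ runs whether the body
# completes normally or raises, like try/finally.
class ManagedResource:
    def __enter__(self):
        self.closed = False
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True   # release happens on every exit path
        return False         # don't swallow exceptions

try:
    with ManagedResource() as r:
        raise RuntimeError("failure mid-body")
except RuntimeError:
    pass

print(r.closed)  # True: the resource was released despite the exception
```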
with-open-file is just a convenience macro around the underlying functionality which is provided by unwind-protect which is pretty much identical to try/finally in Java.
The ease of writing macros such as with-open-file means that most libraries with resources that need releasing also provide similar functionality.
When the code exits the try block, you are guaranteed that it will run the finally block, possibly entering the catch block first if an exception was thrown. Since the most common use of the `goto fail` pattern is freeing memory, the `finally` block isn't actually used a lot in Java/C# code in practice.
Thank you. Since the example was nothing to do with exceptions, and we don't want to abuse exceptions by using them when they are not needed, I will rewrite your example to cover the GP case:
I could be misunderstanding here, but I'm fairly sure that in the spirit of the paper `goto fail` would be preferable to throwing an exception and unwinding the stack at any arbitrary function call.
Dijkstra's original title was "A Case Against the GO TO Statement". The clickbait title "Go To Statement Considered Harmful" was chosen by the editor of the CACM.
I've extensively written code like this, though I don't seem to see others do it very often.
I suppose if you really hate gotos, then a "do { } while(0);" with a "break;" in the middle of it is equivalent.
There's just too much cleanup, such that avoiding gotos means you're either nested 200 columns into the screen or have half your function filled with partially-duplicated cleanup code.
Somewhere, in the long dark ago, a bunch of people got upset about long functions with return statements peppered throughout.
So we ended up with a Best Practice of one return clause per function. Which is triply stupid because 1) long functions are part of the problem, 2) breaking to the end with multiple assignments to the same variable is almost as bad, and 3) it's just trading one stupid for another, which is a waste of time and energy.
And then for many years a loud minority of us, some of whom actually have a vague understanding of how the human brain processes data, insisted on the pattern you describe here. Some of us call it “book ending”. All the returns are fast or proceed to the end. Any method that can’t accomplish either has too high a complexity anyway and gets split until you get either one return, or book ends.
[another responder calls them guard clauses, but this doesn’t quite apply in all situations. A clause that handles a default or a bad input is why you can rearrange, but it misses why you should: ergonomics]
The single return best practice comes from the old days of C. Back then, compilers didn't inline function calls as aggressively as they do now, and the overhead of a call was a relatively significant fraction of execution time. That led to longer functions being a good idea.
At the same time, C does not have any features for reliably managing resources. No GC. No C++'s RAII. Instead, you have to be careful to release every resource you acquire in the body of a function, along every execution path.
In that world, avoiding multiple returns makes a lot of sense because it ensures every execution path reliably reaches the end of the function so it's easier to ensure resources are released.
We aren't in that world anymore, so the guideline no longer applies. But it wasn't a stupid rule. It's just unhelpful to take it out of its original context and expect it to apply to a new one.
"Check your boots for scorpions before putting them on" isn't a dumb rule in the desert, but it is in Antarctica.
Ah yes, that's true about C. I think my aggressive distaste comes from seeing people cargo cult this into languages like Java, Python, and Ruby.
Makes no sense to turn your code inside out for resource management when you picked a language where you don't have to turn your code inside out for resource management.
Even in C this kind of early return can be good. In the example above you would be returning before allocating any memory, which makes the memory you do have to manage simpler.
> Somewhere, in the long dark ago, a bunch of people got upset about long functions with return statements peppered throughout.
I got taught this matter of style in the context of 90s C/C++. The general idea was that many return points likely mean many points where you have to clean up any state you've been trying to manage/encapsulate within the function... in particular the state of manually managed memory. Which probably means mistakes.
It's a reasonable argument for manually managed memory, but it loses much of its power once you're not managing memory by hand. And these days, most of us are not.
I believe this is commonly referred to as a guard clause? I tend to prefer writing things this way too... but I could see an argument for putting something in an else block instead of just leaving it there.
Personally, I find the first version easier to read since it makes the structure explicit. I.e. there are two blocks and you will execute one or the other.
To me the second version makes it more difficult to differentiate the structure:
if (...) {
    foo()
    return ...
}
return bar()
from:
if (...) {
    foo()
}
return bar()
I find that when I'm reading code I'm generally more interested in the overall structure than in the specific implementation details. Especially since if the first block was complicated enough to tax my cognitive overhead, then I would probably extract it into a function anyways.
There's a reason the pattern is referred to as "exiting early". Another description might be using guards: you can imagine each of the if statements as a "guard" against a condition that might cause you to exit early. You shouldn't do this if the guarded branch is doing something complex. This becomes much clearer when you have multiple such clauses, with the mental note that each conditional should contain only a return statement with a single expression.
function foo(bar) {
    if is_null(bar) {
        return TypeError
    }
    if !validate_input(bar) {
        return ValueError
    }
    if cached(bar) {
        return cache(bar)
    }
    // Added as a workaround for client XYZCorp since...
    if bar == "A special value" {
        return "A special result"
    }
    return do_a_complex_thing_to(bar)
}
You could equivalently structure it as a bunch of else ifs, but there's no real reason to do it that way.
I like and use early exits when they fit well, which is often. And in this case I'd probably do as you suggest. But it's worth mentioning that even if you stick with if/else it's often worth it to put the simple case first to reduce mental load:
if cache.contains(key) {
    return cache.get(key)
} else {
    v = db.query("..")
    expiry = calcexpiry(v)
    cache.put(key, v, expiry)
    return v
}
I think 'else' shouldn't be shunned where appropriate. I usually omit the else when checking function pre-conditions, whereas I'll include an else (even with both branches returning) if the two branches represent comparable operations. English can be horribly abused to fit anything, but usually saying it out loud can help: "If the user id isn't numeric - we're done. If the user doesn't have access - we're done. If the user wants to view the resource in purple shift, let's format it like so and give it back to them; otherwise we'll use a yellow shift and hand it off."
If it's a bug, sure. However, it might simply be a "not found" - for example, you're trying to decide whether to insert or update a record. You look for an existing one and don't find it - you shouldn't throw here, because "record not found" is a valid outcome.
Unless it’s a one-off thing, it’s also worth considering adding a cache class. This can have the advantages of further decoupling the cache and computation logic, and being able to switch the cache implementation (e.g. to use a no-op cache for debugging). It also hides the “if” by introducing a “cache pattern”:
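A sketch of what such a class might look like (the `CachedStore` name, the callback, and its signature are all made up for illustration; a real one would also honor expiry):

```python
# A hypothetical cache wrapper: callers ask for a value and never see
# the contains/get/put plumbing. Names are invented for illustration.
class CachedStore:
    def __init__(self, compute):
        self._compute = compute   # e.g. wraps db.query + calcexpiry
        self._entries = {}

    def get(self, key):
        if key in self._entries:
            return self._entries[key]
        v = self._compute(key)
        self._entries[key] = v
        return v

calls = []
def slow_upper(key):
    calls.append(key)             # track how often we hit the "db"
    return key.upper()

store = CachedStore(slow_upper)
print(store.get("abc"))  # ABC  (computed)
print(store.get("abc"))  # ABC  (served from the cache)
print(calls)             # ['abc']: the computation ran only once
```

Swapping in a no-op cache for debugging is then just a different class with the same `get` signature.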
Thanks for posting this! I recently had quite a few discussions about this topic with people who insist on _always_ using if/elseif/elsif/... constructs in such situations. I tried to argue that in most situations an "if (a) {return}; if (b) {return}; do_c()" construction is much easier to understand, but I had little success convincing them.
I always say that if you know the answer, why convolute the code? Just return the damn answer and we're done here.
I see these early returns as a list of assumptions. It helps to have them nicely lined up, with perhaps a comment before each, and an empty line between them.
And then when you get below to the, now non-nested, code that actually does some work you have a smaller mental burden needed to understand what is actually done.
Personally, I think it is better to have functions small enough that they fit in a handful of lines; the if-else structure makes it obvious that you are doing one thing or the other, rather than having separate if statements at the top of the function that may or may not exit before you reach the main body.
I think it’s a judgement call: I like clearly indicating when there’s always a single output path but wouldn’t use the single return style in that cache example if it involved an extra lookup operation for a value which is already known.
Yes - the cache example is bad; my point is the general idea of avoiding branches that merge back into the main function.
In your example above, the branch does merge back, and clearly that's fine for a small function - and common pattern - like this. But the general idea would be to avoid it if possible, because that removes unnecessary complexity.
One small problem is that depending on how your threading and cache works it might be possible for the cache to be cleared after the contains call but before the get.
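One way to sidestep that race (sketched in Python, with a plain dict standing in for the cache) is a single lookup with a sentinel instead of separate contains/get calls:

```python
# One lookup instead of contains() followed by get(): the value can't
# be evicted between the check and the read, because there is no gap.
_MISSING = object()  # sentinel; distinguishes "absent" from a cached None

def lookup(cache, key, compute):
    v = cache.get(key, _MISSING)
    if v is not _MISSING:
        return v
    v = compute(key)
    cache[key] = v
    return v

cache = {}
print(lookup(cache, "k", lambda k: 42))  # 42, computed
print(lookup(cache, "k", lambda k: 99))  # 42, served from the cache
```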
imho, a get() from cache when you already have the value seems like an unnecessary waste of cycles and network traffic.
plus this arrangement would seem vulnerable to a pathological case where you found the V from the db, but have a full/failed/partitioned cache and end up faulting or otherwise not returning the V.