Hacker News new | past | comments | ask | show | jobs | submit login

Could you please elaborate on No. 11?

"Never use early exits". Could you please elaborate what you mean by that?

PS. I'm not trying to unintentionally start a flame war. Just trying to understand.




Excellent question that is difficult to answer. I'll give you a short response here and then write a blog post with example code when I have time. Put your email in your profile and I'll make sure to let you know when that's ready.

A little background: I have gotten many calls when legacy code has a bug or needs a critical enhancement and no one in-house is willing or able to figure it out. I'm no smarter than anyone else, but because I'm stupid and fearless, I often do something that few others attempt: I rewrite or refactor the code first, then work on the bug/enhancement. And the first thing I always do is look for early exits. This has always given me the most bang for my buck in making unintelligible code understandable. The only entry to any function or subroutine should be on the first line and the only exit should be on the last line.

This is never about what runs fastest or produces less lines of code. It's strictly about making life easier for the next programmer.

Early exits can make things really easy (I'll just get out now.) 20 lines of clean code. No problem. Until years later, when that function is 300 lines long, and the next programmer can't figure out what you're doing. Much of the maintenance had been done in emergency mode, each programmer just trying to get in and out as fast as they could.

Early exits make it much easier for bad things to evolve:

  - 200 line multiple nested if statements
  - 200 line multiple nested recursions
  - unidentified but critical "modes" (add/change) (found/notFound)...
Removing early exits forces you to understand what really should be happening and enables you to structure code into smaller, more intelligible pieces.

Short, hand-wavy response, but I hope that helps clarify some. Stay tuned for a better answer...


I'm not sure if I would agree on this. Take for example the Linux kernel. You have basically early exits just everywhere. Take any random file from here: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stabl...

It often goes like this:

  check if we have already done this or so -> exit
  check if this is possible to do. if not -> exit
  do some first thing. on error -> exit
  do some other thing. on error -> exit
  ...


This coding style is also referred to as "guard clause", embraced by e.g. Martin Fowler (http://martinfowler.com/refactoring/catalog/replaceNestedCon...) and Jeff Atwood (http://www.codinghorror.com/blog/2006/01/flattening-arrow-co...) (and by me ;-))


He also said his statement "This is never about what runs fastest...". Kernel code needs to run fast and take advantage of shortcuts.

IME, the jobs most programmers are doing don't need to try to accomplish maximum speed or need to wring a few bytes out of RAM. Certainly we don't want to be wasteful, but long-term maintainability is more important, again IME, than absolute speed or minimising memory footprint by a few (k) bytes.

Back in the bad old days when I was writing programs to run on mainframes, yeah, we did need to fight for every byte. A $5 million machine back then had less RAM and less raw CPU power than a tablet does today. We don't live in that world now.


> He also said his statement "This is never about what runs fastest...".

This style has nothing to do with running fast, it has to do with lowering the cognitive load of the rest of the function. A branch means you have two states to keep in mind (the state where the branch is taken, and the state where the branch is not taken). Without early exit, you have to keep both states in mind until the end of the function just in case.

With guard clauses, you can discard on of the states (the one which matched the clause) entirely.


Computers can do that repeatedly and reliably with even (hundreds of) thousands of discarded states. People reading code and trying to understand what's actually in play at line umptyfratz of function iGottaGetThisFixedNow(), not so much.

In the end, I'm not disagreeing with early exits per se, just that over time they can make it more difficult to understand function because assumptions about state have to adjust as a maintainer goes through the code. Those assumptions may have been crystal-clear to the writer originally but how many times is the original writer the only maintainer?


> And the first thing I always do is look for early exits. This has always given me the most bang for my buck in making unintelligible code understandable.

Ouch. It will be kind of amusing if we ever work on the same code - I frequently start a bug fix by refactoring to introduce as many early exits as possible. I find guard clauses so much easier to understand than nested conditionals that sometimes refactoring it like this is the only way I can understand the code at all. I would love to see a blog post where you compare different styles.

OTOH, If you're talking about things other than guard clauses then I think we might have a much more similar viewpoint.


> when that function is 300 lines long

This is what I would focus on avoiding instead.


What if you fail on line 20? Carrying on may cause cascading errors, what alternatives are there to return an error code directly apart from storing a return value in a variable, use a goto to the end and return once? (assuming this is C)


But, a computer scientist once argued, multiple returns are structured: they return to the same point. Knuth wrote an article on structured programming with GOTOs, and pointed out that this can save memory.


I disagree because I find that it makes my code more elegant. A lot of smart developers I respect have also embraced early exits. But then again I haven't yet worked with large legacy codebases.

I'm still learning so I'd be grateful for a heads up if you make that blog post. I've put my email in my hn profile.


Also early exits are the exact same things as GOTOs. It equals to:

  goto end;
  ...
  end:
It should help people figure out why it's bad - in 99.99% of the cases.


I think you misunderstand why goto's are frowned upon. Goto's are bad because they allows you to jump to anywhere in the code, regardless of structure, which undermines all assumptions you can make about control flow.

Function calls and return are like goto that obeys structure, and therefore don't have the same problems.


Goto's are bad because they allows you to jump to anywhere in the code, regardless of structure, which undermines all assumptions you can make about control flow.

Knowing where you are jumping doesn't help you to make assumptions about the control flow.

Goto's are bad because they allows you to jump. The jump in itself is the problem because it breaks the instruction flow arbitrarily - without explicitly expressing the boolean condition for it. Early exits are of the same kind: they don't express explicitly the boolean condition of the jump. We know where we are jumping. Not why. With time, the boolean equation of the code which determines the instruction flow is unmaintainable. And then you end up not understand where your flow is going through, not because you don't know where a jump is going, but because you have lost why.

Most gotos, early returns, breaks and continues (C speaking) are considered to be bad habits for this reason.

return only goal is to return values to the function caller. Not to jump.

Function calls jump back to whatever call them so it's like there has been no jump at all in terms of instruction flow - you basically can continue to read the code assuming some code has been executed behind the function name.


While loops are the exact same thing as GOTOs. It equals to:

    start:
    if(!p) goto end
    ...
    goto start;
    end:
It should help people figure out why it's bad - in 99.99% of the cases ;)


A loop is based on a conditional jump (aka. a flow control statement in imperative langs) which makes all the difference. The problem with jumps (is usually called a jump, purely arbitrary jump) is that they are arbitrary: http://news.ycombinator.com/item?id=4627730


How do know what happens at an end-brace? Have to go look at the matching start-brace to see if it is a while or for or if. Madness! Can I assume the code above even executed once? No! What of I check the guard clause first? It has variables! Oh heavens, control flow depends on arbitrary RUNTIME data!?


I believe early exit is strictly better than `goto end`. There's no language enforcement requiring the "end" label to be at the end of the function. If there was, I wouldn't mind the use of `goto end`. I don't think your argument is persuasive to anyone who doesn't already agree with you.


Indeed. I have found "guard clauses" with return/error at top of function to be clearer and less error prone.


I believe he referred to the rule to use a single return statement at the end of any single function definition.


I'm glad you asked, I have the same question exactly.


I think the biggest problem is that early exits make you forget you have to clean things up before returning (a bigger problem in C and sometimes Java, than it is in modern dynamic languages). There are certainly other reasons, but I'll leave them to others that are more experienced!






Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: