
Uncle Bob's insistence that functions should be 2-4 lines long is baffling to me. I don't understand how he can be taken seriously. Is there a single application in the entire world with substantial functionality that conforms to this rule?


I've seen this in what I call 'lasagna code' - multiple thin layers that seem to do nothing (or almost nothing) but each one is an implementation of some abstruse interface that exists in the mind of the original developer.

Eventually, your code has to do something. Get that thing in one place where you can look at it as a whole.


John Carmack would disagree with Uncle Bob and John Carmack actually programs.

My own experience is that, with an IDE that can collapse a new scope in the middle of a function, you can write large functions that accomplish a lot and are very clear, by writing a comment and starting a new scope.

If something is going to be called multiple times, a new function makes sense. But this idea that anything that can eventually return a single value needs to be its own function is a giant pain that creates more problems than it solves.

Just making a new scope in the middle of the function lets you use all the variables in the outer scope, do transformations without introducing new variables, and ultimately "return" a new variable to the outer scope.

I've never understood why polluting namespaces with dozens or hundreds of names (most of which won't be very descriptive, since naming a hundred small things is already confusing) is seen as a good idea. You look at the list and have no idea what is important and what was shoved in there to satisfy some do-nothing public speaker's arbitrary rules.


The problem with collapsing is that you need to know a priori which sub-scopes are independent and hence collapsible and which aren’t. Meaning, you have to analyze the unfamiliar code first in order to know which sub-scopes you might want to collapse. And, given an already-collapsed sub-scope, due to it being collapsed you can’t see if it reads or mutates some local variable. The benefit of extracted functions is that you know that they are independent of the implementation details of the caller (i.e. they can’t possibly depend on or modify local variables of the caller other than the ones passed as arguments).

Too many arguments can become a problem. Nested functions can help here, because they allow you to move their implementation “out of the way” while still having access to shared variables. And sometimes a collection of parameters can sensibly become its own class.
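As a sketch of that nested-function idea in Python (the names and numbers are hypothetical), the helper's body is moved "out of the way" while still reading the caller's locals:

```python
def total_with_tax(amounts):
    tax_rate = 0.2  # a local shared with the helper below

    # Nested helper: its body is out of the way, yet it still reads
    # tax_rate without tax_rate being passed as an argument.
    def with_tax(amount):
        return amount * (1 + tax_rate)

    return sum(with_tax(a) for a in amounts)
```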

IDE affordances are fine, but I’m opposed to requiring reliance on them for reading and understanding code, as opposed to writing.


> you need to know a priori which sub-scopes are independent and hence collapsible and which aren’t.

What does independent mean? I would just collapse them all because they were meant to be collapsed in the first place.

> you have to analyze the unfamiliar code first in order to know which sub-scopes you might want to collapse

Unfamiliar? I wasn't refactoring and collapsed them all.

> The benefit of extracted functions is that you know that they are independent of the implementation details of the caller (i.e. can’t possibly depend on or modify local variables of the caller other than the ones passed as arguments).

That's true to an extent, but this is more a way to make monolithic functions simple, which then makes the program simpler overall because you can avoid lots of tiny little functions. You can end up with programs that do nontrivial things but don't have tons of functions confusing the issue.

Pragmatically, this isn't really a problem. The whole "it isn't exactly a 1:1 replacement" objection isn't the point. You can still put in comments and const references if you really want to.

> IDE affordances are fine, but I’m opposed to requiring reliance on them for reading and understanding code, as opposed to writing.

Why would it be required? The alternative is that you still have these commented sections with their own scope; they just aren't collapsed. You can always work on the code without the IDE, and when you go back to the IDE it still works.

The reality of it is that you can see a broad overview of a function, then see the details one section at a time, without even skipping around to other parts of the file or to other files.


Too often I see functions that are shells that reshuffle the arguments and pass them to another function, which also reshuffles the arguments and forwards them to another, and on and on. One was 11 layers deep.


I have seen such code too - written with just the S, K and I combinators. It wasn't readable.


And a lot of people don't understand how dangerous shuffling parameters is, especially in languages that do not have named parameters...


Powershell is an awful language but it has made me fall in love with named parameters. Name all the things.


If you have a 2-4 line function and you spot this, you can trivially remove it.


Yes. This works, but only if the functions are pure and you're using pure function composition.

Uncle Bob doesn’t mention this.

   createSpecialString(x) =
       capitalizeFirstLetter .
       makeAllLowercase .
       addNumberSuffix .
       removeLetterA .
       removeLetterB .
       concatWithWord(x)

   capitalizeFirstLetter(a) = a[0].upper() + a[1:]
   makeAllLowercase(a) = map(a, (t) => t.lower())
   addNumberSuffix(a) = a + 3.toString()
   removeLetterA(t) = filter(t, (s) => s.lower() != "a")
   removeLetterB(t) = filter(t, (s) => s.lower() != "b")
   concatWithWord(x) = (y) => y + x

   
There, see? It's mostly doable with pure function composition, where the dot represents function composition. I program like this all the time. No way anyone can pull this off while mutating state and instantiating objects.

   F . P = (x) => F(P(x))
Forgive some inconsistent formatting and naming; I'm typing this on my phone.
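A minimal runnable sketch of the same idea in Python (the stage names and the suffix/word values are hypothetical stand-ins for the pseudocode above):

```python
from functools import reduce

def compose(*fns):
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), fns)

# Hypothetical stages mirroring the pseudocode above.
capitalize_first = lambda s: s[:1].upper() + s[1:]
make_lowercase = str.lower
add_number_suffix = lambda s: s + "3"
remove_letter_a = lambda s: "".join(c for c in s if c.lower() != "a")
remove_letter_b = lambda s: "".join(c for c in s if c.lower() != "b")
concat_with_word = lambda word: lambda s: s + word

# The top-level function reads top to bottom; the rightmost
# (bottom) stage is applied first.
create_special_string = compose(
    capitalize_first,
    make_lowercase,
    add_number_suffix,
    remove_letter_a,
    remove_letter_b,
    concat_with_word("!"),
)
```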

People who complain about this style tend to be unfamiliar with it. If you know procedural coding styles and a function-composition approach like this, this style is usually easier, because the high-level function literally reads like English. You don't even need to look at the definitions; you already know what this complicated string-formatting function does.

No comments needed. And neither author tells you about this super-modular approach. They don't mention the critical thing: this style requires functions to be pure.

Thus, to get most of your code following this extremely modular and readable approach, much of your code must minimize IO and state changes and segregate them away as much as possible.

The Haskell type system, with the IO monad, pushes programmers in this direction.

Again neither author talks about this.


Based on his blog, Martin has been getting into Clojure in recent years. I was kind of hoping that the experience with a functional lisp would shift some of opinions that he previously stood by in Clean Code, but based on this discussion, it doesn't seem like it.


`createspecialString` is seven lines long though.


Then split it. All pure functions are easily decomposed into the most primitive units, so even the most dogmatic asshole can't talk shit.

   createSpecialString(x) = createFormattedString . createNewString(x)

   createFormattedString =
       capitalizeFirstLetter .
       makeAllLowercase .
       addNumberSuffix

   createNewString(x) =
       removeLetterA .
       removeLetterB .
       concatWithWord(x)
Or put it all on one line.

      createSpecialString(x) = capitalizeFirstLetter . makeAllLowercase . addNumberSuffix . removeLetterA . removeLetterB . concatWithWord(x)
The number of lines becomes off topic once you get into this style. It's a completely orthogonal concept, irrelevant to readability and modularity.

Line counts don't make sense for pure, non-imperative functions. Lines ONLY make sense for imperative functions, because each line represents an instruction.
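The layering reads naturally as a tree of compositions. A small self-contained Python sketch (the primitives are hypothetical stand-ins, and compose is assumed to be right-to-left):

```python
from functools import reduce

def compose(*fns):
    # Right-to-left composition: compose(f, g)(x) == f(g(x)).
    return reduce(lambda f, g: lambda x: f(g(x)), fns)

# Layer 1: primitive functions (hypothetical stand-ins).
strip_spaces = lambda s: s.replace(" ", "")
lowercase = str.lower
add_suffix = lambda s: s + "3"

# Layer 2: a mid-level function composed from primitives.
normalize = compose(lowercase, strip_spaces)

# Layer 3: the top level, composed from layer 2 plus a primitive.
create_string = compose(add_suffix, normalize)
```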


Well, the most interesting thing about purely functional composition would be the ability to un(de)compose them, inlining the bodies until the resulting function is large enough to be worth the effort of reading it.


Exactly. You have the lowest-level primitives, and you can compose them however you want, forming arbitrary layers of abstraction from a tree of compositions.

In the first example there is one layer. In the second there are two layers of abstraction, formed by composing the primitives into a tree.


There are. A lot of Java code bases look like this.

It is all as bad as you imagine. Functionality is spread all over the place, so it is very difficult to reason about how it all hangs together.


I once fully spelunked such a Java call stack to convert some code to Go. It was amazing: there were about 5 layers of indirection over some code that actually did what I wanted, but I had to trace it fully, keeping the arguments from the whole call stack in mind, because several of the layers of indirection had the potential to do substantially more work with much more complex dependency graphs. I ended up with a single Go file with two functions (one that reproduced the actually interesting Java code, and one that called it the way it would have been called across all the layers of indirection). It was less than 100 lines and _much_ easier to understand.


It's always fun stepping through 412 stack frames that are all two-line methods to figure out where the thing you're interested in actually happened.


Add in Go and C++ code bases to that as well.


Yes, I've worked on a couple of codebases like that. It's glorious, you break everything down little by little and every step makes sense and can be tested individually. Best jobs I've had.


But are those steps actually doing anything that can be tested? My experience with these sorts of codebases was always that most of the functions aren't doing much other than calling other functions, and therefore testing those functions ends up either with testing exactly the same behaviour in several places, or mocking so heavily as to make the test pointless.

Or worse, I've seen people break functions apart in such a way that you now need to maintain some sort of class-level state between the function calls in order to get the correct behaviour. This is almost impossible to meaningfully test because of the complex possible states and orders between those states - you might correctly test individual cases, but you'll never cover all possible behaviours with that sort of system.


> testing exactly the same behaviour in several places

I think that's actually fine. Particularly if you're doing a testing pyramid style approach where you do a lot of tests of some piece of low level logic and then a few tests of the higher level piece that makes use of that lower level logic, I don't see any problem with the higher level test covering a codepath that's also used in the lower level test. If anything I find it makes it easier to understand and debug failures - you know that the behaviour of the lower level piece hasn't changed because otherwise the lower level test would have failed, so the bug can only be in the higher level component itself.


These large compound statements look nice when they are perfect, but when you build giant expressions without intermediate variables, they are much more difficult to test.

When you have small expressions whose incremental results are stored in variables, you can see each stage's result in a debugger.
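For instance, the same kind of pipeline written with named intermediates (the stages here are hypothetical) gives a debugger something to show at every step:

```python
def format_string(s):
    # Each stage lands in its own named variable, so a breakpoint
    # (or a print) can inspect every intermediate result.
    no_spaces = s.replace(" ", "")
    lowered = no_spaces.lower()
    suffixed = lowered + "3"
    capitalized = suffixed[:1].upper() + suffixed[1:]
    return capitalized
```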


Like a lot of his approach, it's great for teaching people better coding skills in an educational setting, but doesn't make as much sense in the real world.


It is a bit weird.

However, a friend of mine was a professional Smalltalk programmer. He claims that his median line count of methods, over his 17 year career, was 4.

It is harder to do in other languages--it seems that C would be on the order of 10.

Clearly it is a rule that can lead to the complexity of too many methods, compromising whatever gain the smaller methods give you.


> median line count of methods

Auto-generate getters and setters for every instance variable and that will drag the median down. (Maybe a lot of those getters and setters should not have existed.)


Not part of Smalltalk.


fyi the Cincom Smalltalk IDE New Class dialog shows these fields and checkboxes—

     Name: MyClass
     Superclass: Core.Object
     Instance Variables: anInstVar anotherVar

     Create methods:
         Accessors
         Initializer
         Subclass responsibilities

"Accessors" aka getters.

    ~
See 1996 "Smalltalk with Style"

page 113 get method

page 117 set method

https://rmod-files.lille.inria.fr/FreeBooks/WithStyle/Smallt...

    ~
"After the creation of the class StUser, it is highly recommended that you perform automatic creation of instance variable accessors. One possible way to do this in Squeak is to use the context menu on a class among the entry more.... There, you can find create inst var accessors. Select this command to have all accessors created."

http://www.hpi.uni-potsdam.de/hirschfeld/seaside/tutorial?ch...

etc etc


You should be baffled, because I never presented that as a hard and fast rule.


fwiw Lines-of-Code / Method (Digitalk V/286 image, 1989) = 7

https://dl.acm.org/doi/10.1145/74878.74904


This is something that's easier to read than it is to understand. A lot of languages force you to do quite a lot in a function, and becoming blind to bloat is way too easy (C++/Go/Java/etc., yep).

He did an example in the article of:

void concurrentOperation() { lock(); criticalSection(); unlock(); }

So if you substitute criticalSection with a lot of operations, such as open file, read lines, find something, close file, I think you have a better representation of an over-bloated function.

Scala has the language support to show what this could look like.

What you're doing in that method is starting a critical section, doing something, and then ending a critical section. It's a good suggestion to break that with:

def criticalSection(f: => Unit): Unit = { lock(); f; unlock() }

Now you have a single method that does one thing and is easy to understand. It's also reusable.

The original code would be used as:

criticalSection { doSomething() }

That replacement is no longer dependent on locking. Locking is layered in.
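The same separation can be sketched in Python (the lock and the worker function are hypothetical): the critical section is passed in as a function, so callers never touch locking at all.

```python
import threading

_lock = threading.Lock()

def critical_section(f):
    # Acquire the lock, run the passed-in work, and always release,
    # even if f raises. Locking lives in exactly one place.
    with _lock:
        return f()

def do_something():
    return 42
```

Callers then write `critical_section(do_something)`, and swapping the locking strategy later only touches `critical_section` itself.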



