I disagree. If null were like Maybe, then you should be able to do:
E foo;
...
foo = foo.bar().bat()
without having to do a null check after each function.
I think that Java's approach has the worst of both worlds, because there is no way to make foo.bar().bat() safe when a function could return null.
In C, for example, calling a method on a null object does not cause an error. Rather, it passes in null as the 'this' value, allowing you to do your null checks within the method.
I agree that C handles this much better than java. But your description is inaccurate. In C there are no methods, objects or 'this', which is far better.
This article makes it seem like you'd have to explicitly check whether a Maybe value is Nothing when you use it. This is certainly safe, but it's also very awkward; as a contrived example, adding two numbers would look like this:
case a of
  Nothing -> Nothing
  Just a -> case b of
    Nothing -> Nothing
    Just b -> Just (a + b)
This is quite a bit of boilerplate hiding the expression that actually matters--a + b! Moreover, whenever you have code that creeps steadily to the right, it means you either messed up or missed an abstraction.
It turns out that this pattern--do a computation if all the values are present, but return Nothing if any of them are Nothing--happens very often. Happily, we can get some nice syntax for Maybe computations like this using do-notation in Haskell or for-comprehensions in Scala:
do a <- a
   b <- b
   return (a + b)
This is much better! It makes even more sense for more complicated expressions, especially when later results depend on values of earlier ones. However, for a simple example like a + b, it's still quite a bit of boilerplate; we can certainly do better! Here are two alternatives using functions from the Control.Applicative module:
(+) <$> a <*> b
liftA2 (+) a b
The important idea here is that both versions somehow "lift" the (+) function to work over Maybe values. This just means they create a new function with checks for Maybe built in. This is great because it saves all the boilerplate above and nicely abstracts away most of the null checks while preserving safety. But the syntax is still a bit awkward. Happily, if you don't mind using a preprocessor[1], you can get some very nice syntax called "idiom brackets":
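(|a + b|)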
The computation inside the (| and |) is lifted over Maybe, just like the two previous examples. I think this is the clearest option here: it has the least syntactic overhead, and the base expression--a + b--is very easy to read. They also have the advantage of nesting, so you can express a + b + c, where you want a null check for all three variables, as:
(|(|a + b|) + c|)
This isn't perfect, but I think it's still very easy to follow. It might be better if the (| and |) were a single character, something like this:
⦇⦇a + b⦈ + c⦈
However, some people really don't like Unicode symbols in their code :(. Happily, you can have the source look like (|foo|) and have Emacs replace it with ⦇foo⦈ without actually changing the code. It's basically Unicode syntax highlighting. I think this leads to the most readable code so far.
So my main point is that you can abstract out the common case where you check for Nothing and make the whole expression Nothing if any sub-expression is. This saves quite a bit of typing and much more importantly makes the resulting code far easier to read.
Another really cool part is that all these syntax forms and functions are not specific to Maybe--they actually work for a whole bunch of different types. So you would not be bloating your language by including special features just for safely checking nulls; these features are much more general.
You can also make Maybe an instance of various typeclasses for even nicer syntax:
import Control.Monad (liftM, liftM2)

instance Num a => Num (Maybe a) where
  (+) = liftM2 (+)
  (-) = liftM2 (-)
  (*) = liftM2 (*)
  abs = liftM abs
  signum = liftM signum
  negate = liftM negate
  fromInteger = Just . fromInteger
> Just 4 + 2 * Just 6
Just 16
> Nothing * 42
Nothing
Notice how the fromInteger method allows you to freely mix Maybe and non-Maybe numbers.
I don't have a whole lot of Haskell experience ... but these "Maybe" unwrapping functions are rare, right? Like a null check, I'd think that most of the time you make the check when you get the unreliable input or allocation or whatever, and thereafter in the guts of your program you have the certainty that the input has been "checked".
The real trick is that, conceptually, these are not really "unwrapping" functions--instead, they're "propagating" functions. All they do is take Maybe values from deep inside your computation and string them through to the outside.
In practice, this means that you write a decent part of your program using these techniques, which creates a block of code that produces a Maybe value after taking a bunch of Maybe inputs. Then you only use a case statement at the very end, when you need to plug the Maybe value back into normal code.
All these functions are useful for one particular case: you don't know what to do with a Nothing value, so if you see one anywhere, you just pass it on: your final result is Nothing. That pattern just turns out to be very useful.
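A minimal runnable sketch of that shape, using Data.Map.lookup as the source of Maybe values (the example itself is contrived):

import qualified Data.Map as Map

spouses :: Map.Map String String
spouses = Map.fromList [("alice", "bob")]

-- the Maybe from Map.lookup is propagated by do-notation;
-- a Nothing anywhere makes the whole result Nothing
greetingFor :: String -> Maybe String
greetingFor person = do
  spouse <- Map.lookup person spouses
  return ("Hello, " ++ spouse)

-- only at the boundary with normal code do we case on the result
main :: IO ()
main = case greetingFor "alice" of
  Just g  -> putStrLn g
  Nothing -> putStrLn "no spouse on record"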
Not quite. They are not restricted to error handling. You can have methods returning a Maybe something without it being an error (just like there are plenty of valid reasons for returning null instead of throwing an exception). You can also use Maybe in a data structure:
data Employee = Employee { name :: Text, spouse :: Maybe Text }
You may also want to use Either to store two possible outcomes of an operation, though I would recommend using your own sum type for clarity:
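-- an illustrative sketch (type and constructor names made up);
-- compare the less self-documenting Either Text Employee
data HireResult = Hired Employee
                | Rejected Text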
If Nothing is ever an error, then you can add code to handle that case. It isn't really different from some method returning an empty list and other functions being basically no-ops afterward.
Depends on the structure of your program. When the Maybe represents the return value of some function that might fail, like a "null" in another language, you'll probably try to eliminate the Nothing case fairly soon after getting it, and if you have several to eliminate you might use the monad syntax to avoid repeated checks. (User input will commonly use an Either or Error rather than a Maybe, so that you have some error status for what went wrong, but the same principle applies.)
However, sometimes a Maybe represents an inherently "optional" part of your data model, such as "a Foo may have zero or one Bar". In that case, you'll probably hold onto the Maybe until the point where you'd actually read and use that field.
Yes, that's the great thing about Maybe — any code that doesn't need to care about the uncertainty is freed up from worrying about nulls by guaranteeing it won't get one. Most of your functions will usually deal with the unwrapped type, so if you try to pass the Maybe to these functions, the type-checker will say, "Wait, this function isn't expecting a Maybe. You've done something wrong here." So you have to do your "null check" at the point where you get your Maybe, and then you know for certain that the rest of your code won't explode with a NullPointerException or whatever.
Yes. This is perhaps the most important point in this thread. Monads and idioms are neat, but for Maybes in most situations they are not really necessary, since there is often a single point where you `case` on a Maybe and that's it.
Sounds like a bad idea to me. Even though Num doesn't have an explicit contract, we sort of expect it to behave "nicely".
But here we start with a nice ring like Integer and end up with a type that has this weird, extra element that has no inverse with respect to addition, etc.
At least in Clojure, it's considered bad form to extend your own protocols (Clojure's rough analogue of type classes) to types you don't own. Isn't that also true of Haskell?
Haskell's type classes, unlike ML functors (I think), are "coherent", which means that you can't have scoped or multiple instances for the same type, lest you risk breaking the type system. With that in mind, making Maybe an instance of Num would mean a reduction in the number of type errors caught in other code that uses Maybe and Num near each other.
There are occasions where it can be useful to have a module export an orphan instance for compatibility before it makes it into the more appropriate spot in the standard libs.
And you're right you can't have multiple instances for the same type; a newtype wrapper is required.
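In fact, the standard library already uses that trick for Maybe: Data.Monoid's First and Last are just newtype wrappers that give Maybe two different Monoid instances:

newtype First a = First { getFirst :: Maybe a }  -- mappend keeps the first Just
newtype Last  a = Last  { getLast  :: Maybe a }  -- mappend keeps the last Just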
That particular syntax makes much more sense when you have functions that return a Maybe value:
do a <- getA "foo" "bar"
   b <- getB "foo" a
   ...
I used a deliberately overly simple example so I could go on from do-notation--which many people are already familiar with--to applicatives and idiom brackets.
Besides, it looks like any normal program, except you're using <- to define variables rather than =. Can't see how it could be any clearer than that.
Except that what it looks like isn't actually what it's doing (made even clearer by the a <- a example).
I like the Maybe concept and non-nullable types; I just think being able to overload operators like "=" and ";" in C++ and Haskell is optimizing writability over readability and in most cases, readability is by far the more important attribute.
You can't overload = or ; in Haskell. What you can do is to easily write code over some sort of 'boxed' values, where the type of the box is given by type signatures, and more importantly is almost always quite clear from context.
In this case, think of Maybe as a box containing one or zero instances of a type. For Maybe:
do x <- maybeAnInt
   y <- maybeAnotherInt
   return (x + y)
Lists are 'boxed' values containing any number of elements:
do x <- [1,2]
   y <- [10,20]
   return (x + y)
The monadic semantics for lists means this returns the sum of each combination of values, namely [11,21,12,22]. This is essentially like a database join, which is why a monadic structure was used for LINQ.
For Promises:
do x <- intPromise
   y <- anotherIntPromise
   return (x + y)
This returns a new promise containing the sum of the results of the two promises, instead of using callbacks.
Essentially, all these values live in a Maybe/List/Promise box. There are many more examples. The nice thing about monads is that the semantics of how these things work is abstract enough to allow a variety of interpretations, but constrained enough (by the Monad laws) that you get a good intuition of how things work after using a few different instances.
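For reference, the Monad laws that provide that constraint:

return a >>= f   =  f a                      -- left identity
m >>= return     =  m                        -- right identity
(m >>= f) >>= g  =  m >>= (\x -> f x >>= g)  -- associativity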
But you're quite clearly NOT overloading "=" or ";". Instead of saying
do
  a = getFoo x y z
  b = getBar p q
(which won't compile), you're saying
do
  a <- getFoo x y z
  b <- getBar p q
Which is not an overloaded operator. In fact, it's not even meant to suggest equals (which in Haskell rather strictly means mathematical equality) but rather assignment ("=" in the expression "x = x + 1;").
The use of Haskell here was just for illustration; there's nothing stopping a programming language designer from having clear, convenient syntax for dealing with Maybe values.
An expression like liftA2 (+) a b can result in Nothing if either a or b is Nothing, but
(fromMaybe 0 a) + (fromMaybe 0 b)
will always result in a number, with 0 being used in place of Nothing. If only one of them is Nothing, you'll get the value of the other, the Nothing having been treated as 0.
You did misunderstand: you took a friendly illustration of an alternative way to use Maybe as some kind of absolutist decree. Well, put down your sword. Much of the time when you have a Maybe you also have a sensible default value, and you want to use that rather than live the rest of your life inside Maybe. fromMaybe is just a handy way to deal with that case.
So maybe this is just a terminology thing, but, isn't Maybe the same thing as:
1. By default all types are non-nullable.
2. You explicitly mark if you expect that a value could be null.
3. The compiler helps you by either making sure you check for null on those values you mark, or by doing some magic (like Scala does) that will always return null for expressions where one of the values is actually null.
Is the reason we call it Maybe/Option/whatever just to disambiguate with the traditional use and lack of safety in what pretty much all languages use "null" for? Or is there a distinction I'm missing?
Yes, it's the same idea. The main difference is that Maybe is just a normal type in languages like Haskell and OCaml--you do not need any special compiler support for it.
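In Haskell the entire definition is a single line:

data Maybe a = Nothing | Just a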
So to use Maybe, all you need from the compiler is to not have nulls everywhere. Since you don't need language support for it, your language is simpler and the Maybe behavior is part of a library.
This also ensures that your Maybe values behave as first-class citizens. You can do anything with the Maybe type that you could with any other type, because that's all it is. For example, this means that you can nest them and have a Maybe<Maybe<A>> value. It also means Maybe plays well with other libraries; in Haskell, for instance, it works immediately with the alternation operator:
result = tryA "foo" <|> tryA "bar" <|> tryB
Part of the beauty is that <|> is an operator that represents alternation for a whole bunch of other types as well. There are many other functions like it.
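As a sketch of why nesting is genuinely useful (the function and types here are made up):

-- with null, "no employee" and "employee with no spouse" collapse into one
-- null; with nested Maybes they stay distinct
lookupSpouse :: String -> Maybe (Maybe String)
-- Nothing         = no such employee
-- Just Nothing    = employee exists but has no spouse
-- Just (Just s)   = employee exists and has spouse s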
So: yes, you can have language support for it. But just having it as a normal type makes the language simpler and ensures you have full generality. The only thing that you need from your language is to get rid of null.
You DO need special compiler support for Maybe! Specifically, you need algebraic types. Consider Java. There is no pattern matching and no syntactic construct to decompose algebraic types. So an Option has to be cast to Some before extracting its value, risking a ClassCastException if the Option was actually a None. This is just as bad as having nulls, since programmers will just cast without looking.
My point was that you do not need to support Maybe explicitly. Instead, it naturally flows from significantly more general language features; Maybe itself is just a normal type built with these features.
Following your argument, we need special language support for any sort of abstraction, because everything has to be built in terms of some built-in language features at some point.
Also, on a largely unrelated note, I think that there is no reason for modern languages not to have sum types in this day and age. (cough Golang cough)
I've been working with engineers writing production Haskell for the first time; its lack of a "null" concept is proving to be absolutely incredible. In a high-uptime production environment, it's at least as valuable as Haskell's enshrinement of purity.
We have a convention that all recoverable errors be captured in Either or Maybe. This policy is paying off in a huge way because the type system forces us to think about what error cases mean. This is driving us to write substantially higher-quality code than I've written with other tools.
Another story is that you may end up with something like:
Just x  -> something x
Nothing -> halt_and_catch_fire  -- impossible
In Haskell there is even a standard library function fromJust which does exactly that.
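Its definition is essentially:

fromJust :: Maybe a -> a
fromJust (Just x) = x
fromJust Nothing  = error "Maybe.fromJust: Nothing"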
One introduces this hack knowing that this particular variable will always be Just and having absolutely no way to deal with Nothing; later, somebody else sees the type Maybe Foo and figures that it must be OK to put a Nothing in there.
Well, assuming "halt_and_catch_fire" is something like error foo or undefined, you're essentially going out of your way to circumvent the type system. It's possible to do it, but it's frowned upon.
I can't believe there is no mention of Ceylon here. Ceylon has an elegant approach to this using union types.
Typesafe null and flow-dependent typing
There's no NullPointerException in Ceylon, nor anything similar. Ceylon requires us to be explicit when we declare a value that might be null, or a function that might return null. For example, if name might be null, we must declare it like this:
String? name = ...
Which is actually just an abbreviation for:
String|Null name = ...
An attribute of type String? might refer to an actual instance of String, or it might refer to the value null (the only instance of the class Null). So Ceylon won't let us do anything useful with a value of type String? without first checking that it isn't null using the special if (exists ...) construct.
void hello(String? name) {
    if (exists name) {
        print("Hello, ``name``!");
    }
    else {
        print("Hello, world!");
    }
}
That is exactly equivalent to Haskell's Maybe, except I don't know if it's possible to abstract over type constructors in Ceylon to implement e.g. Functor and Monad.
I think the largest hurdle I have getting up and running with Maybe is that it just isn't in my fingers yet. As such, it can really slow you down if you're trying to build a bare-bones prototype without layering. Which, given the popularity of dynamic languages, I would think is fairly common.
That is, if you are going to use Maybe, you either want to do so from the beginning, or you want a good layer of abstraction between where the value is optional and where it is not. Moving something from guaranteed to optional is a bit more cumbersome with Maybe.
Also, I have gotten really used to "truthy" values.
> Groovy isn’t really about statically enforcing things

They added a statically-compiled mode to Groovy last year. Most users, like Grails, don't use it yet, but it's there to try out, and you can report any problems to the Groovy issue tracker.
How many problems are actually caused by null (in particular, how many billions of dollars?)
While pointers that point to the wrong place for various reasons (off the end of an array, previously freed memory) cause horrible issues to this day, I can't personally remember ever having a serious issue with a null pointer; they tend to crash quickly and loudly, because in all modern OSes dereferencing NULL segfaults.
The problem isn't just NULL in C. This post is talking about the entire Null/Nil reference problem across all languages that use a null-type value.
This is especially a problem in dynamic languages that sling nils around... like any major modern scripting language. Checking if a value is nil before proceeding is aping what a language like Haskell does when it pattern matches against Maybe (Just a, Nothing), albeit in an after-the-fact, bad way. Granted, you can't really make any assertions about reflecting a Maybe value in the types of a language that doesn't care about types before runtime.
That's the main problem with calling Maybe 'better'. If you don't have static type checking then either you implement it just for Maybe or you end up with the same sorts of problems as you had before.
Making types non-Nullable by default is nice, though. Even in a dynamic language you can have a syntactic distinction to make interfaces more explicit.
I actually started designing statically typed Python, since it seemed a fun exercise. Halfway in I realised I was effectively redesigning Haskell with slightly different syntax and stopped.
Not sure that's so far from Haskell, really... :-P
Maybe it's my lack of experience talking. Haskell feels heavier in a lot of places. I think the focus on compilation in the backend is almost a downside here, too.
Can it be in isolated places? That's doable...
Depends on the application. There's a lot of hardware out there that does really weird stuff. There's some interesting work in Haskell-space (e.g. Atom), but I don't know how mature it is.
In some scenarios, this is a serious issue all by itself. My day-to-day work is mostly on Android, and eliminating nullable references altogether would eliminate some crashes, which are highly visible to the user.
I've worked with a C++ code base that would accumulate significant amounts of
if (argumentX == NULL)
    return NULL;
at the top of function bodies. It was just defensive programming. Maybe argumentX couldn't actually be null, but it would take time to figure that out (sometimes I did that, though). More code means harder to read and maintain, thus costing dollars.
This would also be contagious: If a piece of code checks if X is null, you'll assume that X can be null, whether or not that's true.
I'd certainly prefer to be able to reason about the code with the safe assumption that certain things cannot be null.
> I'd certainly prefer to be able to reason about the code with the safe assumption that certain things cannot be null.
You can do this, but if you want it to be maintainable, you'll also want to detail in the function comments this technical debt. If you don't, someone else will come along and see your sweet method (looking only at the comments) and use it where the input can be null.
Ex:
/**
* This does some stuff.
* @param entry Does something with this
* DEBT: Assumes the input entry is not null.
*/
void doSomething(SomeObject entry) { }
Why just put it in the documentation? Documentation is liable to drift from implementation, and AFAIK no compiler or runtime verifies the accuracy of comments. I'd feel much better about adding asserts to the original code, leaving it for a few generations of testing and exposure, and then eventually remove the conditionals. The assert calls then function as executable documentation.
Depending on your user-base (i.e. if you distribute headers to other developers with precompiled code), the documentation may be necessary on its own... but it's much weaker than an assert.
Java has supported @Nullable and @Nonnull for a while; it's pretty standard in good Java code these days, and IDEs will perform static analysis to make sure you use these annotations consistently (e.g. warn you if you are checking a @Nonnull value against null, or if you are forgetting to test a @Nullable value against null before dereferencing it).
I've worked on a lot of legacy Java code where these kinds of null checks cause problems. They end up confusing things in corner cases.
You should verify arguments at a top level, then let the code underneath blow up with a NullPointerException if something unexpected happened. Your stack trace will point at where the problem lies.
I find for some reason that a lot of people want to use null instead of empty lists where you'll end up with this ugliness:
if (list != null) {
    for (Thing t : list) {
        processThing(t);
    }
}
Instead of just passing in a Collections.emptyList() instead.
I agree with one exception: if you're storing the value passed down into a structure which will survive the call, then the value should be checked for null before being placed in the structure. This is because the eventual null pointer exception may occur long after it was put in the structure, obscuring how it got there.
Even if it can't be null now, the calling code may change in the future. And did you check all the possible error conditions (e.g. failed allocation)? If you are relying on a call returning non-null, best to check every time.
I don't know about it in terms of money loss, but generally having missed checks caught at compile time rather than having the program crash is a good thing.
There used to be a whole class of Linux vulns involving mmap()ing memory at virtual address 0x0, filling it with fake kernel data structures containing some data value val and some pointer ptr, and triggering a NULL dereference in kernel code that was known to parse this structure and copy val to the address pointed to by ptr.
They had to "fix" it by blocking userspace memory mappings at the 0th page.
Great article! I'm especially glad to see "what have we gained?" in the FAQ, as it is probably the most frequently asked question I've heard.
OP - I noticed you were thorough enough to mention both Fantom and Kotlin in one section, so for the preceding section you might want to note that CoffeeScript also has a safe-invoke operator like Groovy's.
The null object pattern and Maybe/Option are different concepts. The most important point: A NOP for some type T exposes T's API directly, while Maybe/Option has its own API (and the type system enforces that Option[T] can't be treated as a T).
No. The idea is that you can get rid of null-the-value by having a Maybe type. They serve exactly the same purpose; having a null value in every type is like making each type implicitly wrapped in a Maybe.
The reason we can compare the type and a value is because they serve exactly the same purpose in different ways. The core argument of the article is that we should get rid of null because Maybe does the same thing in a safer way.
Pardon my confusion, but is null referring to that of Java-style reference types, in which case the article is really about those? The value itself seems irrelevant as null is practically equivalent to Nothing. Now, if comparing the types then I'd say that Haskell's Maybe has the following improvements over Java-style optional:
1. Opt-in
2. Not tied to reference types
3. Safe extraction/"dereferencing", i.e. case expressions rather than Java's implicit "Maybe t -> t"
All of which aren't restricted to a static type system.
I suppose you could write a sort of safe Maybe template class in C++. You could pass in two callbacks to handle Just and Nothing respectively. It would be really ugly, but it should work, right?
In the title, the words "Maybe" and "Null" are values of the type "Strategies that a programming language can take for dealing with nullable values." There is no type mismatch; there is ambiguity in the syntax, but type inference clears it up.
Making null fuzzy may save your program from a crash, but it also makes execution unpredictable. I would prefer a crash--which would lead to better localization and bug fixing--to indeterminate behaviour that is hard to debug and maintain.
There are a ton floating around, they just aren't terribly useful because every type is by default nullable. So even standard library functions can return null. You might be able to make your own code a little safer, but you still have to null check everything.
Null would mean whatever the programmer intended it to mean. In the C world, it seems to me, it tends to mean "nothing", while in SQL it usually means "unknown".
Guava is definitely cool and I recommend it whenever I can. It's worth noting, however, that, as a Java library, it cannot provide the compile-time guarantees mentioned in the post. That is, your Option type could still be null and so you're still forced to perform run-time null checking. If you want the full benefit of Option types on the JVM, you might be better off with Scala.
Scala still has the problem that you can return null explicitly (since references on the JVM are nullable), but any Scala code which does this should probably be shot on sight.
The Optional<T> in Guava just forces one to explicitly check for presence and then get the contained object during development. A typical pattern would be:
Given an object a of type Optional<MyObject>, we write:
if (a.isPresent()) {
    MyObject o = a.get();
}
Of course, I could still do a.get() without evaluating isPresent() and end up with a java.lang.IllegalStateException. Here, we are merely "reminded" to do the null check.
Yes, you'd need a linter to stop you from using explicit null in your code, but that can be layered on top of javac with something like FindBugs and its friend @Nullable. It's not "standard" Java, but it is compile-time enforcement.
If you work on types that might be null, you're bloody careful or you're in the wrong profession. We're not calling incompetent programmers a gazillion dollar mistake, even though they probably have caused as much.
I would agree that the real world can bite one in the ass, out of the blue--just, what the, things are trying to eat my ass!
As for herding spherical cows and the vacuum, I'll do my best:
val maybeCow = Some(1) // SomeOne is the spherical mu (i.e. you)
maybeCow getOrElse 0   // Cow here becomes 1 with the universe
Were maybeCow initialized with a None value, then into the vacuum it goes...and out it comes with a safe 0 to keep order in our [application] universe.
I'm very amused. About 80% of my career has been in the "real world", so I actually have some authority in calling this. No power in the world can subvert incompetence.
Look everyone, dscrd has solved the problem of software having bugs. Just be careful! Man, why didn't anyone think of that before? The software industry is going to be so much nicer now that you eliminated bugs.
If the database structure is encoded in your type system, and all your code that accesses the database (including the initial population of the database) is checked against that encoding, the compiler can absolutely statically check the correctness of code that relies on the structure and contents of the database. Several of the mainstream Haskell database packages do this.
Still, IMO this may be true in theory. In practice, most of the errors we encounter come from an incomplete understanding of the input we're dealing with: many little rules that are very hard to formalize.
Incomplete understanding of input or domain is certainly a cause of problems, and I don't know that there is a way the type system can help in the general case.
With regard to databases in particular, though, it is comparatively easy to constrain things so that you don't have that problem, and enlisting the type system there makes sense. There are other areas that have useful approaches as well - consider protocol buffers, for instance.
I prefer to let the type system catch what it can, and use tests where it can't--often type info can be leveraged to help in the testing, too. I recommend checking out QuickCheck if you haven't.
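For instance, here's a tiny property where QuickCheck derives the generator from the type, so both the Nothing and Just cases get exercised automatically (the property itself is made up for illustration):

import Test.QuickCheck
import Data.Maybe (fromMaybe)

-- QuickCheck builds random Maybe Int inputs from the type signature alone
prop_defaultUsed :: Maybe Int -> Bool
prop_defaultUsed m = case m of
  Nothing -> fromMaybe 0 m == 0
  Just x  -> fromMaybe 0 m == x

main :: IO ()
main = quickCheck prop_defaultUsed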
Java is really not the greatest example here. In fact, the source article here is basically saying as much by calling into question some of the Java conventions.
I think you'll find in a language like Haskell that a lot more information can be embedded in types than in something like Java.
I think you should reserve judgement until you've worked with a more sophisticated type system than Java's. I held the same opinion for quite some time, but Scala and Haskell have shown me that much more is possible than I expected.
Still, on balance I prefer dynamic languages because I don't like to spend time defining and using type relations, but I am no longer sure I really know which way is "better" for many types of tasks.
I worked with statically typed languages (mostly Java) for many years and just recently started working with dynamic languages (python and javascript with node).
I think it depends mostly on the task at hand... if you're on a small project with a small team, then probably the dynamic languages are more suitable. With larger teams, it starts to pay off to have more info encoded in the type system, so that everybody is on the same page.
For a small startup trying to get a product out quickly, the dynamic paradigm is a game-changer. And not only in the programming language: using a NoSQL database was also a game-changer for me in terms of development speed.
Static languages like Haskell are the best of both worlds. You develop with roughly the same speed as dynamic languages, but then you get the performance and maintainability of static languages.
How much time do you spend instead on unit tests? Is it comparable? The idea with a Haskell (or more powerful) type system is to reduce the number of unit tests you need to write (because the compiler enforces certain rules). It's all trade-offs, but personally, I know I make mistakes, and I know I won't remember to check for all possible type errors in my unit tests. So I like that the compiler does it for me.