Sending null to /dev/null
In a recent talk at QCon titled "Null References: The Billion Dollar Mistake" [1], Sir Charles Antony Richard Hoare himself – the inventor of Null (and QuickSort, and many other things that shaped our industry) – states that Null was, and is, a bad idea.
What is Null?
Here is an explanation from Wikipedia:
Null is a special pointer value (or other kind of object reference) used to signify that a pointer intentionally does not point to (or refer to) an object.
From an object-oriented language's point of view, Null is a value whose type is a subclass of every other type in the system. It is always at the bottom of the hierarchy. Because of this and the Liskov substitution principle, Null can be used anywhere another type would normally be used. Have a method that is declared to return a String? It can return Null.
What is the problem with Null?
The problem with it is that every method can return Null. And you, as a consumer of that method, need to check every time if the return value is Null. If you don’t – you (usually) get a Null pointer exception and your program crashes.
Now, exceptions are a very good tool – but particularly the Null pointer exception is a bad thing. Why? Because if it bubbles more than one level up it’s a sign of a leaky abstraction. Let me back up this claim:
First, what does a Null pointer exception mean? Let’s have a look at the documentation of it in Java:
Thrown when an application attempts to use null in a case where an object is required. These include:
- Calling the instance method of a null object.
- Accessing or modifying the field of a null object.
- Taking the length of null as if it were an array.
- Accessing or modifying the slots of null as if it were an array.
- Throwing null as if it were a Throwable value.
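To make the list above concrete, here is a small Java sketch that triggers each of those cases. The class name NpeCases, the Holder class and the throwsNpe helper are made up for this illustration; they are not part of any library.

```java
// Hypothetical demo class: every name here is illustrative only.
final class NpeCases {

    // Tiny class with a field, used for the "field of a null object" case.
    static final class Holder {
        int value;
    }

    // Runs an action and reports whether it ended in a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String text = null;
        Holder holder = null;
        int[] numbers = null;

        // Calling an instance method of a null object.
        System.out.println(throwsNpe(() -> text.length()));
        // Accessing the field of a null object.
        System.out.println(throwsNpe(() -> { int v = holder.value; }));
        // Taking the length of null as if it were an array.
        System.out.println(throwsNpe(() -> { int n = numbers.length; }));
        // Modifying a slot of null as if it were an array.
        System.out.println(throwsNpe(() -> { numbers[0] = 1; }));
        // Throwing null as if it were a Throwable value.
        System.out.println(throwsNpe(() -> { throw null; }));
    }
}
```

Each of the five lines prints true. Note that none of these cases says anything about the business problem; they only talk about references.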
Basically, it says that an object was expected but none was provided. Well, this is an implementation detail. I doubt that your business logic talks about references and such. We usually talk about real-world objects like people, accounts, etc. Any method that throws a Null pointer exception reveals parts of its implementation.
How many times have you forgotten to fill in a field in some form and got a Null pointer exception?
I forgot to write my email – what does that Null thing have to do with it?!?
A method should never throw a Null pointer exception explicitly (I've seen people do it) – an Illegal argument exception is a much better choice. Most of the time it is the underlying platform that throws Null pointer exceptions.
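As a rough sketch of that convention, a method can check its input up front and complain in terms the caller understands. The Signup class, the requireEmail method and the error message below are hypothetical names chosen for this example.

```java
import java.util.Locale;

// Hypothetical form-handling helper: names and message are illustrative.
final class Signup {

    // Rejects a missing email with a message about the business rule,
    // instead of letting a NullPointerException escape from deeper code.
    static String requireEmail(String email) {
        if (email == null || email.trim().isEmpty()) {
            throw new IllegalArgumentException("Email address is required");
        }
        return email.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        try {
            requireEmail(null);
        } catch (IllegalArgumentException e) {
            // The message names the missing field, not a reference.
            System.out.println(e.getMessage());
        }
    }
}
```

The caller still gets an exception, but one that names the missing field rather than a detail of the implementation.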
It’s just a convention, but Null pointer exceptions should be extinguished when sighted.
So…
How can we handle Null
Take 1
The most straightforward and widespread solution is to tell everyone to check values for Null references. But that is a blacklist approach and, for this kind of problem, it doesn't work. People forget things – sooner or later someone will skip a check and your 24/7 program will crash at 3 AM on Sunday. Unit testing can help a lot here, but it requires work and you still cannot be 100% sure.
Can we do better?
Take 2
I would like to point out that the following approach makes more sense in a statically typed language, where a compiler checks your types for validity.
Most of the time we return Null from a method when we have nothing to return: some return value is expected, so we throw Null at it.
Let’s have a look at the following Java code:
    public String tld(String domain) {
        int dot = domain.lastIndexOf('.');
        // No dot at all, or nothing after the last dot: there is no TLD.
        if (dot == -1 || dot == domain.length() - 1) {
            return null;
        } else {
            return domain.substring(dot + 1);
        }
    }
The method above returns the top-level domain part, or null if none is found.
The “right” way to use this method looks like this:
    String d = tld("example.org");
    if (d != null) {
        // Save to database
    } else {
        // Show the user a message
    }
But, as I pointed out earlier, there will come a time when you forget to make the check and the user sees a nasty exception, leaving them clueless about what they did wrong.
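Here is a minimal sketch of how that plays out. It carries its own copy of tld (with the trailing-dot check written as domain.length() - 1, which is my reading of the intent), and the ForgottenCheck class and describe method are hypothetical names for this example.

```java
// Hypothetical demo: shows what a caller gets when the null check is skipped.
final class ForgottenCheck {

    // Returns the top-level domain, or null when there is none.
    static String tld(String domain) {
        int dot = domain.lastIndexOf('.');
        if (dot == -1 || dot == domain.length() - 1) {
            return null;
        }
        return domain.substring(dot + 1);
    }

    // The author of this method forgot that tld may return null.
    static String describe(String domain) {
        return "TLD has " + tld(domain).length() + " characters";
    }

    public static void main(String[] args) {
        System.out.println(describe("example.org")); // TLD has 3 characters
        try {
            describe("localhost");
        } catch (NullPointerException e) {
            // Nothing in the signature of tld warned us about this.
            System.out.println("NullPointerException for localhost");
        }
    }
}
```

The code compiles without a single warning, and it works for every input anyone bothered to try before release.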
What we can do instead is signify our intentions that we might not return a value.
In Scala, we have the Option class – heavily inspired by Haskell's Maybe monad.
The Option class has two (final) subclasses [2]: Some and None. You return Some when you have something to return and None when you don't.
Let’s look at the tld method implemented in Scala using the Option class:
    def tld(domain: String): Option[String] = {
      val dot = domain lastIndexOf '.'
      // No dot at all, or nothing after the last dot: there is no TLD.
      if (dot == -1 || dot == domain.length - 1) None
      else Some(domain.substring(dot + 1))
    }
And the usage of it:
val com: String = tld("example.com").getOrElse("unknown")
The line above yields either the tld part of the domain or "unknown" if none is found. Concise, isn't it?
If we omit the getOrElse part, the compiler will complain that we are giving it an Option[String] where a String is expected.
    // Does not compile!
    val com: String = tld("example.com")
That way we are able to catch the error early (say, 17:00 on Friday, which is much better than 3:00 in the morning on Sunday).
What have we done? We told the type system that we might return a value, but not necessarily. Declaring the return type of tld as Option[String] instead of just String does exactly that.
So… Can we do that in Java?
Let’s give it a try.
We’ll have a JOption interface with JSome and JNone implementing classes:
    public interface JOption<T> {
        // True when there is no value.
        public boolean isEmpty();
        // The wrapped value, or other when there is none.
        public T getOrElse(T other);
    }
The JNone class will mark the lack of value.
    public class JNone<T> implements JOption<T> {
        public JNone() {}
        public boolean isEmpty() { return true; }
        public T getOrElse(T other) { return other; }
    }
and the JSome class – the presence of it.
    public class JSome<T> implements JOption<T> {
        private T value;
        public JSome(T value) { this.value = value; }
        public boolean isEmpty() { return false; }
        public T getOrElse(T other) { return value; }
    }
And now our revised tld method, using the new JOption machinery:
    public JOption<String> tld(String domain) {
        int dot = domain.lastIndexOf('.');
        // No dot at all, or nothing after the last dot: there is no TLD.
        if (dot == -1 || dot == domain.length() - 1) {
            return new JNone<String>();
        } else {
            return new JSome<String>(domain.substring(dot + 1));
        }
    }
And using it becomes:
String com = tld("example").getOrElse("unknown");
The tld method isn’t longer than our original Null-returning version, but now the compiler will help us, because if we write:
    // Does not even compile
    String com = tld("example");
we’ll get a compiler error.
Final thoughts
The proposed method probably isn't the best and can be improved [3], but it illustrates the concept.
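One possible direction for such improvements is sketched below: a get() method that throws for JNone and returns the value for JSome, plus equals and hashCode. This is only one way to do it. The choice of NoSuchElementException and the equality rules (all JNone instances are equal, JSome instances compare their values) are my assumptions.

```java
import java.util.NoSuchElementException;
import java.util.Objects;

// Sketch of an extended JOption: get() and equality are assumed additions.
interface JOption<T> {
    boolean isEmpty();
    T getOrElse(T other);
    // Returns the value, or throws when there is none.
    T get();
}

final class JNone<T> implements JOption<T> {
    public boolean isEmpty() { return true; }
    public T getOrElse(T other) { return other; }
    public T get() { throw new NoSuchElementException("JNone has no value"); }

    // All JNone instances are treated as equal, whatever their type parameter.
    @Override public boolean equals(Object o) { return o instanceof JNone; }
    @Override public int hashCode() { return 0; }
}

final class JSome<T> implements JOption<T> {
    private final T value;

    JSome(T value) { this.value = value; }

    public boolean isEmpty() { return false; }
    public T getOrElse(T other) { return value; }
    public T get() { return value; }

    // Two JSome instances are equal when their wrapped values are equal.
    @Override public boolean equals(Object o) {
        return o instanceof JSome && Objects.equals(value, ((JSome<?>) o).value);
    }
    @Override public int hashCode() { return Objects.hashCode(value); }
}
```

With equals in place, two results can be compared directly, for example new JSome<String>("org").equals(new JSome<String>("org")) is true. The get() method is a deliberate escape hatch: it brings back a runtime failure, so getOrElse stays the safer default.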
Also, while this solution might help, the Null value still exists and people will (ab)use it.
Scala has a Null value as well – for compatibility with Java.
On the bright side – Haskell, Erlang and other functional languages don’t have anything like Null in them and use some form of the described approach.
But on the dark side we have JavaScript – there you have undefined, null and NaN, which add even more coding horror. F# is in a similar state.
Please share your thoughts below.
__________________
[1] On 25 August 2009 InfoQ published the video.
[2] To be precise, None is an object (a singleton in Scala).
[3] We can also add other methods to JOption, like a get() that throws an exception for JNone and returns the value for JSome, as well as a proper equals implementation.