A Post About Nothing, or, How I Learned To Stop Worrying And Love Nullable

Iain Holder has an interesting post about Nullable<bool> over on his blog, about the pitfalls of using nullable boolean values:

What does null mean? It's got to mean something. You're either pregnant or you're not. You can't have a third state. A light switch is either on or off. If a light switch doesn't exist then it's potentially very dangerous. Almost as dangerous as having a nullable bool.

nullable_field_warning

Personally, I like null; I think it’s a wonderfully useful and often-misunderstood concept. Null is not dangerous per se, but used correctly, it can be a very effective warning sign.

Null means we don’t know – or alternatively, that question doesn’t make sense in this context.

To run with Iain’s example for a moment, let’s write a patient tracking system for a hospital; every time a patient arrives, we store their medical details in a database – blood type, next of kin, whether they’re pregnant, that kind of thing – so that the doctors can recommend appropriate treatment. All fields are required – no nulls allowed. It’s nice, simple, saves time, saves lives.

All very well, until an ambulance brings in an unconscious woman who might need X-rays. Is she pregnant? She might be – you don’t know; she came in alone, and she’s unconscious. The clock is ticking… an X-ray might save her life, but if she’s pregnant, it could harm her unborn child. If our patient system makes us choose true/false with no third option, we’re potentially making a very dangerous assumption either way. We need a third option, so when a doctor gets that patient’s chart, they see that we don’t know, and they can make sure they do a pregnancy test before sending the patient for X-rays or administering potentially dangerous medication.

I’m not arguing that NULL is the only solution to this problem. The problem is universal, and nullable fields is just one of many possible solutions. It happens to be a solution that’s natively supported in most databases and platforms, and I personally think the semantics of “null = don’t know” are rather nice. You may disagree – but you still need something to indicate when data in your model is potentially inaccurate or missing.

Yes, null has no place in a perfect model…

If your model (whether it’s an OO domain model or a relational database model) is complete, perfect, accurate and consistent, then you’re laughing. You will never have null values because your model maps perfectly onto the problem domain you’re working in, and you know every detail of every  entity in that problem domain. You know every single customer’s date of birth; you have detailed records of the marketing preferences of every person in your database, and your model is so perfectly tuned to your business that there’s no sparse tables, no outer joins -

…but there’s no such thing as a perfect model

A model is an abstraction, and abstractions always leak. The real world isn’t domain-driven, or relational, or object-oriented – these paradigms are just ways of slicing and storing information about the real world that help us solve a problem or do a job. A lot of the time we’re making decisions based on the value of the information, but sometimes we need to make decisions based on whether that information is present or not.

A slightly less contrived real-life example might help. In one of the systems I work on, we have a nullable bit field in our main customer database for e-mail opt-in preferences. We interpret the true/false/null values as follows:

True This customer has agreed to received marketing information via e-mail
False This customer has chosen not to receive marketing information via e-mail
Null We have no record of this customer's preferences.

This is the second Google Image search result for "null". I have no idea what it is - something to do with magnetic topology - but I think it's quite beautiful. When we send out e-mail newsletters, we only include the customers whose field is actually true – that’s the point of opt-in marketing, right? Any customer can log in and change their preferences at any time, so the customers who have opted-out (i.e. their value is false), we leave them alone – they’ve said they don’t want to get hassled, so we don’t hassle them, and if they change their mind, they can log in and reactivate their newsletter any time.

So what about null? Well, when a customer logs in, they get a personalized welcome page. If we see that their opt-in field is null, we add a message to this page saying “Hey, we have this e-mail newsletter you can subscribe to - could you take a moment to let us know whether you’re interested?” The point is, as soon as they’ve expressed a preference one way or the other, we’ll stop showing them this message, so a customer should only see this message the first few times they’ve logged in. We end up with better data; they don’t get unwanted e-mail, and everyone’s happy.

The weird swirly picture there turned up in a Google Images search for “null” – I don’t know enough about magnetic topology to have the faintest idea what it is, but it’s quite beautiful, don’t you think? Image © Colin Beveridge