Nulls

11 Feb 2025

I’m interested in SQL nulls now because I’d like to replace exceptions in a pure functional language with something like nulls, for the sake of referential transparency. I’m not going to talk about that in this post. I’m going to talk about SQL nulls and how they might be better.

SQL nulls mostly use a three-valued logic, of the “Kleene logic” variety. Since nulls might be considered “unknown” and they often don’t even meaningfully compare to normal values, they’re infectious…

3 + null == null
4 > null == null
true and null == null
not null == null

and more surprisingly:

(null == null) == null
(null != null) == null
(3 == null) == null

This is why SQL has a special syntax for is null. And of course 3VL breaks the principle of excluded middle: x or not x is no longer necessarily true.

In fact, the only cases in which nulls are not infectious (ideally) are where they wouldn’t matter:

(true or null) == true
(false and null) == false

SQL famously has some problems with nulls. Most practically, it’s very easy to unconsciously compare against nulls in where and on clauses, in which case rows will be silently dropped from the result. E.g.

select * from sometable where x == null
-- always return 0 results

Or more insidiously:

select * from sometable where x != y
-- omits rows in which y is null

So here are two solutions, perhaps complementary. I haven’t seen them proposed, but I have seen a lot of serious complicated proposals, so I’d love to hear what I’m missing. I’m not thinking at all of gently transitioning any real SQL, which could be a bit more challenging. Green fields forever!

When a query or inner join refers to a null valued field, the query should return nothing but a single row with all null fields (instead of the rows without null valued fields, and silently omitting the null results). This makes nulls in queries infectious, just like nulls are infectious in the scalar context. The same of course should apply to all aggregate functions. Force people to handle the nulls.
Static analysis could help a lot. From the schema, the type checker knows which fields are nullable, and could require explicitly handling that possibility. e.g.
```
  select * from t where nonnullX > 123 --fine
  select * from t where nullableY > 123 --doesn't compile!
  select * from t where nullableY//0 > 123 --allowed, 0 is used instead of null
```
I realize static type checking of SQL or similar is quite a can of worms. :)

RSS