Avoid optional, nullable attributes with this one simple trick

(Clickbait title, comment if it worked below :P)

TLDR, summary

Optional, nullable attributes needlessly introduce state to your objects and needlessly make your code more bug-prone and more complex. By default, optional, nullable attributes can and should be avoided. A very simple and idiomatic way to avoid them is to just take them out of the class and let them be their own thing.

Longer version

How many times have you come across classes like this

NOTE: Overuse of String and Int for everything is a problem all of its own, primitive obsession, but that's a different topic altogether - the topic of this post is strictly about what attributes classes are composed of, not necessarily their types

class User {
  String userId;
  String username;
  String bestFriend;
  String favoriteBand;
}

where userId and username are the essential attributes that constitute the core thing the class represents, and bestFriend and favoriteBand are optional, nullable attributes that don't really have anything to do with what constitutes a user.

You've probably come across this more times than you can count, even in fresh code, right? Probably most classes in our codebases contain a mix of non-nullable and nullable attributes. (or at least, a very significant share do)

Why it's bad

Because any given object can now be in the state of 1) having all attributes 2) all but bestFriend 3) all but favoriteBand 4) just userId and username.

That's 4 (!) states! Vs just the 1 possible state if it always had all attributes.

Note how only 2 optional attributes added 4 states - the number of possible states grows exponentially with the number of optional attributes (n optional attributes give 2^n states).
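To make the growth concrete, here's a minimal sketch that just enumerates every set/null combination for a list of optional attribute names. OptionalStateDemo and combinations are made-up names for illustration, not anything from a real codebase.

```java
import java.util.ArrayList;
import java.util.List;

class OptionalStateDemo {
    // Hypothetical helper: lists every set/null combination of the given optional attributes.
    static List<String> combinations(List<String> optionalAttributes) {
        List<String> result = new ArrayList<>();
        int n = optionalAttributes.size();
        // Each bit of mask decides whether one attribute is set or null, so there are 2^n masks.
        for (int mask = 0; mask < (1 << n); mask++) {
            StringBuilder state = new StringBuilder();
            for (int i = 0; i < n; i++) {
                boolean present = (mask & (1 << i)) != 0;
                if (i > 0) {
                    state.append(", ");
                }
                state.append(optionalAttributes.get(i)).append(present ? "=set" : "=null");
            }
            result.add(state.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Two optional attributes: prints the 4 states from the text.
        combinations(List.of("bestFriend", "favoriteBand")).forEach(System.out::println);
        // Five optional attributes: prints 32.
        System.out.println(combinations(List.of("a", "b", "c", "d", "e")).size());
    }
}
```

Every one of those combinations is a state that some part of the code may have to handle.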

Aside from the mere conceptual mental complexity, now, very real errors will result from doing things like

listOfFriends.add(someUser.getBestFriend()); //java.lang.NullPointerException (here or later, when the null is used)

So now, every part of the code that deals with these objects will have to take state into account and deal with the possibly absent values, eg

if (someUser.getBestFriend() != null) {
  listOfFriends.add(someUser.getBestFriend());
} else {
  //do nothing
}

Worse, the compiler won't warn about the problem, it will just result in runtime errors in production if not dealt with.
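Here's a small sketch of what that looks like in practice, assuming a User shaped like the one above with a plain getter. NullAttributeDemo and bestFriendNameLength are hypothetical names for illustration. Everything compiles cleanly, the null slips into the list without complaint, and the failure only shows up when the value is actually used.

```java
import java.util.ArrayList;
import java.util.List;

class NullAttributeDemo {
    // Same shape as the User class above; bestFriend may be null.
    static final class User {
        final String userId;
        final String username;
        final String bestFriend;

        User(String userId, String username, String bestFriend) {
            this.userId = userId;
            this.username = username;
            this.bestFriend = bestFriend;
        }

        String getBestFriend() {
            return bestFriend;
        }
    }

    // Hypothetical helper: compiles without a warning, throws if bestFriend is null.
    static int bestFriendNameLength(User someUser) {
        return someUser.getBestFriend().length();
    }

    public static void main(String[] args) {
        User loner = new User("xxx", "foo", null);

        List<String> listOfFriends = new ArrayList<>();
        listOfFriends.add(loner.getBestFriend()); // ArrayList accepts the null silently
        System.out.println(listOfFriends);        // prints [null]

        try {
            bestFriendNameLength(loner);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException at runtime");
        }
    }
}
```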

Also, conceptually, when you eg want the best friend of someone, you want the best friend object, not the user (let alone all details about that user) who considers that person their best friend!

The solution

Optional, nullable attributes can very simply and idiomatically be completely eliminated by just moving them out

class User {
  String userId;
  String username;
}

they could be turned into their own thing

class BestFriend { 
  //...however you wanna implement
}
 
interface BestFriendService {
  Optional<User> findBestFriendOf(String userId);
}

class FavoriteBand { 
  String userId; 
  String bandId; //or String bandName or whatever, however you wanna implement
}
 
interface FavoriteBandService {
  Optional<FavoriteBand> findFavoriteBandOf(String userId);
}
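For the sake of a runnable example, here's one way the best friend part could be sketched, assuming a simple in-memory map as the backing store. InMemoryBestFriendService and setBestFriend are made up for illustration; a real implementation would sit on top of whatever persistence you actually use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

class SplitUserDemo {
    // Core user: every attribute is always present, so there is only one possible state.
    static final class User {
        final String userId;
        final String username;

        User(String userId, String username) {
            this.userId = Objects.requireNonNull(userId);
            this.username = Objects.requireNonNull(username);
        }
    }

    interface BestFriendService {
        Optional<User> findBestFriendOf(String userId);
    }

    // Hypothetical in-memory implementation, only here to make the sketch runnable.
    static final class InMemoryBestFriendService implements BestFriendService {
        private final Map<String, User> bestFriendByUserId = new HashMap<>();

        void setBestFriend(String userId, User bestFriend) {
            bestFriendByUserId.put(userId, Objects.requireNonNull(bestFriend));
        }

        @Override
        public Optional<User> findBestFriendOf(String userId) {
            return Optional.ofNullable(bestFriendByUserId.get(userId));
        }
    }

    public static void main(String[] args) {
        User bar = new User("yyy", "bar");
        User smurf = new User("zzz", "smurf");

        InMemoryBestFriendService service = new InMemoryBestFriendService();
        service.setBestFriend(bar.userId, smurf);

        // prints smurf
        System.out.println(service.findBestFriendOf("yyy").map(u -> u.username).orElse("(none)"));
        // prints (none)
        System.out.println(service.findBestFriendOf("xxx").map(u -> u.username).orElse("(none)"));
    }
}
```

Only the code that actually cares about best friends has to deal with the "no best friend" case; everything else that handles a User never sees it.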

Note that this isn't some "clever" solution, and it doesn't rely on annotations, Optional, Kotlin nullable attributes etc etc - we're literally just moving the attributes out to be their own thing - which they are! They don't belong on the user object!

Common arguments

These are just some of the most common ones I see, but I'm sure more could be thought of.

Leaving the attributes on the User and using Kotlin nullable types (Type?) / Optional / @NotNull and @Nullable solves it

No, it doesn't. All they do is make the compiler warn about, and prevent, things like

listOfFriends.add(someUser.getBestFriend()); //java.lang.NullPointerException (here or later, when the null is used)

which is great! But the fundamental problem of state and complexity (both conceptual and in concrete code) is still there (ie when working with the object we still have to do if/map or whatever), and the class remains poorly designed, with attributes that don't really have anything to do with what constitutes a user.
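As a rough illustration, assume bestFriend stays on the User but is wrapped in Optional. OptionalAttributeDemo and collectBestFriends are hypothetical names. The null is gone, but the caller still has to branch on presence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class OptionalAttributeDemo {
    // bestFriend stays on the user, just wrapped in Optional instead of being nullable.
    static final class User {
        final String userId;
        final String username;
        final Optional<String> bestFriend;

        User(String userId, String username, Optional<String> bestFriend) {
            this.userId = userId;
            this.username = username;
            this.bestFriend = bestFriend;
        }
    }

    // The type is safer now, but the present/absent branching is still there.
    static List<String> collectBestFriends(List<User> users) {
        List<String> listOfFriends = new ArrayList<>();
        for (User user : users) {
            user.bestFriend.ifPresent(listOfFriends::add);
        }
        return listOfFriends;
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("xxx", "foo", Optional.empty()),
                new User("yyy", "bar", Optional.of("zzz")));
        System.out.println(collectBestFriends(users)); // prints [zzz]
    }
}
```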

Code review will catch any resulting bugs

Maybe! By all means, you could carefully code review to make sure there are no resulting bugs. But... why not just code in a way that completely avoids the issue, greatly simplifying the code review, so you (hopefully) won't have to catch them there at all?

What about making the attributes non-optional with default values?

Sometimes, this is an option. (pun intended) But who are you going to set as bestFriend, and what band are you going to set as favoriteBand? It's not always possible to set a default - many times, defaults are just set to shoddy non-values like "NA" or "unknown" or whatever just to please the compiler, when, really, that's a strong smell that something isn't right and there's an underlying issue that should be fixed instead. (ie redesigning the classes)
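A tiny sketch of how such a non-value can quietly leak into results, assuming favoriteBand defaults to "NA" when unknown. SentinelDefaultDemo and countByBand are made-up names for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SentinelDefaultDemo {
    // Hypothetical report: counts how many users have each favorite band.
    static Map<String, Integer> countByBand(List<String> favoriteBands) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String band : favoriteBands) {
            counts.merge(band, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two users never picked a band, so they got the "NA" default.
        List<String> favoriteBands = List.of("NA", "The Smurfs", "NA");
        // prints {NA=2, The Smurfs=1}
        System.out.println(countByBand(favoriteBands));
    }
}
```

The compiler is happy, but "NA" is now the most popular band, and every consumer has to know to filter it out, which is the same state problem in a different disguise.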

Sometimes attributes really are optional, nullable / sometimes optional, nullable attributes are useful, practical etc

Sure! Despite what it may seem, I’m actually not making some fundamentalist argument that absolutely no class must ever contain optional, nullable attributes. I'm simply saying, rather than by default casually and liberally adding optional, nullable attributes to classes, instead, by default, try to keep classes free of optional, nullable attributes as much as possible, and only introduce them when truly appropriate.

What if I always need to load users including their best friend and favorite band

In that case, by all means, maybe splitting things up like this is pointless. But even then, tbh, I would still keep things separate, because it's just good class design. This may seem like a contrived and trivial example, but if you always casually and liberally just keep adding attributes to the first available class, without taking the time to ask yourself if that's really where they belong, that's how you end up with classes like the one below. While this is an extreme example of a terribly, terribly designed class, classes like this are not uncommon. (Note, most of these attributes will be null.) (Note, this is not me being facetious, this is an actual class that was once widely used until we split it up into more cohesive classes.)

public class User implements Serializable, IUser {
    private static final long serialVersionUID = -5174074337721179065L;
    private long id = 0;
    private String firstName;
    private String middleName;
    private String lastName;
    private Byte personalTitle;
    private Byte passengerType;
    private String emailAddress;
    private Byte promoMailOKBool;
    private Short group;
    private Byte VIPBool;
    private Byte throttleBool;
    private Byte tieBehaviourOKBool;
    private Byte thirdPartyPromoMailOKBool;
    private Byte role;
    private Short roleWithShort;
    private String placeCodeDefault;
    private String roleArray;
    private Short langID;
    private String anotherRoleArray;
    private Short langIDTrvlr;
    private Integer maxRowCount;
    private Byte statusID;
    private PhoneNumber phoneNumber;
    private PhoneNumber mobilePhoneNumber;
    private PhoneNumber faxNumber;
    private ScoreUserRole scoreUserRole;
    private Calendar creationTimestamp;
}

Misc

Benefits the database too

Since class design often maps closely to the underlying persistence, the proposed solution brings the same benefit to your persistence layer, whether that's a relational database schema or objects stored as JSON or what have you, eg instead of your tables looking like this

mysql> select * from user;
| user_id | username | best_friend_user_id | favorite_band |
| xxx     | foo      | null                | null          |
| yyy     | bar      | zzz                 | null          |
| zzz     | smurf    | yyy                 | The Smurfs    |

they will look like this

mysql> select * from user;
| user_id | username |
| xxx     | foo      |
| yyy     | bar      |
| zzz     | smurf    |

and

mysql> select * from best_friends;
| first_user | second_user |
| yyy        | zzz         |
...

mysql> select * from favorite_bands;
| user_id | favorite_band |
| zzz     | The Smurfs    |
...

Benefits infrastructure and performance too

I don't know why, but often, you'll find that optional, nullable attributes are fundamentally different from the core attributes in ways that relate to their

nature (eg they often turn out to constitute relationships between things rather than things themselves, which is kind of true for both bestFriend and favoriteBand)
"load profile" (ie while a user object's userId and username only need to be loaded once per day, and change even more rarely, if ever, bestFriend and favoriteBand might change far more often and turn out to need a completely different caching ttl etc etc. It kind of sucks to have to needlessly add traffic to the user table, reloading and storing the whole user into caches etc when really you're just trying to keep up to date with bestFriend and favoriteBand!)

so splitting things up helps with stuff like that too, especially when operating at scale like we do.

Where have I seen that before?

This post is kind of an excerpt of https://medium.com/expedia-group-tech/database-pointers-73e476f1e687 that I decided to put into its own, dedicated post.

androidfred/optional_nullable_attributes.md