Use C# 8 nullable reference types in docfx 3

Docfx 2 has long been plagued by the billion-dollar mistake: every now and then it crashes with a NullReferenceException. In the early days of docfx 3, when we were discussing the engineering guidelines, the C# team was already prototyping the idea of nullable reference types (or non-nullable reference types). Knowing that NullReferenceException is a big pain and that a language feature to prevent it was likely on the way, I sketched a strategy on null handling for docfx 3.

But it was not until recently that we finally moved to .NET Core 3. Among all the goodies of .NET Core 3, the most exciting feature to me is nullable reference types, and I’ve been experimenting with them in docfx 3. Now that the conversion is complete and the whole project is “null safe”, it’s time to review the null strategy and see what worked and what the caveats are.

The null strategy

Docfx 3 has been designed as a tool rather than a library from the beginning: all we ship is an executable. This avoids a bunch of problems like binary compatibility, API design, etc. It also makes null handling more convenient.

⚠️ The null strategy described here is specific to docfx; some of the principles may not apply to other projects.

Prefer non-nullable types

Whenever possible, use non-nullable types to save unnecessary null checks. Provide a default value for data models:

class Blog
{
  public string[] Tags = Array.Empty<string>();
}

Replace argument null checks with nullable reference types

In the absence of nullable reference types, we used a null default value to indicate that an argument may be null:

object ParseJson(string json, string sourcePath = null);

Now with nullable reference types, if an argument can potentially be null, add the nullable reference type modifier ? to its type:

object ParseJson(string json, string? sourcePath = null);
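What the annotation buys inside the method can be sketched as follows. The method body and the JsonUtility class are hypothetical (only the signature comes from docfx), and the return value is a placeholder rather than a parsed object; the point is only which dereferences the compiler accepts without a check.

```csharp
#nullable enable
using System;

static class JsonUtility
{
    // Hypothetical body: only the signature is taken from the article.
    public static object ParseJson(string json, string? sourcePath = null)
    {
        // json is declared non-nullable, so no argument null check is written.
        // A caller passing a null literal gets warning CS8625 at compile time.
        var length = json.Length;

        // sourcePath may be null. Dereferencing it directly, for example
        // sourcePath.Length, would produce warning CS8602.
        var source = sourcePath ?? "<unknown>";

        // Placeholder result standing in for the parsed JSON object.
        return $"{source}: {length} characters";
    }
}
```

Callers that do have a path pass it as before, for example JsonUtility.ParseJson(text, "docfx.json"); callers without one simply omit the argument.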

Configure JSON deserialization to ignore nulls

With nullable reference types, you can mark a property type as non-nullable but still get null when the object is deserialized from JSON. The compiler does not check runtime variable assignment.

Most JSON libraries provide an option to skip null assignments to your strongly typed classes, such as NullValueHandling.Ignore in Json.NET or JsonSerializerOptions.IgnoreNullValues in System.Text.Json.
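As a minimal sketch of the System.Text.Json variant, assuming a Blog model with default values as in the earlier section: the BlogReader class and the Title property are made up for illustration, and this is not docfx's actual deserialization code. Note that IgnoreNullValues is the .NET Core 3 option; later .NET versions mark it obsolete and emit a compiler warning for it.

```csharp
#nullable enable
using System;
using System.Text.Json;

class Blog
{
    // System.Text.Json in .NET Core 3 binds properties rather than fields,
    // so the model uses properties with non-null defaults.
    public string Title { get; set; } = "";
    public string[] Tags { get; set; } = Array.Empty<string>();
}

static class BlogReader
{
    // With IgnoreNullValues enabled, a JSON null leaves the property
    // at its default value instead of overwriting it with null.
    static readonly JsonSerializerOptions s_options = new JsonSerializerOptions
    {
        IgnoreNullValues = true,
    };

    // Falls back to an empty Blog if the whole document is the JSON literal null.
    public static Blog Read(string json)
        => JsonSerializer.Deserialize<Blog>(json, s_options) ?? new Blog();
}
```

With this option, input such as {"Title": null, "Tags": null} should leave both properties at their declared defaults, so the non-nullable annotations stay truthful at runtime.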

This works for JSON scalars, but what about arrays and dictionaries?

Remove nulls in arrays and report a user warning

A user could write ["1","2",null,"3"] and bypass the null check if the property type is string[]. Docfx treats this as a user input error: we simply remove all nulls from arrays and report a warning before deserialization.
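The idea can be sketched on an already-parsed list of values. Docfx applies it to the JSON before deserialization; the NullFilter helper, its signature, and the warning text below are assumptions made for illustration only.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

static class NullFilter
{
    // Returns the items without nulls and reports one warning per removed entry.
    // The warn callback stands in for whatever error reporting the tool uses.
    public static string[] RemoveNulls(IReadOnlyList<string?> items, Action<string> warn)
    {
        var result = new List<string>(items.Count);
        for (var i = 0; i < items.Count; i++)
        {
            var item = items[i];
            if (item is null)
            {
                warn($"Null value at index {i} is ignored.");
                continue;
            }

            // After the null check the compiler treats item as a non-null string.
            result.Add(item);
        }
        return result.ToArray();
    }
}
```

The return type is string[] rather than string?[], so everything downstream of the filter can rely on the non-nullable element type.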

Mark dictionary value type as nullable

A user could also write {"a": null} and bypass the null check if the object type is Dictionary<string, string>. We could use the same strategy as for arrays and remove all null entries, but docfx requires some null values to be preserved, so using Dictionary<string, string?> as the data type is our current choice here.
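A short sketch of what the nullable value type means for consuming code: a missing key and an explicit null stay distinguishable, and the compiler insists on a null check before the value is used. The MetadataFormatter class and its output strings are hypothetical.

```csharp
#nullable enable
using System.Collections.Generic;

static class MetadataFormatter
{
    public static string Describe(IReadOnlyDictionary<string, string?> metadata, string key)
    {
        // The key is absent: the user did not write it at all.
        if (!metadata.TryGetValue(key, out var value))
            return "missing";

        // The key is present, but value is typed string?, so it must be
        // checked before it can be dereferenced without a warning.
        return value is null ? "null" : value.ToUpperInvariant();
    }
}
```

A Dictionary<string, string?> can be passed directly, since it implements IReadOnlyDictionary<string, string?>.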

Use immutable object model and constructors

An immutable object model has lots of other advantages, and it also plays surprisingly well with nullable reference types. During the conversion, I had to convert some initialization-only types to immutable types with constructors:

class Item
{
  public string Name { get; set; } // warning CS8618: Non-nullable property 'Name' is uninitialized. Consider declaring the property as nullable
}

is changed to

class Item
{
  public string Name { get; }

  public Item(string name) => Name = name;
}

Mark value type fields as nullable

Docfx uses value types a lot to improve performance. However, a non-nullable field in a value type can still be null if the value is created using the default constructor, the default literal, or as an array element:

struct SourcePosition
{
  public string FileName;
  public int Line;

  public SourcePosition(string fileName, int line)
  {
    FileName = fileName;
    Line = line;
  }
}

SourcePosition position = default; // position.FileName is null and there is no compiler warning.
SourcePosition[] positions = new SourcePosition[2]; // positions[0].FileName is null and there is no compiler warning.

It’s best to mark such a field as nullable:

struct SourcePosition
{
  public int Line;
  public string? FileName;
}
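Once the field is nullable, the compiler asks every consumer to handle the default case. A minimal sketch, assuming the field is named FileName as in the first version of the struct; the SourcePositionFormatter class and the "<unknown>" fallback are made up for illustration.

```csharp
#nullable enable

struct SourcePosition
{
    public int Line;
    public string? FileName;
}

static class SourcePositionFormatter
{
    public static string Format(SourcePosition position)
    {
        // FileName is string?, so using it without a fallback or a null check
        // would be flagged. A default(SourcePosition) ends up on this path.
        var fileName = position.FileName ?? "<unknown>";
        return $"{fileName}({position.Line})";
    }
}
```

The annotation now matches what default(SourcePosition) and new SourcePosition[2] actually produce, so the missing warning at creation time is compensated by a warning at every unchecked use.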

Conclusion

With the above strategy, if the whole project enables nullable checks, the compiler can detect places that could potentially throw NullReferenceException, and the codebase is null safe. Next time, I’ll talk about the caveats and false positives of nullable reference types in C# 8 and how to work around them.