Nothing values in constraints

How should we handle variables that have a “null” value in constraint expressions? For example, in Excel, if we have the expression a*b + c and b is empty, Excel will just return c. How can we handle such expressions in constraints for optimization?

Another example is MAX(a,b,c): if b is empty, it should return MAX(a,c).

Looking forward to your advice.

The best way to do this is case-by-case. Why is it best? Well, consider the two examples you gave. In the first example, a * b + c, an empty b should act like zero. Whereas in your second case, max(a, b, c), an empty b should NOT act like zero. So you want different behaviours.
The reality is that the “expected behaviour” in these cases is completely idiosyncratic, and if you gather ten people and ask them, you’ll get 11 opinions. For example, I would argue that in the a * b + c case, an empty b should act like one (the multiplicative identity), not zero. But arguing this is pointless, because different defaults make sense in different situations.

Concretely, to handle it, you have two options:

  1. Encode the potential ‘emptiness’ into the type system. So, you could have b::Union{Float32, Nothing}, and then handle the nothing case explicitly, like this:
y = isnothing(b) ? c : a * b + c

This solution is less bug-prone and can be analyzed with static analysers, though it can be a bit verbose.

  2. Pick a sentinel value of the same type as b, and check it. For example, you could pick typemin(typeof(b)) if b isa Integer, or NaN32 if b isa Float32. Then you would do:
y = isnan(b) ? c : a * b + c

This is usually more computationally efficient, because b now has a single concrete type, which enables more memory optimisations. However, this approach has a greater risk of bugs, because there is nothing in the type system that keeps you from forgetting that b may be empty.
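
As a rough sketch of the two options side by side: the function names eval_union and eval_sentinel are made up for illustration, and treating an empty b as "just return c" is only the Excel-like choice from the question, not a recommendation.

```julia
# Option 1: emptiness lives in the type; `nothing` marks an empty b.
function eval_union(a::Float32, b::Union{Float32, Nothing}, c::Float32)
    return isnothing(b) ? c : a * b + c
end

# Option 2: emptiness is a sentinel of the same type; NaN32 marks an empty b.
function eval_sentinel(a::Float32, b::Float32, c::Float32)
    return isnan(b) ? c : a * b + c
end

eval_union(2f0, nothing, 1f0)   # 1.0f0
eval_union(2f0, 3f0, 1f0)       # 7.0f0
eval_sentinel(2f0, NaN32, 1f0)  # 1.0f0
eval_sentinel(2f0, 3f0, 1f0)    # 7.0f0
```

Note that with the sentinel version, a NaN produced by an actual computation is indistinguishable from an intentionally empty b, which is one concrete form of the bug risk mentioned above.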

Edit: Julia does have a built-in type for this, namely missing. But I strongly dislike it, and think missing is a huge misfeature. The problem is - you guessed it - that the behaviour of missing is usually inappropriate in a given situation, because the umbrella concept of ‘missing data’ is really 100 subtly different things that need to be handled differently, and assigning it the same type with the same default behaviour doesn’t make sense, and only serves to misguide the user. Furthermore, missing is particularly poorly designed for two reasons:

  1. It propagates across functions, meaning it causes bugs to appear far from where they originated. This is bad for debugging errors.
  2. By design, methods with missing violate the contracts of the function. For example, the documentation of isodd gives its signature as isodd(x::Number)::Bool, and yet we have isodd(missing)::Missing. This causes real-life bugs, and also overly defensive programming in response.

So don’t use missing.
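
To illustrate both points, here is a small sketch (the variable names are only for the example) showing missing flowing through an expression and through isodd without any error at the point where it enters:

```julia
# `missing` silently flows through ordinary arithmetic and predicates.
a, b, c = 2.0, missing, 1.0

y = a * b + c          # missing, not an error
flag = isodd(missing)  # missing, although isodd is documented as returning Bool

# The failure only surfaces later, where a Bool is actually required:
# `if flag ... end` throws a TypeError (non-boolean used in boolean context).
(ismissing(y), ismissing(flag))  # (true, true)
```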


Thanks for the response. But the further issue I have is that any of a, b, or c can be nothing. So will we not be able to provide a generic way to handle nulls in expressions? In finance there is an explicit need for handling these kinds of expressions, and we cannot assume that all empty values are zeros. What would be an ideal solution in this case?

You can easily define your own null types. The issue is that there is no agreement on what nulls should do in different situations. But for your own applications, you can define a null type and make its behaviour exactly what you want.

So you could do something like

struct Null end
const null = Null()

# define methods to control how it behaves
Base.:*(::Null, ::Number) = null
Base.:*(::Number, ::Null) = null
Base.:*(::Null, ::Null) = null

Base.:+(::Null, a::Number) = a
# [ ... more methods ]

And then have your variables be of type Union{Float32, Null}.
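
To make that concrete, here is a self-contained sketch that repeats the definitions above and fills in a few more methods. The extra rules (null terms are skipped by + and ignored by max) are assumptions chosen to reproduce the Excel-like results from the question; pick whatever rules fit your domain.

```julia
struct Null end
const null = Null()

# Multiplication: a null factor makes the whole product null.
Base.:*(::Null, ::Number) = null
Base.:*(::Number, ::Null) = null
Base.:*(::Null, ::Null) = null

# Addition: null terms are skipped (assumed rule).
Base.:+(::Null, a::Number) = a
Base.:+(a::Number, ::Null) = a
Base.:+(::Null, ::Null) = null

# max: null arguments are ignored (assumed rule).
Base.max(::Null, a::Number) = a
Base.max(a::Number, ::Null) = a
Base.max(::Null, ::Null) = null

a, b, c = 2.0f0, null, 1.0f0
a * b + c     # 1.0f0, because a * b is null and null + c is c
max(a, b, c)  # 2.0f0
```

Because the rules are ordinary methods, the same expression works whichever of a, b, or c happens to be null, without writing a separate branch for each case.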

The existing “out of the box” null types are nothing, whose default behavior is to error when used in functions, and missing, whose default behaviour is to return missing when used in functions. This may or may not be the behaviour you are looking for.

I will start by saying that this pattern is not one most users often find a need for. Personally, I’m not sure I ever have. There is probably a more ergonomic way to achieve what you’re after than solving it at this particular level. But I’ve provided some (not-entirely-lovely) options below anyway:

In some cases you can do things like maximum(z for z in (a,b,c) if !isnothing(z)) if encoding via nothing. If encoding with missing, you can do either that same pattern or the equivalent maximum(skipmissing((a,b,c))). In other cases you can do things like something(a, 0) * something(b, 0) + something(c, 0) to replace a nothing among the inputs with 0 (coalesce is the equivalent for missing).
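
A quick sketch of those patterns with concrete values (the values and variable names are just for illustration):

```julia
a, b, c = 2.0, nothing, 5.0

# Skip empty inputs when taking the maximum.
maximum(z for z in (a, b, c) if !isnothing(z))        # 5.0

# Replace empty inputs with a default before doing arithmetic.
something(a, 0) * something(b, 0) + something(c, 0)   # 5.0

# The same two patterns when empties are encoded as `missing`.
am, bm, cm = 2.0, missing, 5.0
maximum(skipmissing((am, bm, cm)))                    # 5.0
coalesce(am, 0) * coalesce(bm, 0) + coalesce(cm, 0)   # 5.0
```

One caveat: if every input is empty, the filtered maximum has nothing to reduce over and throws an error, so that case needs its own handling.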