Using `isnan()` with missing values leads to hard-to-find bugs

I found some curious behavior when using the isnan() function on a dataframe with missing values the other day.

I was trying to replace NaN values with missing values in the dataframe. However, I found that isnan(missing) evaluates to missing instead of true or false. This generated errors that were hard to track down, since the error message pointed at the if-else statement rather than at the isnan() call.

I was just wondering if this was expected behavior. Seems a bit odd that a boolean function would return a non-boolean value like missing. But that could just be my simple mind’s way of thinking of things.

Anyhow, just wondering if this is really an issue I should submit, or if this is expected behavior? Thanks.

This is expected behavior. The value in question is missing, so we don’t know what it is. It could be NaN, and since we can’t know for sure, isnan returns missing.
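
To make the propagation concrete, here is a minimal sketch of what happens at the REPL. The variable names `r` and `err` are only for illustration; the point is that the `if` throws, which is why the error shows up at the conditional rather than at isnan.

```julia
# isnan propagates missing rather than returning a Bool
r = isnan(missing)
@show ismissing(r)

# Using that result as an `if` condition throws a TypeError
# (a non-boolean value used in a boolean context).
err = try
    if isnan(missing)
        "was NaN"
    else
        "was not NaN"
    end
catch e
    e
end
@show err isa TypeError
```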

Can I ask what language you are coming from? Different languages have different approaches to missing values.

@pdeffebach Okay cool, good to know. I am coming from a Python and R background. It just caught me by surprise that a boolean function would return a missing value instead of true or false. But that is okay, now I know to check for missing values before using isnan() or other similar functions.

You could do something like this:

isnanmissing(x::Union{Number,Missing}) = ismissing(x) ? true : isnan(x)

which returns true if the number is missing or NaN.
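
As a rough sketch of how that helper could serve the original goal of turning NaN values into missing, here it is applied to a small vector. The sample data `x` and the names `mask` and `cleaned` are made up for illustration; a dataframe column would be handled the same way.

```julia
isnanmissing(x::Union{Number,Missing}) = ismissing(x) ? true : isnan(x)

x = [1.0, NaN, missing, 4.0]   # hypothetical sample column

# Broadcasting the helper yields a plain Bool mask with no missing entries,
# so it is safe to use in conditionals and for indexing.
mask = isnanmissing.(x)

# Map every NaN-or-missing entry to missing and keep the rest unchanged.
cleaned = [isnanmissing(v) ? missing : v for v in x]
```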

Yes, it’s a bit stricter than R.

In R you use na.rm = T everywhere. The equivalent in Julia is to pass skipmissing(x) to your functions. Another solution is to use replace:

julia> replace(x, NaN => 0, missing => 100)

Check out Missings.jl for convenience functions.
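
A small sketch of both approaches, using a made-up vector `x`. Note that skipmissing only drops missing entries; NaN is still an ordinary floating-point value, so it has to be filtered separately if you want a finite result.

```julia
x = [1.0, NaN, missing, 4.0]   # hypothetical sample data

# skipmissing drops missing entries lazily; the NaN stays in and propagates
total = sum(skipmissing(x))

# Filtering NaN as well gives a finite sum over the remaining values
finite_total = sum(v for v in skipmissing(x) if !isnan(v))

# replace matches old values with isequal, so NaN => 0 works even though NaN != NaN
y = replace(x, NaN => 0, missing => 100)
```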

Oh yes, that makes a lot more sense. I forgot that skipmissing() existed, so I can use that. Sounds good, thanks for setting me straight.

Oh great, yes this will work. Nice, I will add this function to my little code library for future use. I am slowly starting to get a handle on the Julia idioms. Thanks again.