Treating NaN as error: Helping debugging

I am writing a large hydrological model, and for some sets of parameters I get a NaN value, but I do not know which function produces the first NaN. Therefore, I would like to stop my program at the first NaN.

A dirty debugging solution would be to test isnan(…) at every level, which is tedious.

I am therefore wondering whether there is an option in Julia to treat NaN as an error, which would help determine which function is causing the issue.

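There is no option that tells Julia to treat NaN as an error. NaN typically arises from operations like these (where the signs may be positive or negative):

Inf / Inf
0.0 / 0.0
Inf % x

Note that inv(0.0) == Inf and x / 0.0 == Inf for any x > 0, so dividing by zero introduces an Inf that can become NaN later; Inf - Inf and 0.0 * Inf also give NaN.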

Look at the routines to see where you may be dividing by zero (or introducing Inf).

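# Throw if x is zero or infinite; otherwise the expression evaluates to false.
chk(x) = (iszero(x) || isinf(x)) && error("x = $x")
function chk(x, y)
    (iszero(x) || isinf(x)) && error("x = $x")
    (iszero(y) || isinf(y)) && error("y = $y")
end

function fn(a, b)
    chk(a, b)  # fail early, before the division can yield NaN or Inf
    return a / b
end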
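
A quick sketch to confirm which basic operations yield NaN and which only yield Inf. The variable name nan_sources is made up for illustration:

```julia
# Each of these expressions evaluates to NaN under IEEE 754 arithmetic.
nan_sources = [Inf / Inf, 0.0 / 0.0, Inf - Inf, 0.0 * Inf, Inf % 2.0]
println(all(isnan, nan_sources))

# Dividing a nonzero finite number by zero gives an infinity, not NaN ...
println(1.0 / 0.0, " ", -1.0 / 0.0, " ", inv(0.0))

# ... but that infinity can turn into NaN in a later operation.
println(isnan(1.0 / 0.0 - 1.0 / 0.0))
```

This is why a check on Inf (or on a zero denominator) often catches the problem earlier than a check on NaN alone.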

NaNs are valid Float64s, engineered precisely for the purpose of making invalid results propagate without errors.

Your best option may be placing a few checks that validate outputs (or better, inputs). I find isfinite useful for this purpose (it may catch a few things that turn to NaNs later). I agree that it can be a bit cumbersome, but it can quickly pinpoint a problem.
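
A minimal sketch of what such checks could look like. The helper check_finite and the model step runoff_ratio are hypothetical names invented for this illustration; only isfinite and DomainError come from Base:

```julia
# Hypothetical helper: throw as soon as a value is NaN or ±Inf, naming the offender.
# Works for scalars and arrays, since `all` iterates over both.
function check_finite(x, name::AbstractString)
    all(isfinite, x) || throw(DomainError(x, "$name is not finite"))
    return x
end

# Hypothetical model step standing in for one routine of a larger model.
function runoff_ratio(rain, storage)
    check_finite(rain, "rain")          # validate inputs
    check_finite(storage, "storage")
    ratio = rain / storage
    return check_finite(ratio, "ratio") # validate the output
end

println(runoff_ratio(2.0, 4.0))

# A zero storage gives 0.0 / 0.0 == NaN, which is reported at its source.
try
    runoff_ratio(0.0, 0.0)
catch err
    println("caught: ", sprint(showerror, err))
end
```

Because the exception is thrown where the non-finite value first appears, its stack trace points at the responsible function.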

Hmm, Fortran compilers let you enable floating-point exceptions, e.g. gfortran -ffpe-trap=invalid will cause an exception once a NaN is created. I think technically this will cause a SIGFPE signal that can be caught by a signal handler. Would there be a way to do something similar in Julia, e.g. a library function to enable those exceptions and handling the signal will result in a backtrace?

Update: there’s an open issue on this, https://github.com/JuliaLang/julia/issues/27705

I wrote ElideableMacros.jl (GitHub: jwscook/ElideableMacros.jl, "Elidable macros in Julia, that can be compiled out") as a way to try to understand macros (it turns out I didn’t master them). I’m fairly certain the hygiene is wrong, but it allowed me to create @elideableassert, which can be used as @elideableassert !isnan(x). The elision works by reading ENV["ELIDE_ASSERTS"]. This comes relatively close to enabling something like gfortran’s compiler options, as pointed out by @traktofon.

Use with caution! It’s buggy.
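
For readers who want to see the general idea, here is a rough sketch of an assertion macro that can be compiled out. It is not the implementation used in ElideableMacros.jl: the macro name @debugcheck is hypothetical, and treating the value "true" of ENV["ELIDE_ASSERTS"] as "elide" is an assumption made for this example.

```julia
# Hypothetical macro, NOT the ElideableMacros.jl implementation.
macro debugcheck(ex)
    # The environment variable is read once, at macro-expansion time.
    # Assumption: the value "true" means "remove the checks".
    if get(ENV, "ELIDE_ASSERTS", "false") == "true"
        return nothing  # the check disappears from the compiled code
    end
    msg = string("check failed: ", ex)
    return :($(esc(ex)) ? nothing : error($msg))
end

# Hypothetical function using the check.
function safe_ratio(a, b)
    r = a / b
    @debugcheck !isnan(r)
    return r
end

println(safe_ratio(1.0, 2.0))
```

With the checks enabled, safe_ratio(0.0, 0.0) raises an error at the line that produced the NaN; with them elided, the function compiles to the plain division.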

NaN treated as an error

Are there any new features or options in Julia v1.6.4 that let one treat NaN as an error, so that I can determine where in my code NaN is being produced?

Many thanks for any suggestions

It would be nice if Julia supported signaling NaNs, but this seems to be an open issue. I am not sure what LLVM supports here, though.

You might be interested in this old post:

We ran into similar problems with NaN popping up when we computed derivatives with ForwardDiff; finding the source was very difficult. We wrote this utility type to make it easier.

It probably doesn’t support every floating point operation but it was enough for our use. Hasn’t been tested under 1.7.

Call your functions with NaNCheck instances instead of floats:

struct NaNCheck{T<:Real} <: Real
    val::T
    function NaNCheck{T}(a::S) where {T<:Real, S<:Real}
        @assert !(T <: NaNCheck)
        new{T}(T(a))
    end
end
export NaNCheck
Base.isnan(a::NaNCheck{T}) where{T} = isnan(a.val)
Base.isinf(a::NaNCheck{T}) where{T} = isinf(a.val)
Base.typemin(::Type{NaNCheck{T}}) where{T} = NaNCheck{T}(typemin(T))
Base.typemax(::Type{NaNCheck{T}}) where{T} = NaNCheck{T}(typemax(T))
Base.eps(::Type{NaNCheck{T}}) where {T} = NaNCheck{T}(eps(T))
Base.decompose(a::NaNCheck{T}) where {T} = Base.decompose(a.val)
Base.round(a::NaNCheck{T}, m::RoundingMode) where {T} = NaNCheck{T}(round(a.val, m))

struct NaNException <: Exception end

# (::Type{Float64})(a::NaNCheck{S}) where {S<:Real} = NaNCheck{Float64}(Float64(a.val))
(::Type{T})(a::NaNCheck{S}) where {T<:Integer,S<:Real} = T(a.val)
(::Type{NaNCheck{T}})(a::NaNCheck{S}) where {T<:Real,S<:Real} = NaNCheck{T}(T(a.val))
Base.promote_rule(::Type{NaNCheck{T}}, ::Type{T}) where {T<:Number} = NaNCheck{T}
Base.promote_rule(::Type{T}, ::Type{NaNCheck{T}}) where {T<:Number} = NaNCheck{T}
Base.promote_rule(::Type{S}, ::Type{NaNCheck{T}}) where {T<:Number, S<:Number} = NaNCheck{promote_type(T,S)}
Base.promote_rule(::Type{NaNCheck{T}}, ::Type{S}) where {T<:Number, S<:Number} = NaNCheck{promote_type(T,S)}
Base.promote_rule(::Type{NaNCheck{S}}, ::Type{NaNCheck{T}}) where {T<:Number, S<:Number} = NaNCheck{promote_type(T,S)}

for op = (:sin, :cos, :tan, :log, :exp, :sqrt, :abs, :-, :atan, :acos, :asin, :log1p, :floor, :ceil, :float)
    eval(quote
        function Base.$op(a::NaNCheck{T}) where{T}
            temp = NaNCheck{T}(Base.$op(a.val))
            if isnan(temp)
                throw(NaNException())
            end
            return temp
        end
    end)
end

for op = (:+, :-, :/, :*, :^, :atan)
    eval(quote
        function Base.$op(a::NaNCheck{T}, b::NaNCheck{T}) where{T}
            temp = NaNCheck{T}(Base.$op(a.val, b.val))
            if isnan(temp)
                throw(NaNException())
            end
            return temp
        end
    end)
end

for op =  (:<, :>, :<=, :>=, :(==), :isequal)
    eval(quote
        function Base.$op(a::NaNCheck{T}, b::NaNCheck{T}) where{T}
            temp = Base.$op(a.val, b.val)
            return temp
        end
    end)
end
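
To show how a wrapper of this kind is used, here is a stripped-down, self-contained sketch of the same idea. The names Guarded, NaNFound, guard and model_step are hypothetical stand-ins chosen for this example; it covers far fewer operations than the NaNCheck type above.

```julia
# Hypothetical stripped-down stand-in for the wrapper idea above.
struct Guarded <: Real
    val::Float64
end

struct NaNFound <: Exception end

# Wrap a raw result, throwing immediately if it is NaN.
guard(v::Float64) = isnan(v) ? throw(NaNFound()) : Guarded(v)

# Forward a few binary and unary operations through the guard.
for op in (:+, :-, :*, :/)
    @eval Base.$op(a::Guarded, b::Guarded) = guard(Base.$op(a.val, b.val))
end
for op in (:exp, :abs)
    @eval Base.$op(a::Guarded) = guard(Base.$op(a.val))
end

# Hypothetical model step that ends in 0.0 / 0.0 for any inputs.
model_step(a, b) = (a - a) / (b - b)

try
    model_step(Guarded(1.0), Guarded(2.0))
catch err
    err isa NaNFound || rethrow()
    println("NaN produced inside model_step")
end
```

Without the try/catch, the uncaught exception prints a stack trace whose top frames identify the operation and function that first produced the NaN.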

The nansafe_mode of ForwardDiff.jl can help with debugging NaNs.
