NaN Detector

I’m trying to find the source of a NaN. I think it comes from an uninitialized input variable that the code doesn’t guard against. As far as I know the debugger has no break-on-NaN option, and it is very difficult for me to step to the spot.

My thought was: can I somehow override Base.:/ to check for 0.0/0.0 and either alert me with a stack trace of the calling function, or just throw an error and die right there with a stack trace?

I thought I could just override Base.:/ at the top level, but do I need to do this inside every package that uses math? What is the best way to do this?

Best Regards,
Allan Baker

Maybe you can use

https://github.com/JeffreySarnoff/SaferIntegers.jl

to spot your problem. It helped me catch silent overflows.

Thanks for the tip. I need it for floating point calculations though.

See this issue for discussion and some options: https://github.com/JuliaLang/julia/issues/27705

You are welcome to NaNcatcher.jl, an experiment of mine. There is some description within the file. If it seems helpful and you have questions … I am around.

If it’s NaNs in an AbstractArray, can you put a conditional breakpoint in the setindex! method for your specific array that’s receiving the NaNs?
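
A related way to get the same effect without the debugger is to wrap the array so that every write is checked. The following is only a minimal sketch: `NaNGuard` is a hypothetical name, it covers plain vectors only, and it assumes the NaNs arrive through scalar `setindex!` calls.

```julia
# Hypothetical wrapper (not from any package): behaves like the wrapped
# vector, but refuses to store a NaN.
struct NaNGuard{T} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(g::NaNGuard) = size(g.data)
Base.getindex(g::NaNGuard, i::Int) = g.data[i]

function Base.setindex!(g::NaNGuard, v, i::Int)
    # Throwing here gives a stack trace that leads back to the offending write.
    if v isa AbstractFloat && isnan(v)
        error("NaN written at index $i")
    end
    g.data[i] = v
    return g
end

a = NaNGuard(zeros(3))
a[1] = 2.0                  # an ordinary write passes through
caught = try
    a[2] = 0.0 / 0.0        # a NaN write throws from inside setindex!
    false
catch err
    showerror(stderr, err, catch_backtrace())   # print the error with its stack trace
    println(stderr)
    true
end
```

Outside a `try` block the error would simply stop the run at the first NaN write and show where it came from.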

I have a half-completed attempt at using a Cassette pass to find the first NaN that gets returned from any function. Needs some more work, but it found the NaN I was hunting for and then I moved on:

https://github.com/mbauman/TheNaNTrap.jl/

I ended up brute-forcing it: I searched for every division that lacked 0/0 protection, added @info statements to bark about NaNs, and worked my way through the call tree to find where things went wrong. Running in the interactive debugger is too cumbersome because I can’t set breakpoints multiple levels down in “accelerated/compiled” code. I think the cause was uninitialized parameters from my namelist reader. The defaults in some C++ code conversions I had missed were set to 0.0 instead of 1.0 for things like “factor”. I think if I could figure out how to overload the divide to check for 0/0, I could dump a stack when it happens.
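
For anyone taking the same brute-force route, the @info checks can be collected into one small helper. This is a sketch under assumptions: `checknan` and `scale` are hypothetical names, and the helper only reports what it is handed, so it still has to be called at each suspect spot.

```julia
# Hypothetical helper: reports a NaN together with the chain of callers.
function checknan(label, x)
    # `any(isnan, x)` accepts scalars as well as arrays, since numbers are iterable.
    if any(isnan, x)
        @info "NaN detected" label value = x
        # List the callers so the NaN can be traced back up the call tree.
        for frame in stacktrace()
            println(stderr, "  ", frame)
        end
        return true
    end
    return false
end

# A "factor" that defaulted to 0.0 instead of 1.0 turns 0.0 / factor into NaN.
scale(x, factor) = x / factor
r = scale(0.0, 0.0)
found = checknan("scale result", r)
```

Returning `true` or `false` lets the call site decide whether to keep going or stop at the first hit.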

This may help, provided nothing is precompiled or handed to C and you work in the REPL: define this first, then include your other functions. I have not tried this strategy, so here’s hoping.

julia> (/)(x::Float64, y::Float64) =
    iszero(y) && iszero(x) ? "found it" : Base.:(/)(x,y)

julia> 1.0 / 0.0, 0.0 / 1.0
(Inf, 0.0)

julia> 0.0 / 0.0
"found it"
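
A variant of the same idea that throws instead of returning a string, so that the failure comes with a stack trace. This is a sketch, not a tested recipe: the module name `DivTrap` and the function `ratio` are hypothetical, and because the `/` defined here belongs to the module, only code evaluated inside that module sees it. Code in other modules or packages keeps calling Base's `/`.

```julia
module DivTrap

# Module-local `/` for Float64 pairs; it shadows Base's `/` inside DivTrap only.
(/)(x::Float64, y::Float64) =
    iszero(x) && iszero(y) ?
        throw(DomainError((x, y), "0.0 / 0.0 would produce NaN")) :
        Base.:(/)(x, y)

# Every other argument combination falls through to Base.
(/)(x, y) = Base.:(/)(x, y)

# Stand-in for user code that has been included into the module.
ratio(a, b) = a / b

end # module

inf_result = DivTrap.ratio(1.0, 0.0)    # only 0.0 / 0.0 is trapped
trapped = try
    DivTrap.ratio(0.0, 0.0)
    false
catch err
    err isa DomainError
end
```

Called without the `try` block at the REPL, the `DomainError` is printed together with the stack trace of the call that produced it.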

I was thinking this might work, but if I’m pulling in packages, do I need to go into each package and define this?

The easiest way to answer these kinds of questions is empiricism. Don’t hesitate to just try this out.

One session:

julia> Base.:(/)(x::Float64, y::Float64) = 1.0

julia> using Distributions

julia> pdf(Normal(0.0, 1.0), 5.0)
1.0

julia> pdf(Normal(0.0, 1.0), 10.0)
1.0

Another session:

julia> using Distributions

julia> pdf(Normal(0.0, 1.0), 5.0)
1.4867195147342977e-6

julia> pdf(Normal(0.0, 1.0), 10.0)
7.69459862670642e-23