I’m trying to find my NaN generator. I think it comes from an uninitialized input variable that the code doesn’t guard against. I know the debugger has no break-on-NaN feature, and it is very difficult for me to step to the spot where it happens.
My thought was: can I somehow override the Base./ function to check for 0.0/0.0 and either alert me with a stack trace of the calling function, or just throw an error and die right there with a stack trace?
I thought I could just override Base./ at the top level, but do I need to do this inside every package that uses math? What is the best way to do this?
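On the scoping part of the question, here is a minimal sketch of one option that avoids touching Base at all: defining a module-local `/` that shadows Base's operator inside a single module. The module name `Physics` and the function `scale` are hypothetical stand-ins for your own code. This is a sketch under those assumptions, not a recommendation for production use.

```julia
module Physics   # hypothetical module standing in for your own code

# Defined before `/` is used anywhere in this module, so this creates a
# module-local function that shadows Base's `/` here and nowhere else.
function /(x::Real, y::Real)
    z = Base.:/(x, y)
    # Only complain when the division itself created the NaN
    # (0/0, Inf/Inf), not when a NaN was already passed in.
    if isnan(z) && !isnan(x) && !isnan(y)
        error("NaN produced by ", x, " / ", y)
    end
    return z
end

# Fallback so non-Real operands (arrays, etc.) still divide as usual.
/(x, y) = Base.:/(x, y)

# Any division written in this module now goes through the check.
scale(value, factor) = value / factor

end # module
```

Calling `Physics.scale(0.0, 0.0)` at the REPL should then stop with an error and print the call stack, while division in other modules and packages is unaffected. That is the trade-off: a shadow like this has to be repeated in each module you want to watch, whereas adding a method to Base's own `/` for built-in float types would apply everywhere but is type piracy and can interact badly with precompiled code.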
You are welcome to NaNcatcher.jl, an experiment of mine. There is some description within the file. If it seems helpful and you have questions … I am around.
I have a half-completed attempt at using a Cassette pass to find the first NaN that gets returned from any function. Needs some more work, but it found the NaN I was hunting for and then I moved on:
I ended up brute-forcing it: I searched for every divide that lacked 0/0 protection, added @info statements to bark about NaNs, and worked my way through the call tree to find where things went wrong. Running in the interactive debugger is too cumbersome because I can’t set breakpoints several levels down in “accelerated/compiled” code. I think the cause was uninitialized parameters from my namelist reader. The defaults I used in some C++ code conversions that I had missed were set to 0.0 instead of 1.0 for things like “factor”. I still think that if I could figure out how to overload the divide to check for 0/0, I could dump a stack trace when it happens.
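The brute-force approach described above can be made a little less painful with a small helper that logs the call stack whenever a value contains a NaN. The following is a minimal sketch; `barknan` and `apply_factor` are hypothetical names, not part of any package.

```julia
# Return `value` unchanged, but log a warning with the call stack if it
# is (or contains) a NaN. Works for real scalars and arrays of reals.
function barknan(label, value)
    if any(isnan, value)
        # Drop the first frame (barknan itself) so the caller comes first.
        @warn "NaN detected" label callers = stacktrace()[2:end]
    end
    return value
end

# Hypothetical example of wrapping a suspicious division.
function apply_factor(x, factor)
    return barknan("x / factor", x / factor)
end

apply_factor(0.0, 0.0)   # logs a warning that includes the call stack
```

Because the helper returns its argument, it can be wrapped around any expression without changing the surrounding code, and removed again once the culprit (for example a parameter defaulting to 0.0 instead of 1.0) has been found.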
This may help, if nothing is precompiled or handed to C and you work in the REPL, setting this first then including other functions. I have not tried this strategy – so here’s hoping.