`ENV["JULIA_DEBUG"] == "MyModule" && expensive_debug_method()` good practice?

Will Julia compile away lines like

ENV["JULIA_DEBUG"] == "MyModule" && expensive_debug_method()

when the environment variable is not set, or does the dictionary lookup happen at runtime? If the latter, is there a good way to implement nontrivial debug logic that will only be compiled in if ENV["JULIA_DEBUG"] == "MyModule"? Or is my question poorly posed?

The use case is as follows. Our code implements a custom constrained optimization solver for a particular problem. We already have a bunch of @debug messages that print convergence information at each iteration, and I see in the Logging section of the documentation that this has minimal performance impact because “the associated string processing will never be executed unless debug logging is enabled.”

Now, I want to add a debug-mode-only method that does some more expensive verification of the solver convergence (compute dual variables and checks KKT conditions). But if I write ENV["JULIA_DEBUG"] == "MyModule" && check_kkt(), then I want Julia to forgo compiling the check_kkt() method unless and until someone enables debug mode (i.e. never in production).
Is this something I should care about, and if so, is the implementation correct?

Similarly, can I expect the following not to be compiled if debug is disabled?

if rand() < 0.5
    @debug "Small"
end

The dictionary lookup happens at runtime & check_kkt will be compiled (but not necessarily run, which is the expensive bit).

How should the compiler compile away a call that modifies the RNG state? rand() must be called, since it’s not a pure function. The output may not necessarily happen, but the debug string will likely still be constructed.


You mention printing convergence information - how much overhead is concretely introduced by the logging, and does that information really need to be printed at every iteration?

1 Like

Good point. That was a poorly chosen example. I should have written:

expensive_but_pure_function() && @debug "foo"

In theory (but it sounds like not in practice?) the compiler could skip computing the function when debug is disabled here.

A lot. Usually what I do is turn on debug mode and run julia --project sample_problems.jl 2> log.txt. This yields a ~50 MB log file that I then process with regex to produce statistics on how many iterations it takes to solve typical instances. It’s sort of a crude, but effective “profiler” that uses the abstractions of “number of iterations” and convergence measures instead of raw function calls.

That’s not how compilers work. It’s not a compile-time known thing whether @debug will need to execute or not, since that too is a runtime switch. The JULIA_DEBUG environment variable merely provides an interface to change the active logging level from outside the running program, by controlling a global variable. For example, it’s possible to switch the global logger at runtime using global_logger. As such, expensive_but_pure_function() will be executed regardless of whether @debug will print a value or not, because deciding whether to log or not depends on external state.

That’s not really a quantifiable amount - are we talking like a 2x slowdown? 3x? How long does one iteration take?

It sounds like it would be easier, more performant & more memory efficient to keep that information around in a e.g. memory mapped array, instead of using logging to keep the information. That would also simplify your analysis pipeline because you skip having to regex the information you want out of the text file. In addition, you can then have a global const KEEP_STATISTICS = true that you can then switch to false to avoid recording any data. Since that’s a global constant, switches like KEEP_STATISTICS && expensive_but_pure_function() compile away completely.

For logging in particular, one feature of the log macros is that they are lazy, so in

@debug "hello" foo=expensive()

then expensive() won’t be called unless debug logging is enabled. But this isn’t really any better than

if debug
     foo = expensive()
    @debug "hello" foo
end

both should be reasonably fast (just checking a bool value or if debug logging is enabled), but are not compile-time checks.

1 Like

OK, so the problem (as often) is with my mental model. I think of environment variables as “belonging” to the OS and therefore a kind of constant (in the sense that they shouldn’t change while the Julia process is running, just like any const), but as @Sukera says you can enable/disable debug dynamically using built-in methods (or just by setting ENV["JULIA_DEBUG"]). Actually, the Julia docs speak to this and mention which environment variables behave like constants and which you can change at runtime: Environment Variables · The Julia Language

I haven’t implemented any “expensive” debug logic yet; everything currently in the program is stuff like

if converged()
    @debug "Converged in $current_iteration iterations"
    return x_optimal
end

where the call to converged() has to run no matter what. What slows things down are printing the debug statements themselves–e.g. the test suite goes from 20 seconds to 80 seconds when it is trying to print all that stuff out. Redirecting stderr substantially reduces the runtime when debugging (although not all the way to 20 seconds, because of the file output overhead).

I see the merit in the const KEEP_STATISTICS = true approach, but I have to admit that I am partial to my current solution because it single-sources whether or not we are in debug mode, and there is no need to remember to change the variable back before you commit files to version control.

Actually, after reading your suggestion, my first inclination was to write

const KEEP_STATISTICS = get(ENV, "JULIA_DEBUG", "") == "MyModule")

which would let me do KEEP_STATISTICS && expensive_but_pure_function() (the function doesn’t even have to be pure in this case, does it?). But I concede that the performance gained by writing this instead of just

debug() && expensive_function()

where

debug() = get(ENV, "JULIA_DEBUG", "") == "MyModule"

as @ericphanson suggests is probably pretty marginal.