Debugger fails on pycall

I’m using Debugger.jl to debug an RL algorithm calling into OpenAIGym.jl using PyCall. Upon the call, the error

ERROR: sigatomic_end called in non-sigatomic region

is thrown. Is this a known limitation of the debugger or am I seeing a bug? Full error below

debug> ERROR: sigatomic_end called in non-sigatomic region
Stacktrace:
 [1] #evaluate_call_recurse!#37(::Bool, ::typeof(JuliaInterpreter.evaluate_call_recurse!), ::Any, ::JuliaInterpreter.Frame, ::Expr) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:216
 [2] evaluate_call_recurse! at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:205 [inlined]
 [3] eval_rhs(::Any, ::JuliaInterpreter.Frame, ::Expr) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:371
 [4] step_expr!(::Any, ::JuliaInterpreter.Frame, ::Any, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:504
 [5] step_expr!(::Any, ::JuliaInterpreter.Frame, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:543
 [6] finish!(::Any, ::JuliaInterpreter.Frame, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:14
 [7] finish_and_return! at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:29 [inlined]
 [8] #evaluate_call_recurse!#37(::Bool, ::typeof(JuliaInterpreter.evaluate_call_recurse!), ::Any, ::JuliaInterpreter.Frame, ::Expr) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:242
 ... (the last 7 lines are repeated 6 more times)
 [51] evaluate_call_recurse! at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:205 [inlined]
 [52] eval_rhs(::Any, ::JuliaInterpreter.Frame, ::Expr) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:371
 [53] step_expr!(::Any, ::JuliaInterpreter.Frame, ::Any, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:421
 [54] step_expr! at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/interpret.jl:543 [inlined]
 [55] next_until!(::Any, ::Any, ::JuliaInterpreter.Frame, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:93
 [56] next_until! at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:101 [inlined]
 [57] next_line!(::Any, ::JuliaInterpreter.Frame, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:175
 [58] #debug_command#54(::Nothing, ::typeof(JuliaInterpreter.debug_command), ::Any, ::JuliaInterpreter.Frame, ::Symbol, ::Bool) at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:404
 [59] debug_command at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:386 [inlined]
 [60] #debug_command#56 at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:458 [inlined]
 [61] debug_command at /local/home/fredrikb/.julia/packages/JuliaInterpreter/rYo68/src/commands.jl:458 [inlined]
 [62] (::getfield(Atom.JunoDebugger, Symbol("##45#47")){Bool})() at /local/home/fredrikb/.julia/packages/Atom/E4PBh/src/debugger/stepper.jl:129
 [63] evalscope(::getfield(Atom.JunoDebugger, Symbol("##45#47")){Bool}) at /local/home/fredrikb/.julia/packages/Atom/E4PBh/src/debugger/stepper.jl:369
 [64] startdebugging(::JuliaInterpreter.Frame, ::Bool) at /local/home/fredrikb/.julia/packages/Atom/E4PBh/src/debugger/stepper.jl:96

Found the answer myself
https://github.com/JuliaDebug/JuliaInterpreter.jl/issues/219
I tried

julia> push!(JuliaInterpreter.compiled_modules, OpenAIGym)
Set(Module[Base.Threads, OpenAIGym, Core.Compiler])

but it did not solve the problem. Any known workaround?

I see that JuliaInterpreter.jl lists disable_sigint etc. but not sigatomic_(begin|end) (which probably does not make sense)

https://github.com/JuliaDebug/JuliaInterpreter.jl/blob/2c7ef5788352db40b3bfab7500a2cf52471f9de6/src/JuliaInterpreter.jl#L56-L59

Unfortunately, PyCall directly uses sigatomic_(begin|end) (presumably because the API was not consolidated at the time of implementation). It sounds related, but I’m not familiar with the Julia debugger internals.

I tried to switch to disable_sigint some time ago (for another reason) in https://github.com/JuliaPy/PyCall.jl/pull/574, but it stalled due to a (possible) bug in Julia. I’m not sure if the bug is still there. Maybe something similar or less drastic can fix the debugger-related issue.

Does it make sense to push PyCall?

Yeah that probably makes more sense, I’ll try that next. Thanks for the suggestion!

It is possible to write methods that are not safe to interpret (and Base has a few of them). These are special-cased to run compiled by putting them in JuliaInterpreter.compiled_methods or JuliaInterpreter.compiled_modules (for compiling all functions in a module).

You can try pushing the method causing the error into JuliaInterpreter.compiled_methods, or the whole of PyCall into JuliaInterpreter.compiled_modules if you don’t want to debug anything going through PyCall.

Thanks, running entire PyCall compiled worked around the issue!

I see that JuliaInterpreter.jl lists disable_sigint etc. but not sigatomic_(begin|end) (which probably does not make sense)

It does make sense, actually. You have to push! the methods that use sigatomic_(begin|end). There is no point in push!ing sigatomic_(begin|end) themselves, because if you encounter them while interpreting a method body, it’s already too late.
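
A minimal sketch of that distinction, assuming JuliaInterpreter is installed. The module SignalSafe and the function guarded are hypothetical stand-ins for a package that brackets a foreign call with sigatomic_(begin|end), as PyCall is described as doing above; only compiled_methods and compiled_modules come from the thread.

```julia
using JuliaInterpreter

# Hypothetical stand-in for a package that wraps a call in a sigatomic region.
module SignalSafe
function guarded(f)
    Base.sigatomic_begin()
    try
        return f()
    finally
        Base.sigatomic_end()
    end
end
end

# Option 1: mark the method that *uses* sigatomic_(begin|end) as compiled,
# so the interpreter never steps through its body.
push!(JuliaInterpreter.compiled_methods, first(methods(SignalSafe.guarded)))

# Option 2: mark every method defined in the module as compiled.
# In this thread's case the module would be PyCall.
push!(JuliaInterpreter.compiled_modules, SignalSafe)
```

Option 2 is the coarser choice: nothing defined in the module can be stepped through afterwards, which is the trade-off mentioned above for PyCall.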

I meant to say “it does not make sense for JuliaInterpreter.jl to list sigatomic_(begin|end).” IIUC you don’t want to interpret the code in between sigatomic_begin and sigatomic_end. Right?

Exactly. Sorry I misunderstood what you meant.

Repeating what I said elsewhere:

Although such methods might exist, almost all of them should be ones that directly manipulate the interpreter itself. Almost all the ones I see listed in JuliaInterpreter are due to either Julia bugs or JuliaInterpreter bugs, and none of them should be unsafe. In other words, a user who is unaware of JuliaInterpreter cannot write methods that are not safe to interpret.

Other than the few that crash or perform badly, they exist simply because some underlying intrinsics (certain ccalls and other builtins) require special execution support. This special handling may be needed because of the semantics or because of some Julia implementation limitation. The former will always need such special handling, whereas the latter could be removed once improvements (not bug fixes) are made on the Julia side.

Note that the special handling is NOT what appears to be currently applied to these intrinsics. What needs special handling is the intrinsics themselves, not their callers. In the case of pointerset, you need to intercept it and replace it with an interpretable version, which is basically unsafe_store!. Again, you should not intercept unsafe_store! as is currently done; you should intercept pointerset and replace it with unsafe_store!.

pointerset is a case where a Base improvement could remove the need for the special handling. In the case of sigatomic, though, the special handling is always needed. You basically need to maintain a counter for sigatomic. You can do this either by intercepting the ccall in sigatomic_* (again, not defer_sigint or the Julia function sigatomic_*) or by updating your counter based on the execution result. The latter needs a ~5 line addition to Julia itself (~15 lines if you want high-efficiency compiler support).
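
To picture the counter idea, here is a toy model in plain Julia. It is only a sketch: the type SigatomicDepth and the functions enter! and leave! are hypothetical names, not part of Julia or JuliaInterpreter, and the real bookkeeping would have to hook into the ccall as described above.

```julia
# Toy model of a nesting counter for sigatomic regions (hypothetical names).
mutable struct SigatomicDepth
    depth::Int
end
SigatomicDepth() = SigatomicDepth(0)

# Called where the interpreter would see a sigatomic_begin ccall.
enter!(c::SigatomicDepth) = (c.depth += 1; c)

# Called where the interpreter would see a sigatomic_end ccall.
function leave!(c::SigatomicDepth)
    c.depth == 0 && error("sigatomic_end called in non-sigatomic region")
    c.depth -= 1
    return c
end

c = SigatomicDepth()
enter!(c); enter!(c)   # nested regions
leave!(c); leave!(c)   # balanced, so the depth is back at zero
```

An unbalanced leave! on this model raises the same message as in the original report, which is the situation the counter is meant to detect.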

Another related question is that if a method is run in the compiled mode, will the callback passed into it be compiled as well? If not, it might not be as much a big deal. If so, then this is definitely an interpreter bug that should be properly fixed.

Sorry, I was unclear. What I meant to say was that it is possible to write methods that, with the current implementation of JuliaInterpreter.jl (including its bugs), are unsafe to interpret. You are of course right that it is a bug in the interpreter (and it is also marked with the bug label on the issue tracker: “Juno.@enter exits with error when running PyPlot.plot()”, Issue #219, JuliaDebug/JuliaInterpreter.jl). Apparently, no one has so far thought it important enough to start working on it.

Currently, yes.

Yes, these are just workarounds for Julia/JuliaInterpreter bugs or limitations; depending on how much backporting happens, we may need them at least throughout the Julia 1.x release cycle. The mechanism was introduced to circumvent “Illegal instruction with `ccall` to :memcpy” (Issue #31073, JuliaLang/julia). Re pointerset, see “Cannot call pointerset at toplevel” (Issue #31182, JuliaLang/julia).

Another related question is that if a method is run in the compiled mode, will the callback passed into it be compiled as well?

Yes, unfortunately. Once you enter the world of compiled execution, currently there is no going back.

There’s plenty of important and interesting work to do in the JuliaInterpreter world, and contributors would be most welcome. The problem I find most compelling concerns interpreter performance, and the evolving thoughts are perhaps best documented in https://github.com/JuliaDebug/JuliaInterpreter.jl/pull/204. I suspect it’s basically a research project, and one that will require a pretty big chunk of time that I currently lack but hope to return to someday.

This PR makes JuliaInterpreter’s workaround kick in automatically for PyCall:
https://github.com/JuliaPy/PyCall.jl/pull/686

This is my code

using PyCall
function main()
    np = pyimport("numpy")
    data = np.load("data.npy")
end

In the REPL, if I do not run push!(JuliaInterpreter.compiled_modules, PyCall), I get ERROR: sigatomic_end called in non-sigatomic region. However, if I do run it and enter the function using @enter main(), execution halts at data = np.load("data.npy"). It spends a long time on this line and never returns.

This might just be that the runtime performance of the debugger is quite bad. I would suggest trying to run it on the smallest possible input files.

OK, so is there no way to load such a big file and debug? Is there some workaround that I can use?

Do you really want to debug the np.load call itself? Otherwise, you could hoist that outside the debugging with something like

np = pyimport("numpy")
data = np.load("data.npy")

function main(data)
    ...
end

@enter main(data)

Yeah, this would work. I am also wondering whether there is any way to debug a Julia script. Wrapping a code snippet in a function every time is not very convenient. I come from a Python background and am used to ipdb.set_trace. Debugger is awesome; the one issue is debugging a script.
Thanks!

There are some technical difficulties in Julia with debugging scripts (debugging global scope). They could be worked around but since wrapping in a function isn’t too bad, the priority of this feature is not very high.
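
As a minimal sketch of the wrap-in-a-function approach: the function name main and its body are hypothetical placeholders for the script's contents, and the commented @enter line assumes Debugger.jl has been loaded in the session.

```julia
# Script body moved into a function so the debugger has a frame to step through.
function main()
    xs = collect(1:5)      # placeholder for the script's real work
    total = sum(xs)
    return total
end

# With Debugger.jl loaded, start stepping with:
#   @enter main()
# Without the debugger, the same code still runs as a normal script:
result = main()
```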