Looking for the correct way to do a clean Julia process shutdown

I’m not able to correctly capture the SIGINT signal and invoke a controlled shutdown.

It seems that atexit should be the answer, but it does not behave as I expect in my simplified minimal example with FileWatching:

using FileWatching

function watch(dir::String)
    try
        while true
            filename, event = watch_folder(dir, -1.0)
            @info "[$filename] event: $event"
        end
    catch e
        @warn "error watchdir: $e"
    end
end

function shutdown(dir::String)
    @info "stop watching $dir"
    unwatch_folder(dir)
end

dir = "/tmp"

shtdown() = shutdown(dir)
atexit(shtdown)

watch(dir)

I expect that, before exiting, the process will invoke shutdown, unwatch the folder, and then exit.

Instead this is what happens with Ctrl-C:

signal (2): Interrupt
in expression starting at /home/adona/dev/SINT/sint/sintjl/Sint/mve.jl:25
epoll_pwait at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
uv__io_poll at /workspace/srcdir/libuv/src/unix/linux-core.c:270
uv_run at /workspace/srcdir/libuv/src/unix/core.c:359
jl_task_get_next at /buildworker/worker/package_linux64/build/src/partr.c:473
poptask at ./task.jl:704
wait at ./task.jl:712 [inlined]
wait at ./condition.jl:106
take_buffered at ./channels.jl:387
take! at ./channels.jl:381 [inlined]
wait at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/FileWatching/src/FileWatching.jl:620
watch_folder at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/FileWatching/src/FileWatching.jl:747
watch at /home/adona/dev/SINT/sint/sintjl/Sint/mve.jl:6
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2231 [inlined]
...
Allocations: 2544 (Pool: 2533; Big: 11); GC: 0
[ Info: stop watching /tmp
┌ Error: Exception while generating log record in module Main at /home/adona/dev/SINT/sint/sintjl/Sint/mve.jl:15
│   exception =
│    schedule: Task not runnable
│    Stacktrace:
│     [1] error(::String) at ./error.jl:33
│     [2] schedule(::Task, ::Any; error::Bool) at ./task.jl:591
│     [3] schedule at ./task.jl:586 [inlined]
│     [4] uv_writecb_task(::Ptr{Nothing}, ::Int32) at ./stream.jl:1051
│     [5] poptask(::Base.InvasiveLinkedListSynchronized{Task}) at ./task.jl:704
│     [6] wait at ./task.jl:712 [inlined]
│     [7] uv_write(::Base.TTY, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:933
│     [8] unsafe_write(::Base.TTY, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:1005
│     [9] unsafe_write at ./io.jl:622 [inlined]
│     [10] write(::Base.TTY, ::Array{UInt8,1}) at ./io.jl:645
│     [11] handle_message(::Logging.ConsoleLogger, ::Base.CoreLogging.LogLevel, ::String, ::Module, ::Symbol, ::Symbol, ::String, ::Int64; maxlog::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Logging/src/ConsoleLogger.jl:161
│     [12] handle_message at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Logging/src/ConsoleLogger.jl:100 [inlined]
│     [13] macro expansion at ./logging.jl:332 [inlined]
│     [14] shutdown(::String) at /home/adona/dev/SINT/sint/sintjl/Sint/mve.jl:15
│     [15] shtdown() at /home/adona/dev/SINT/sint/sintjl/Sint/mve.jl:21
│     [16] _atexit() at ./initdefs.jl:316
└ @ Main ~/dev/SINT/sint/sintjl/Sint/mve.jl:15

FileWatching is just my first problem; after that there will be db connections and sockets to close …

How do I correctly manage an ordered shutdown in this scenario?
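
One possible shape for an ordered shutdown, offered as a minimal sketch rather than an established recipe: keep a stack of cleanup closures and run them in reverse registration order, so that resources opened last are closed first. The names `CLEANUPS`, `register_cleanup!`, and `run_cleanups!` are hypothetical and not part of Base or FileWatching. The closures below only record their order, standing in for real calls such as unwatch_folder(dir) or closing a socket.

```julia
# Hypothetical registry of cleanup closures (not a Base or FileWatching API).
const CLEANUPS = Function[]

# Remember a zero-argument closure to run at shutdown.
register_cleanup!(f::Function) = push!(CLEANUPS, f)

# Run the closures in reverse registration order (last opened, first closed).
# A failing step is collected and does not prevent the remaining steps.
function run_cleanups!()
    failures = Any[]
    while !isempty(CLEANUPS)
        f = pop!(CLEANUPS)
        try
            f()
        catch e
            push!(failures, e)
        end
    end
    return failures
end

# Stand-ins for real resources; each closure only records that it ran.
closed = String[]
register_cleanup!(() -> push!(closed, "folder watch"))
register_cleanup!(() -> push!(closed, "db connection"))
register_cleanup!(() -> error("socket already closed"))

failures = run_cleanups!()
```

Calling `run_cleanups!()` from a `finally` block or from a single atexit hook would keep the order explicit. Whether logging is safe inside atexit after an interrupt is a separate matter, as the trace above shows.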


The above example works once SIGINT is made catchable with Base.exit_on_sigint:

Base.exit_on_sigint(false)
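
With exit_on_sigint(false), Ctrl-C is delivered as an InterruptException that ordinary try/catch/finally can handle. A minimal sketch of that pattern follows. `run_with_cleanup` is a hypothetical helper, and the interrupt is simulated by throwing InterruptException() so the example runs without a real signal.

```julia
# Make Ctrl-C raise InterruptException instead of terminating the process.
Base.exit_on_sigint(false)

# Hypothetical helper: run `work`, always run `cleanup`, and report how it ended.
function run_with_cleanup(work::Function, cleanup::Function)
    try
        work()
        return :finished
    catch e
        # Only treat interrupts as a normal stop; let other errors propagate.
        e isa InterruptException || rethrow()
        return :interrupted
    finally
        cleanup()   # runs on normal return, interrupt, and error alike
    end
end

cleaned = Ref(false)
# Simulated interrupt; in the real script `work` would be the watch loop.
result = run_with_cleanup(() -> throw(InterruptException()), () -> (cleaned[] = true))
```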

But a more convoluted example, similar to my real use case, still raises an uncaught exception:

using FileWatching

Base.exit_on_sigint(false)

function cb(args)
   @info "hello from timer cb"
end

function callme()
    Timer(cb, 3)
end

function watch(dir::String, callback::Function)
    try
        while true
            @info "again ..."
            #sleep(3)
            filename, event = watch_folder(dir, -1.0)
            @info "[$filename] event: $event"
            if filename !== ""
                callback()
            end
        end
    catch e
        @warn "error watchdir $e"
    end
end

dir = "/tmp"

# Timer(cb, 3)
watch(dir, callme)

After a file event is captured and the timer callback has run, Ctrl-C gives:

^Cfatal: error thrown and no exception handler available.
InterruptException()
jl_mutex_unlock at /buildworker/worker/package_linux64/build/src/locks.h:143    [inlined]
jl_task_get_next at /buildworker/worker/package_linux64/build/src/partr.c:476

I have the same problem here… I also use a Timer, which I think could be the cause of the problem. I’m quite new to the Julia language… This is my minimal example:

Base.exit_on_sigint(false)

loop = true

atexit() do
    @info "cleaning before exit..."
    # `global` is needed here: a plain assignment would create a new local variable
    global loop = false
end

function wkcb(args)
   @info "Wk CB"
end

function worker()
    try
        @info "Worker"
        t = Timer(wkcb, 2)
        wait(t)
        sleep(0.5)
        close(t)
        @info "Worker end"
    catch e
        @warn "worker error: $e"
    end
end

function run()
    try
        while loop
            @info "next start"
            @async worker()
            sleep(10)
        end
    catch e
        @warn "run error: $e"
    end
end

run()

When I press Ctrl-C I get:

❯ julia mydem.jl
[ Info: next start
[ Info: Worker
[ Info: Wk CB
[ Info: Worker end
^Cfatal: error thrown and no exception handler available.
InterruptException()
jl_mutex_unlock at /buildworker/worker/package_linux64/build/src/locks.h:143 [inlined]
jl_task_get_next at /buildworker/worker/package_linux64/build/src/partr.c:476
poptask at ./task.jl:704
wait at ./task.jl:712 [inlined]
task_done_hook at ./task.jl:442
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2214 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2398
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1690 [inlined]
jl_finish_task at /buildworker/worker/package_linux64/build/src/task.c:198
start_task at /buildworker/worker/package_linux64/build/src/task.c:717
unknown function (ip: (nil))
[ Info: cleaning before exit...
┌ Error: Exception while generating log record in module Main at /home/claudio/SD/jl/mydem.jl:7
│   exception =
│    schedule: Task not runnable
│    Stacktrace:
│     [1] error(::String) at ./error.jl:33
│     [2] schedule(::Task, ::Any; error::Bool) at ./task.jl:586
│     [3] schedule at ./task.jl:586 [inlined]
│     [4] uv_writecb_task(::Ptr{Nothing}, ::Int32) at ./stream.jl:1051
│     [5] poptask(::Base.InvasiveLinkedListSynchronized{Task}) at ./task.jl:704
│     [6] wait at ./task.jl:712 [inlined]
│     [7] uv_write(::Base.TTY, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:933
│     [8] unsafe_write(::Base.TTY, ::Ptr{UInt8}, ::UInt64) at ./stream.jl:1005
│     [9] unsafe_write at ./io.jl:622 [inlined]
│     [10] write(::Base.TTY, ::Array{UInt8,1}) at ./io.jl:645
│     [11] handle_message(::Logging.ConsoleLogger, ::Base.CoreLogging.LogLevel, ::String, ::Module, ::Symbol, ::Symbol, ::String, ::Int64; maxlog::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Logging/src/ConsoleLogger.jl:161
│     [12] handle_message at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Logging/src/ConsoleLogger.jl:100 [inlined]
│     [13] macro expansion at ./logging.jl:332 [inlined]
│     [14] (::var"#1#2")() at /home/claudio/SD/jl/mydem.jl:7
│     [15] _atexit() at ./initdefs.jl:316
└ @ Main ~/SD/jl/mydem.jl:7
Exception handling log message: ⏎

Any suggestion? Thanks.
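
A side note on the `loop` flag in the example above, as a minimal sketch: an assignment inside a function or do-block creates a new local variable unless the name is declared `global`, so a mutable container such as a `Ref` is one way to share a stop flag. `running`, `run_loop`, and `demo` are hypothetical names, and the timings are shortened so the example finishes quickly. The sketch only shows the flag mechanics and does not address the InterruptException crash.

```julia
# Hypothetical stop flag; a Ref can be mutated from any scope without `global`.
const running = Ref(true)

# Loop until the flag is cleared, sleeping briefly between iterations.
function run_loop(flag::Ref{Bool}; period::Real = 0.05)
    iterations = 0
    while flag[]
        iterations += 1
        sleep(period)
    end
    return iterations
end

# Start the loop as a task, then request a stop the way a shutdown hook would.
function demo()
    task = @async run_loop(running)
    sleep(0.2)
    running[] = false
    return fetch(task)
end

n = demo()
```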
