Should Julia be able to (really) kill a worker?

Henrique_Becker · May 16, 2020, 1:05am

There is a Base.kill method that kills a Process. Distributed has an rmprocs method that tries to kill a worker, but it fails in many situations.

An rmprocs call cannot kill a worker that is running Julia code that never yields, or that is running C code underneath. There is no way (AFAIK) of getting a Process object from a worker. Base.Libc.getpid can be run on the worker, but there is no way (AFAIK) to get a Process from the resulting pid, so Base.kill cannot be called with it.

Underneath Base.kill there is a ccall(:uv_process_kill, Int32, (Ptr{Cvoid}, Int32), p.handle, signum), which seems to be equivalent to ccall(:uv_kill, Cint, (Cint, Cint), worker_pid, signum) (where worker_pid is the result of calling Base.Libc.getpid inside the worker). The ccall(:uv_kill, ...) already works in Julia because libuv is used internally, but that is an implementation detail, so it should not be relied upon.
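To make the idea concrete, here is a minimal sketch of the workaround described above. The helper names record_worker_pids and force_kill are hypothetical (they are not part of Base or Distributed), and the sketch assumes the workers run on the same machine as the master, since a pid only means something on the host that issued it. It also assumes the pids are recorded while the workers are still idle, because a worker stuck in non-yielding code cannot answer a remote call.

```julia
using Distributed

# Hypothetical helper: record each worker's OS pid while it is still idle.
# A worker stuck in non-yielding code could not answer this call later.
record_worker_pids(ws) = Dict(w => remotecall_fetch(getpid, w) for w in ws)

# Hypothetical helper: send `signum` to an OS pid through libuv's uv_kill.
# Returns 0 on success or a negative libuv error code.
# This relies on libuv symbols being reachable from Julia, which is an
# implementation detail rather than a documented API.
force_kill(pid::Integer, signum::Integer = Base.SIGKILL) =
    ccall(:uv_kill, Cint, (Cint, Cint), pid, signum)

# Possible usage (local workers only):
# ws   = addprocs(1)
# pids = record_worker_pids(ws)
# force_kill(pids[ws[1]])
```

This is exactly the kind of code I would rather not have to write myself, which is the motivation for the question below.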

My question (finally) is: since there is machinery inside Julia to kill a process in an OS-independent way, and it will probably stay (as Base.kill needs to be implemented somehow), would it not be nice to extend it just a little so that it can kill Distributed workers, or anything else in which you are able to run Base.Libc.getpid?

Sorry if I am missing something obvious.
