How to edit source code of Julia

I believe I may have found a bug in the Julia source code, but I can’t pinpoint the exact line (see the topic "There is a bug in this function and I can't figure out what it is"). What I want to do is modify the Distributed.jl module in the Julia source code so that whenever I write using Distributed (or when a package calls using Distributed), my edited code is used.

This is what I’ve done so far. I’ve copied the actual source files from the Julia source tree into a new folder. Then I include Distributed.jl from that folder and call using .Distributed.

using Revise
includet("Distributed.jl")
using .Distributed
using ClusterManagers

a = addprocs(SlurmManager(2))

However, this doesn’t work, because ClusterManagers also calls using Distributed and gets the original stdlib module, so my edited Distributed != ClusterManagers.Distributed.
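
To illustrate the mismatch described above, here is a minimal sketch with toy modules. The names Upstream, Consumer and Patched are hypothetical stand-ins (for the stdlib module, for a package such as ClusterManagers, and for the edited copy); nothing here touches the real Distributed.

```julia
# Toy stand-in for the original module (hypothetical name).
module Upstream
    greet() = "original"
end

# Toy stand-in for a package that loads the original module itself.
module Consumer
    import ..Upstream
    call() = Upstream.greet()
end

# An edited copy living under a different parent, similar in spirit to
# including a copied file and then writing `using .Distributed`.
module Patched
    module Upstream
        greet() = "patched"
    end
end

# The two modules share a name but are distinct objects.
same_module = Patched.Upstream === Upstream
println(same_module)                 # false
println(Consumer.call())             # original
println(Patched.Upstream.greet())    # patched
```

The consumer keeps calling the module it imported, so an edited copy under another name or parent never reaches it.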

Is the only way to edit the Julia source code and recompile?

You should build Julia from source. Then edit the file in question and do:

using Revise, Distributed
Revise.track(Distributed)
# check that it works

If it works, then rebuild Julia to bake the changes into the system-image. (And contribute your fix upstream with a PR).

I don’t know if there are ways to easily do this when you’re using the Julia binaries.

@mauro3 Thanks. I do see this in the documentation (System Image Building · The Julia Language).

Just a few clarifying questions, though. I would need to build the system image on my cluster, where I don’t have root access, so everything would have to be done in my home directory. Is this possible?

I also don’t want to mess with the existing Julia binary installation or its packages (it’s a cluster with 18 nodes connected by InfiniBand, so the same binary is installed on all 18 nodes).

Is there a way to rebuild Base into a library and use the existing Julia binary installed systemwide (clusterwide) to call that shared library?

I’d try to find the bug on a local install, if at all possible.

It looks like you’re just updating one function. In that case you can simply redefine the method in question by hand:

Distributed.message_handler_loop(r_stream::IO, w_stream::IO, incoming::Bool) = println("This version does nothing")

and make sure that the update is evaluated on all nodes. (Not sure how Revise would work across several nodes; that could be tricky.)

I tried that, but there are functions in Distributed that use global variables defined inside the Distributed module. Every time I redefine the function, it complains that the global variable is not found. I suppose that when I redefine the function after import Distributed.message_handler_loop, the new method body is evaluated in Main, where the global variable is not defined.

I also realized that Distributed is not in Base but in the stdlib. I wonder whether the process of rebuilding a stdlib is the same as rebuilding Base.

Ok, then use eval to directly evaluate code inside a module:

julia> @eval Distributed message_handler_loop(r_stream::IO, w_stream::IO, incoming::Bool) = println("This version does nothing")
message_handler_loop (generic function with 1 method)

julia> Distributed.message_handler_loop(stdout, stdout, true)
This version does nothing

Again, make sure to do this on all nodes with @everywhere.
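
Here is a small sketch of why the two approaches above behave differently. It uses a toy module; Toy, SECRET and handler are hypothetical names, not anything from Distributed. The point is only where the globals in the new method body get looked up.

```julia
module Toy
    const SECRET = 42        # module-internal global, not exported
    handler() = SECRET
end

# Redefined from Main with a qualified name: the method is attached to
# Toy.handler, but its body looks up globals in Main, where SECRET is undefined.
Toy.handler() = SECRET + 1
result = try
    Toy.handler()
catch err
    err
end
println(result isa UndefVarError)   # true

# Redefined with @eval inside Toy: the body now looks up globals in Toy.
@eval Toy handler() = SECRET + 1
println(Toy.handler())              # 43
```

So with @eval Distributed ..., the replacement method should be able to see the module's internal globals, assuming it refers to them by the same names as the original code.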

Thanks. I’ll try that.

Side note: I can’t use @everywhere, since the bug prevents the system from spawning and connecting to the workers properly. It is addprocs() that fails.