Is there any way to use cd with multithreading?

I’m currently working on a simulation server for running simulations with our simulation engine. The simulation input is uploaded via a Python Flask API, but the simulation code and the code handling execution are in Julia. Because this is a simulation server, it is supposed to make use of all the compute resources it is given, but the simulation itself is necessarily single-threaded, so the parallel computing comes down to being able to run multiple simulations at the same time.

The current architecture works like this: The API handles requests for uploads, starting simulations, fetching results, etc. and marks a run to be simulated. A separate Julia program runs in an endless loop, scanning directories for runs that have been marked for execution, then spawns a thread to handle each one and continues the loop.

I almost got this working with multithreading, but I ran into issues with CWD-relative file paths interfering between threads, in particular with the main thread that is running the scanning loop. Turns out cd(function, dir) changes the CWD for all threads. I’m sure there is a technical reason for that.

My options are now: 1.) refactor the simulation program to not use the CWD and always use the run directory as the base path, 2.) use multiprocessing instead, which I briefly tried but which brings its own bundle of issues to be solved, or 3.) somehow find a solution to keep the CWD separate for each thread.

I’m hoping for 3.) because both 1.) and 2.) are unknown but probably significant time investments. Any ideas?

I think on most operating systems, the current working directory is an operating system concept that belongs to the process as a whole, so it cannot differ between threads. Julia cannot offer you functionality that the operating system doesn’t support.
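To see the process-wide behaviour for yourself, here is a minimal sketch (the variable names are just for illustration). It changes the directory with cd(function, dir) and then asks a freshly spawned task what pwd() returns:

```julia
# Minimal sketch: the working directory set by cd(f, dir) is visible
# to other tasks, because it belongs to the whole process.
original = pwd()
dir = mktempdir()

seen = cd(dir) do
    # The spawned task did not call cd itself, yet it sees the new CWD.
    fetch(Threads.@spawn pwd())
end

# realpath guards against symlinked temp directories (e.g. on macOS).
println(realpath(seen) == realpath(dir))
# cd(f, dir) restores the previous directory once f returns.
println(pwd() == original)
```

So any task that resolves a relative path while another task is inside such a cd block will resolve it against the wrong directory.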

But, you’re in luck. Don’t use relative paths when speaking to the operating system. Instead, use absolute paths.

So, the first thing to refactor is to replace all uses of relative paths by

Base.Filesystem.joinpath(my_pwd(), relative_path)

i.e. you need to add a prefix.

Now the only question is how to conveniently set the current pseudo-working directory. Since you like the cd(function, dir) API, you will need to change all such calls to

my_cd(fun, dir)

Now, how to implement that thing?

Ideally you use ScopedValue (part of Base since Julia 1.11):

julia> const MY_CWD = Base.ScopedValues.ScopedValue("");
julia> my_cd(fun, dir) = Base.ScopedValues.with(fun, MY_CWD => Base.Filesystem.joinpath(my_pwd(), dir));
julia> my_pwd() = MY_CWD[]

That way this works appropriately if your tasks spawn more tasks.
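Here is a rough end-to-end sketch of how that could look with several runs executing concurrently. It assumes Julia 1.11 or newer; run_simulation and my_path are hypothetical names standing in for your own code, and the temporary directories only exist to make the example self-contained:

```julia
using Base.ScopedValues: ScopedValue, with   # needs Julia 1.11+

const MY_CWD = ScopedValue("")
my_pwd() = MY_CWD[]
my_cd(fun, dir) = with(fun, MY_CWD => joinpath(my_pwd(), dir))

# Hypothetical helper: the prefix you add wherever a relative path was used.
my_path(relative_path) = joinpath(my_pwd(), relative_path)

# Hypothetical stand-in for one single-threaded simulation run.
function run_simulation(run_dir)
    my_cd(run_dir) do
        sleep(0.1)  # pretend to do some work
        write(my_path("result.txt"), "done")
        # A task spawned inside the scope inherits the pseudo-CWD.
        child_view = fetch(Threads.@spawn my_pwd())
        (my_pwd(), child_view)
    end
end

base = mktempdir()
run_dirs = [joinpath(base, "run$i") for i in 1:3]
foreach(mkpath, run_dirs)

# One task per run; each sees only its own pseudo-working directory.
tasks = [Threads.@spawn run_simulation(d) for d in run_dirs]
results = fetch.(tasks)
```

Each task writes result.txt into its own run directory, and the scanning loop outside any my_cd block still sees the default value of MY_CWD. The real process CWD is never touched.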

On the other hand, you will have a painful time if you previously used the cd(::String) API.
