PythonCall.jl with CondaPkg.jl in MPI scenario

Hi everyone,

I’m working on a Julia package that depends on PythonCall.jl and CondaPkg.jl.

Now I want to run some calculations with my package on a cluster using MPI.jl. The problem, as I understand it, is that whenever I load my package, CondaPkg.jl has to resolve the environment, and this cannot be done in parallel: the resolve function in CondaPkg.jl explicitly creates a lock file, so I get a lot of messages like

Info: CondaPkg: Waiting for lock to be freed. You may delete this file if no other process is resolving.

This means that for a large world size, my compute job times out because it takes too long to resolve on every rank.

However, I know that nothing has changed in the conda environment since I last called resolve, so in theory (in my head) it should be possible to skip this step in that case.

Does anyone have an idea how I can solve this problem?
Thank you in advance :slight_smile:

Here is a link to the CondaPkg.jl resolve function: https://github.com/JuliaPy/CondaPkg.jl/blob/c0ee1ecac08f9281276fd759e32663f0dee93213/src/resolve.jl#L505C10-L505C17

See also: CondaPkg + PythonCall does not behave nicely on read-only filesystems · Issue #142 · JuliaPy/CondaPkg.jl

It looks like there is no option to disable the lockfile, but the author is open to a PR to add that feature.

Alternatively, you can simply manage your own Python installation and point PythonCall to that. That’s what I would recommend for now in a cluster environment.
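
As a rough sketch of that second option: PythonCall and CondaPkg read the environment variables `JULIA_CONDAPKG_BACKEND` and `JULIA_PYTHONCALL_EXE`, and setting the backend to `"Null"` should tell CondaPkg not to install or resolve anything. The interpreter path below is a hypothetical placeholder, and this assumes every Python package your `CondaPkg.toml` lists is already installed in that environment.

```julia
# Both variables must be set before PythonCall is loaded on each rank.
# "Null" asks CondaPkg to skip installing and resolving the environment.
ENV["JULIA_CONDAPKG_BACKEND"] = "Null"

# Hypothetical path: replace with the interpreter of your own pre-built
# Python environment that is visible from every compute node.
ENV["JULIA_PYTHONCALL_EXE"] = "/path/to/env/bin/python"

using PythonCall

# Sanity check: show which interpreter PythonCall actually picked up.
println(pyimport("sys").executable)
```

The same two variables can instead be exported in the job script before launching Julia, which avoids having to touch the package code.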

Thanks a lot! The issue that you pointed out essentially boils down to the same problem I’m experiencing. I guess I will try my luck on a PR.

Thanks again!