Distributed testing

I am developing a set of ]dev’d packages. If I test them in serial, it takes about 2 hours.

I have been doing this instead:

# The script is intended to be run from `parent_dir/test`.
using Distributed
addprocs(7)

@everywhere using Pkg
# It starts by activating the parent environment, which contains all the child packages
@everywhere Pkg.activate("..")

# It then tests them in parallel
packages = ["list","of","packages"]
tasks = Vector{Future}(undef, length(packages))  # @spawnat returns a Future, not a Task
for (i, package) in enumerate(packages)
    tasks[i] = @spawnat :any Pkg.test(package; test_args=ARGS, coverage = "--coverage" in ARGS)
end

results = fetch.(tasks)

errs = [i for (i, val) in enumerate(results) if isa(val, RemoteException) ]

if !isempty(errs)
    println("errors were thrown, here is one:")
    throw.(results[errs])
end

It generally works, but sometimes a worker throws an error saying that it cannot find one of the locally ]dev’d packages, even though that package definitely exists in the parent environment’s manifest. I can’t make a reproducible example that I can share externally, but it’s very repeatable internally.

Could this be some kind of collision between test environments? I’m under the impression that Pkg.test generates a random name for the temp folder it uses, but I’m not sure.

Additionally, and possibly unrelatedly, it was much more reliable on Julia 1.3.1 and has become a real problem since this morning, when I upgraded to 1.5.0-rc2.

Edit: I got the tests to pass by setting the number of workers high enough that each Pkg.test call happened on a different worker.
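
For anyone who wants to try the same workaround, here is a minimal sketch of pinning each Pkg.test call to its own worker instead of relying on `:any`. The helper names `one_worker_per_package` and `test_each_on_own_worker` are made up for illustration, and the sketch assumes the workers have already been added and have activated the parent environment as in the script above.

```julia
using Distributed, Pkg

# Hypothetical helper: pair each package with a distinct worker id.
# Errors early if there are fewer workers than packages.
function one_worker_per_package(packages, worker_ids)
    length(worker_ids) >= length(packages) ||
        error("need at least $(length(packages)) workers, got $(length(worker_ids))")
    return collect(zip(packages, worker_ids))
end

# Hypothetical driver: run Pkg.test for each package on its assigned worker.
# Assumes `addprocs(length(packages))`, `@everywhere using Pkg` and
# `@everywhere Pkg.activate("..")` have already been run.
function test_each_on_own_worker(packages; kwargs...)
    assignments = one_worker_per_package(packages, workers())
    futures = [remotecall(Pkg.test, w, pkg; kwargs...) for (pkg, w) in assignments]
    return fetch.(futures)
end
```

With that in place, the loop in the script could be replaced by something like `results = test_each_on_own_worker(packages; test_args=ARGS)`. This only guarantees that no worker runs two Pkg.test calls; it does not explain why sharing a worker fails.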