Adding @sync or (vcat) to @distributed produces error

I am preparing my code to run on a SLURM cluster, using the Distributed package to exploit multiple cores. The code is a bit involved, and I haven’t been able to reproduce the error in an MWE.

The point is to run a number of simulations with the function simulate, each from a different initial state (randomly sampled in this part of the project). I then want to stack the DataFrames the simulations produce into a single DataFrame, which is why I use the (vcat) reducer and assign the result with “df =”.

This fails with an error saying that randomInitialState is not defined on worker 2.
However, if I run the for loop inline (in Juno), the code runs as expected.
Also, if I remove @sync, (vcat), and the assignment to df, the code runs.

I find this behavior surprising, and I haven’t found anyone else whose code breaks when they add @sync. I wonder if anyone has suggestions about what might be going on.
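
One thing that may be relevant to why the version without @sync and (vcat) appears to run: as far as the Distributed documentation describes it, @distributed without a reducer returns immediately with a Task, so a failure on a worker is not necessarily raised at the point where the loop is written. Below is a minimal sketch of that difference. It uses a toy integer loop rather than the real simulation, and the names total and task are illustrative assumptions, not part of the original program.

```julia
using Distributed

pids = addprocs(2)

# With a reducer, @distributed blocks until all workers have finished
# and returns the reduced value.
total = @distributed (+) for i in 1:10
    i^2
end
println(total)

# Without a reducer, @distributed returns right away with a Task;
# the loop body runs asynchronously on the workers.
task = @distributed for i in 1:10
    i^2
end
println(task isa Task)

wait(task)      # block until the asynchronous loop has finished
rmprocs(pids)
```

If that is what happens here, the reduced form would be the first place where a worker-side error becomes visible, rather than @sync itself introducing the error.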

A sketch of the program

using Distributed

pids = addprocs(2)
@everywhere begin
   using DataFrames
   include("loadsDataAndCreates_randomPath_below.jl")
   parameters = 1.0
end
# num_simulations is assumed to be defined before this point
df = @sync @distributed (vcat) for i in 1:num_simulations
   state0_i = randomInitialState()
   df_i = simulate(state0_i, parameters) # Returns a DataFrame; this last value is what (vcat) combines
end
rmprocs(pids)

The error is long, but starts with
LoadError: TaskFailedException
nested task error: On worker 2:
UndefVarError: #randomInitialState#229 not defined
Stacktrace:
[1] deserialize_datatype
@ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Serialization\src\Serialization.jl:1280
[2] handle_deserialize

I am at a loss as to how serialization (a new topic for me) is interfering with my code, and searching online has not helped.
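
For context on where serialization comes in: when work is sent to a worker, the function is serialized on the master process and deserialized on the worker, and as far as I understand a named top-level function is looked up by name on the worker rather than shipped with its code. The sketch below is not the code from this question and does not diagnose it. It uses two hypothetical functions, masterOnly and definedEverywhere, to reproduce the same kind of UndefVarError in isolation.

```julia
using Distributed

pids = addprocs(1)
w = first(pids)

# Hypothetical function defined on the master process only.
masterOnly(x) = x + 1

# Hypothetical function defined on every process, including the worker.
@everywhere definedEverywhere(x) = x + 1

# The worker knows this function, so the remote call succeeds.
ok = remotecall_fetch(definedEverywhere, w, 1)
println(ok)

# The worker cannot find masterOnly when deserializing the call,
# so the remote call throws on the master.
failed = try
    remotecall_fetch(masterOnly, w, 1)
    false
catch err
    true
end
println(failed)

rmprocs(pids)
```

This only shows the general mechanism. The name in the error above, #randomInitialState#229, is a compiler-generated name, so the actual cause in the real program may be more specific than a missing definition.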

I wasn’t able to resolve the original problem, but after reading https://portal.research.lu.se/portal/en/publications/parallel-computing-in-julia(873af4d5-6229-4ad2-b907-c0ae0f667822).html I converted the code to the following structure, and I now have a functioning program.

using Distributed

pids = addprocs(2)
@everywhere begin
   using DataFrames
   include("loadsDataAndCreates_randomPath_below.jl")
   parameters = 1.0
end
futures = Vector{Future}(undef, num_simulations)
for i in 1:num_simulations
   state0_i = randomInitialState() # Sampled on the master process
   futures[i] = @spawnat :any simulate(state0_i, parameters) # simulate returns a DataFrame
end
simulations = fetch.(futures) # Blocks until every simulation has finished
df = reduce(vcat, simulations) # Stack all DataFrames into a single DataFrame
rmprocs(pids)

I am not certain this performs as well as @sync @distributed would (I was surprised by how quickly the code ran inline, as described in the original question), but at least it runs faster than it did before.
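
On whether the two structures are interchangeable: they should produce the same stacked result, since @distributed with a reducer and fetching a vector of futures both wait for all workers before combining. Here is a small sketch under stated assumptions, with plain vectors standing in for DataFrames and the hypothetical stand-ins fakeInitialState and fakeSimulate in place of the real functions. It says nothing about relative speed.

```julia
using Distributed

pids = addprocs(2)

@everywhere begin
    # Hypothetical stand-ins for randomInitialState and simulate.
    # They are deterministic so that the two approaches can be compared.
    fakeInitialState(i) = i
    fakeSimulate(state, p) = [state * p, state * p + 1]
    parameters = 1.0
end

num_simulations = 4

# Approach 1: @distributed with a (vcat) reducer. With a reducer the
# macro already waits for the workers and returns the combined value.
stacked_distributed = @distributed (vcat) for i in 1:num_simulations
    fakeSimulate(fakeInitialState(i), parameters)
end

# Approach 2: one Future per simulation, fetched and stacked afterwards.
futures = map(1:num_simulations) do i
    @spawnat :any fakeSimulate(fakeInitialState(i), parameters)
end
stacked_futures = reduce(vcat, fetch.(futures))

# Compare contents without relying on the order of the pieces.
println(sort(stacked_distributed) == sort(stacked_futures))

rmprocs(pids)
```

One difference to keep in mind for the real program: in the futures version the initial state is sampled on the master process, while in the @distributed version it is sampled on the workers.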