Trying out Zarr.jl for comparison with NetCDF.jl
I ran into an issue with distributed writes, where workers were silently hanging.
I thought I could pass the data store created by zopen() via pmap(), similar to a SharedArray or DistributedArray, but it seems I have to get each worker to reopen the store.
Is this the best way to approach writing to a common data store?
using Zarr, Distributed
using Base.Iterators

# Spin up workers if needed
if nprocs() == 1
    addprocs(4, exeflags="--project=.")
    @everywhere begin
        # Activate environment and precompile
        using Pkg; Pkg.activate(@__DIR__)
        Pkg.instantiate(); Pkg.precompile()
        using Zarr
        z_fn = "./zdev.zarr"
    end
end

# Set up Zarr store
d_dims = (100, 100, 16, 4)
z1 = zcreate(Float32, d_dims..., path=z_fn, fill_value=0.0, chunks=(d_dims[1], d_dims[2], d_dims[3], 1))

@everywhere begin
    function simulate!(z_fn, reps, scen)
        # Each worker has to open the data store, else worker silently hangs
        zr = zopen(z_fn, "w")

        # Run "model" and store results
        for i in 1:reps
            zr[:, :, i, scen] .= rand(100, 100)
        end
    end
end

_ = pmap((x) -> simulate!(z_fn, x[1], x[2]), product(d_dims[3], 1:d_dims[4]));
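
For anyone reading along, here is a small sketch of what the pmap() call above iterates over, using only Base (no Zarr needed). As far as I understand it, a bare number iterates as a single element, so product(d_dims[3], 1:d_dims[4]) should give one (reps, scen) task per scenario. The chunk_of helper is hypothetical, just for illustration, and is not part of Zarr.jl; it assumes chunks are numbered from 1 along the last dimension.

```julia
using Base.Iterators: product

d_dims = (100, 100, 16, 4)
chunks = (d_dims[1], d_dims[2], d_dims[3], 1)

# A bare number iterates as a single element, so this yields
# one (reps, scen) tuple per scenario rather than 16 * 4 tuples
tasks = collect(product(d_dims[3], 1:d_dims[4]))

# Hypothetical helper (not a Zarr.jl function): index of the chunk
# along the last dimension that a given scenario falls into
chunk_of(scen) = cld(scen, chunks[4])

# Chunk touched by each task along the scenario dimension
touched = [chunk_of(scen) for (_, scen) in tasks]

println(tasks)
println(allunique(touched))
```

If that reading is right, each task writes to its own chunk along the last dimension, so the workers should not be competing for the same chunk with this layout.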