How to use Dagger.jl's blocks to read a multifile NCDataset?


I’m trying to load a multifile dataset using NCDatasets, and to speed things up I am trying to use Dagger like this:

using NCDatasets, Dagger

include("test/test_catarrays.jl") # this creates some small test files and defines fnames
mfds = Dataset(fnames);           # open all files as a single multifile dataset
X = Distribute(Blocks(2,3,1), variable(mfds,"var"))
collect(sum(X,dims = 3))
# same results as
sum(variable(mfds,"var")[:,:,:],dims = 3)

This is something experimental in NCDatasets, and I saw it here. Now I’ve tried to use it to load my own variables, but I don’t really understand what Blocks means or how I should choose its arguments. So far I have only discovered that if I pass Blocks a different number of arguments than the number of dimensions of the array being read from the file, it throws an error. Beyond that, I can’t figure out how to make it work; sometimes it just hangs forever without completing the task.
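To make the question about Blocks concrete, here is a minimal sketch in plain Julia (no Dagger or NCDatasets needed). It assumes that each argument to Blocks is the chunk length along the corresponding dimension, which is my reading of the Dagger documentation and not something confirmed in this thread. `block_ranges` is a hypothetical helper written only for illustration, and the array size (4, 6, 5) is made up.

```julia
# Hypothetical helper (not part of Dagger or NCDatasets): list the index
# ranges of every chunk when an array of size `dims` is cut into chunks of
# size `blocksize`. Chunks at the edges are truncated when the block size
# does not divide the array size evenly.
function block_ranges(dims::NTuple{N,Int}, blocksize::NTuple{N,Int}) where {N}
    # For each dimension, build the list of index ranges along that dimension.
    per_dim = ntuple(N) do d
        [i:min(i + blocksize[d] - 1, dims[d]) for i in 1:blocksize[d]:dims[d]]
    end
    # Combine the per-dimension ranges into one tuple of ranges per chunk.
    return vec(collect(Iterators.product(per_dim...)))
end

# Assumed array size (4, 6, 5), with the block size from Blocks(2,3,1).
chunks = block_ranges((4, 6, 5), (2, 3, 1))
println(length(chunks))   # 2 * 2 * 5 = 20 chunks
println(first(chunks))    # (1:2, 1:3, 1:1)
```

Under that assumption, the requirement that Blocks receives one argument per array dimension follows naturally, and a block length of 1 along the third dimension would mean one chunk per index along that dimension.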

Does anyone have experience using Dagger and can shed some light on this?

Hey again! Two things:

  1. Can you post an MWE so we can easily reproduce what you’re trying to do? See: Please read: make it easier to help you

  2. Your two questions at the end of your post were accidentally included in your code block; you should move them outside the block so it’s obvious that you want people to answer them :stuck_out_tongue:

I will try to prepare an MWE. Feel free to remove the post and I can create it again with the full info!

No need to delete it, just edit the original post and then reply here so we know that it’s been updated :slightly_smiling_face: