Parallel data loading to GPU arrays



Does anyone have pointers on loading data into CuArrays with background tasks while the main thread or process is busy training a Flux model?

Flux seems to offer at least as much flexibility as PyTorch, so the two fill the same niche, and I would like to try Flux for my next project.

However, in my experience, for the kinds of loosely structured data that benefit most from that flexibility, loading and packaging the data for the model to consume quickly becomes the bottleneck. PyTorch provides multiprocess DataLoaders for this. How would I do the same in the Julia ecosystem?


Does @async not do it?
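
To make that concrete, here is a minimal sketch of the background-task idea using only Base Julia: a producer task fills a buffered Channel with batches while the consumer loop pulls them out. The names `load_batch`, `prefetch`, and `consume` are hypothetical, the batch contents are dummy data, and the `sleep` call only stands in for real disk reads. The host-to-GPU copy is left out; presumably it would go wherever the batch is taken off the channel. One caveat: a task started with @async runs on the same thread as its caller, so it overlaps I/O waits but not CPU-heavy preprocessing. Passing `spawn = true` (Julia 1.3 or later) lets the producer run on another thread when Julia is started with more than one.

```julia
# Hypothetical loader: stands in for slow disk reads and preprocessing.
function load_batch(i::Int)
    sleep(0.01)                      # simulate I/O latency
    return fill(Float32(i), 4, 2)    # dummy 4x2 batch filled with the value i
end

# Return a buffered Channel that a background task fills with batches.
# The channel is closed automatically when the producer task finishes.
function prefetch(nbatches::Int; buffer::Int = 4)
    return Channel{Matrix{Float32}}(buffer; spawn = true) do ch
        for i in 1:nbatches
            put!(ch, load_batch(i))  # blocks while the buffer is full
        end
    end
end

# Consumer: a stand-in for the training loop that just sums each batch.
function consume(nbatches::Int)
    total = 0.0f0
    for batch in prefetch(nbatches)  # iteration ends once the channel closes
        total += sum(batch)
    end
    return total
end
```

Iterating over `prefetch(n)` in the training loop keeps up to `buffer` batches ready ahead of the model. If preprocessing is heavy enough that threads are not sufficient, the Distributed standard library offers `RemoteChannel` for a multiprocess variant of the same pattern.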