[ANN] DataLoaders.jl (alpha) - basically PyTorch's parallel `DataLoader`

Happy to give a first look at an (as yet unregistered) package, DataLoaders.jl, which has a similar API to PyTorch’s DataLoader.

See examples in the README.md
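To give a rough idea of the PyTorch-like API, here is a minimal sketch; it assumes the `DataLoader(data, batchsize)` constructor and automatic batch collation, so treat the details as assumptions and refer to the README for the canonical examples.

```julia
using DataLoaders

# Any data container works; here, a plain vector of (input, label) samples.
data = [(rand(Float32, 128, 128, 3), rand(1:10)) for _ in 1:256]

# Iterate over batches of 16 samples, prepared in parallel on worker threads
# (start Julia with multiple threads, e.g. JULIA_NUM_THREADS, to benefit).
for batch in DataLoader(data, 16)
    # `batch` holds 16 collated observations; run a training step on it here.
end
```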

I wrote this primarily to support deep learning pipelines that load images, apply heavy preprocessing, and batch the samples. To keep this from slowing down training, the loading and preprocessing have to run on multiple threads and without blocking the training loop.
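To make that use case more concrete, here is a hedged sketch (not taken from the package) of the kind of lazy data container this is meant for: each observation is only loaded and preprocessed when requested, so the expensive work can be spread over the loader's worker threads. The `getobs`/`nobs` container interface and the `loadimage`/`preprocess` helpers are assumptions for illustration, not the package's API.

```julia
import LearnBase: getobs, nobs

# Stand-ins for the expensive parts of a real pipeline (hypothetical helpers).
loadimage(path) = rand(Float32, 256, 256, 3)          # pretend to read an image file
preprocess(img) = img[1:224, 1:224, :] .* 2f0 .- 1f0   # pretend to augment/normalize

struct ImageDataset
    paths::Vector{String}   # one image file per observation
end

# Observations are produced lazily, so worker threads can share the work.
nobs(ds::ImageDataset) = length(ds.paths)
getobs(ds::ImageDataset, i::Int) = preprocess(loadimage(ds.paths[i]))

# A container like this could then be batched without blocking the training loop,
# e.g. `for batch in DataLoader(ImageDataset(paths), 16) ... end`.
```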

A few months ago I asked whether something like this already existed in the Julia ecosystem, but did not find anything that suited my needs, so I decided to roll my own.

Note: in a recent release of Flux.jl a DataLoader was added, but as far as I can see, the implementation is neither parallel nor non-blocking.

I’d be happy to hear your feedback and criticism, and whether you find the package helpful.
