Flux.jl and the state of multi-processing

I am trying to implement a convolutional autoencoder in Flux.
The model essentially consists of several convolutional layers, max pooling, and nearest-neighbor upsampling (using repeat(...; inner = (2, 2, 1, 1))). The issue is that the latter is not yet implemented to run on a GPU (GPU support for Base.repeat by americast · Pull Request #126 · JuliaGPU/GPUArrays.jl · GitHub).
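For context, a minimal sketch of what the repeat-based nearest-neighbor upsampling looks like on a WHCN array (width, height, channels, batch), with made-up toy data:

```julia
# Toy 2x2 single-channel, single-sample input in WHCN layout.
x = reshape(Float32.(1:4), 2, 2, 1, 1)

# 2x nearest-neighbor upsampling: each element is duplicated
# along the spatial dimensions (dims 1 and 2).
up = repeat(x; inner = (2, 2, 1, 1))

size(up)  # (4, 4, 1, 1)
```

This is the operation that currently has no GPU kernel, which is why the model falls back to the CPU.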

However, when I run Flux.jl on the CPU, only one core is used, despite setting JULIA_NUM_THREADS before starting Julia (as suggested here: Flux parallel execution):

export JULIA_NUM_THREADS=2
julia
# ...
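One way to confirm that the environment variable actually took effect is to query the thread count from inside the session (a minimal check; the expected value here assumes JULIA_NUM_THREADS=2 was exported as above):

```julia
using Base.Threads

# Reports how many threads this Julia session was started with.
# If JULIA_NUM_THREADS=2 was set before launching julia, this
# should print 2 -- otherwise the variable was not picked up.
println(nthreads())
```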

Is this the correct approach?
Does Flux.jl support multi-threading? Or does it rely solely on threaded BLAS implementations for dense layers, which are of no use for convolutional layers?

Is there any other machine learning framework able to use multiple CPUs in Julia?
Thanks a lot for any insights!


AFAIK, (almost) no Julia code used by Flux is multithreaded; all parallelization comes from external libraries such as BLAS.
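To illustrate the distinction: the BLAS thread count is configured separately from Julia's own threads, and it is this setting that governs the parallelism of dense (matrix-multiply-heavy) layers. A minimal sketch, with an assumed core count of 4:

```julia
using LinearAlgebra

# BLAS threads are independent of JULIA_NUM_THREADS.
# Assumption: 4 physical cores are available.
BLAS.set_num_threads(4)

A = rand(Float32, 1000, 1000)
B = rand(Float32, 1000, 1000)

# This matrix multiply dispatches to a multithreaded BLAS gemm,
# so it can use multiple cores even with a single Julia thread.
C = A * B
```

Convolutions in Flux do not go through BLAS gemm in the same way, which is why this setting does not help them.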


Thanks a lot for confirming.