I am trying to implement a convolutional autoencoder in Flux.
The model consists essentially of several convolutional layers, max pooling, and nearest-neighbor interpolation (using
repeat(...; inner = (2,2,1,1))). The issue is that the latter is not yet implemented to run on a GPU (https://github.com/JuliaGPU/GPUArrays.jl/pull/126).
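For reference, here is a minimal sketch of the upsampling step I mean. It uses only Base Julia; the 2×2 toy input and the variable names x and y are made up for illustration and are not my actual model:

```julia
# Toy input in Flux's WHCN layout: width × height × channels × batch.
x = reshape(Float32.(1:4), 2, 2, 1, 1)

# Nearest-neighbor upsampling by a factor of 2: repeat every element
# twice along width and height, leaving channels and batch untouched.
y = repeat(x; inner = (2, 2, 1, 1))

println(size(y))   # (4, 4, 1, 1)
```

Each input pixel becomes a 2×2 block in the output, which is what I use to undo the max pooling in the decoder.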
However, when I run Flux.jl on the CPU, only one core is used despite setting
JULIA_NUM_THREADS before starting Julia (as suggested here: Flux parallel execution):
    export JULIA_NUM_THREADS=2
    julia # ...
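To check whether the setting is picked up at all, the thread counts can be inspected from within the session. This is a minimal sketch using only Base and the LinearAlgebra standard library; it assumes Julia 1.6 or later, where BLAS.get_num_threads is available:

```julia
using LinearAlgebra

# Number of Julia threads; this reflects JULIA_NUM_THREADS at startup.
println("Julia threads: ", Threads.nthreads())

# BLAS keeps its own thread pool, which is configured separately
# (for example with BLAS.set_num_threads) and not by JULIA_NUM_THREADS.
println("BLAS threads:  ", BLAS.get_num_threads())
```

If the first number is larger than 1 and still only one core is busy, the variable was picked up but the code being run is presumably not multi-threaded.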
Is this the correct approach?
Does Flux.jl support multi-processing? Or does it rely only on threaded BLAS implementations for dense layers, which are of no use for convolutional layers?
Is there any other machine learning framework in Julia that can use multiple CPU cores?
Thanks a lot for any insights!