Flux and CPU cores

johnbb · September 1, 2020, 9:29am

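I ran a few tests, and they confirm that BLAS.set_num_threads(1) should be called when training several models in parallel with Threads.@threads. On my system the threaded code ran ~10 times faster with it (1.076 s vs. 10.864 s), and about 6 times faster than the serial loop (6.286 s). Without it, the threaded version was actually slower than the serial one. Code and results are below.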

# Test 1 (serial baseline, default BLAS threads): start Julia with JULIA_NUM_THREADS=1 julia
using Flux
using BenchmarkTools
using LinearAlgebra
n = 100_000
p = 50
x = rand(Float32, p, n)
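# Caution: y is a length-n vector while the model output is 1×batch, so on Flux versions without a size check
# the mse below broadcasts to a batch×batch matrix. rand(Float32, 1, n) would give matching shapes.
# The timings in this post were measured with the vector as written.
y = rand(Float32, n)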
trdata = Flux.Data.DataLoader(x, y, batchsize=100)
m = [Chain(Dense(p, 100), Dense(100,100), Dense(100,1)) for i in 1:4]
@btime for i in 1:4
    loss(x, y) = Flux.mse(m[i](x), y)
    Flux.@epochs 1 Flux.train!(loss, Flux.params(m[i]), trdata, Flux.ADAM())
end
#  6.286 s (1992500 allocations: 2.24 GiB)

# Test 2 (4 Julia threads, default BLAS threads): start Julia with JULIA_NUM_THREADS=4 julia
using Flux
using BenchmarkTools
using LinearAlgebra
n = 100_000
p = 50
x = rand(Float32, p, n)
y = rand(Float32, n)
trdata = Flux.Data.DataLoader(x, y, batchsize=100)
m = [Chain(Dense(p, 100), Dense(100,100), Dense(100,1)) for i in 1:4]
@btime Threads.@threads for i in 1:4
    loss(x, y) = Flux.mse(m[i](x), y)
    Flux.@epochs 1 Flux.train!(loss, Flux.params(m[i]), trdata, Flux.ADAM())
end
#  10.864 s (1992523 allocations: 2.24 GiB)

# Test 3 (4 Julia threads, 1 BLAS thread): start Julia with JULIA_NUM_THREADS=4 julia
using Flux
using BenchmarkTools
using LinearAlgebra
BLAS.set_num_threads(1)
n = 100_000
p = 50
x = rand(Float32, p, n)
y = rand(Float32, n)
trdata = Flux.Data.DataLoader(x, y, batchsize=100)
m = [Chain(Dense(p, 100), Dense(100,100), Dense(100,1)) for i in 1:4]
@btime Threads.@threads for i in 1:4
    loss(x, y) = Flux.mse(m[i](x), y)
    Flux.@epochs 1 Flux.train!(loss, Flux.params(m[i]), trdata, Flux.ADAM())
end
#  1.076 s (1992515 allocations: 2.24 GiB)
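
The pattern in the last run can be sketched without Flux. This is only a rough illustration, not a definitive recipe: `with_blas_threads` and `parallel_matmuls` are hypothetical helper names of my own, the matrix products are a stand-in for the Dense layers, and `BLAS.get_num_threads` assumes Julia 1.6 or newer. The idea is to drop to one BLAS thread only while Julia's own threads are doing the parallel work, then restore the previous setting.

```julia
using LinearAlgebra

# Hypothetical helper (not part of Flux or LinearAlgebra): run `f` with a
# temporary BLAS thread count, then restore the previous setting.
function with_blas_threads(f, n::Integer)
    old = BLAS.get_num_threads()   # assumes Julia 1.6 or newer
    BLAS.set_num_threads(n)
    try
        return f()
    finally
        BLAS.set_num_threads(old)
    end
end

# Stand-in workload: each loop iteration does its own dense matrix products,
# much like the independent models in the benchmarks above.
function parallel_matmuls(ntasks::Integer; dim::Integer = 200, reps::Integer = 50)
    results = Vector{Float32}(undef, ntasks)
    Threads.@threads for i in 1:ntasks
        A = fill(Float32(i), dim, dim)      # deterministic input
        B = Matrix{Float32}(I, dim, dim)    # identity, so A * B == A
        C = A
        for _ in 1:reps
            C = C * B                       # dispatches to BLAS gemm
        end
        results[i] = C[1, 1]
    end
    return results
end

# Use one BLAS thread per call when Julia itself already runs several threads;
# otherwise keep whatever BLAS is currently using.
nblas = Threads.nthreads() > 1 ? 1 : BLAS.get_num_threads()
out = with_blas_threads(() -> parallel_matmuls(4), nblas)
```

Restoring the old count in `finally` means later single-threaded code that does benefit from multithreaded BLAS (large matrix products, factorizations) is not slowed down by a global `BLAS.set_num_threads(1)`.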
