Flux model on CPU runs slowly

AquaIndigo · October 4, 2020, 2:22am

I find that my Flux model runs much more slowly on the CPU than on the GPU:

julia> m = Chain(
           Dense(2250, 500, σ),
           Dense(500, 50, tanh),
           Dense(50, 7, σ));

julia> X = Float32.(X);

julia> size(X)
 (2250, 484)

julia> @btime m(X);
  10.718 ms (10 allocations: 2.06 MiB)

julia> X_gpu = X |> gpu;

julia> m_gpu = m |> gpu;

julia> @btime m_gpu(X_gpu);
  21.864 μs (134 allocations: 3.80 KiB)

Training the model on the CPU was also intolerably slow, and I am really confused.

ChrisRackauckas · October 4, 2020, 2:36am

That’s a really big matmul for a GPU. It should be about the same cost on any CPU implementation, since it’s all going to be in a BLAS kernel.
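One way to check the BLAS point is to time the first layer's matrix product on its own, outside Flux. The sketch below is only an illustration: it uses random `Float32` arrays with the shapes from the post (`W` and `X` here are stand-ins, not the original weights or data) and assumes Julia 1.6 or newer for `BLAS.get_num_threads`.

```julia
using LinearAlgebra

# Stand-ins with the shapes from the thread: Dense(2250, 500) applied to a 2250×484 batch.
W = rand(Float32, 500, 2250)
X = rand(Float32, 2250, 484)

# Number of threads BLAS may use; a value of 1 on a multi-core machine
# is one possible reason for a slow CPU matmul.
println("BLAS threads: ", BLAS.get_num_threads())

W * X                        # warm-up call so compilation is not timed
t = @elapsed W * X           # single-shot timing; BenchmarkTools' @btime is more careful

# A dense (500×2250) * (2250×484) product costs about 2*500*2250*484 floating-point operations.
flops = 2 * 500 * 2250 * 484
println("matmul: ", round(t * 1e3; digits = 3), " ms, ",
        round(flops / t / 1e9; digits = 1), " GFLOP/s")
```

If this bare product takes about as long as the whole `m(X)` call, the time is being spent in BLAS rather than in Flux itself.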

AquaIndigo · October 4, 2020, 2:52am

And when I trained the same PyTorch model on the CPU, it seemed much faster than the Flux model on the CPU.

Oscar_Smith · October 4, 2020, 2:54am

Julia’s tanh is fairly slow, but the bottleneck really should be the matmul.
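A rough way to compare the two costs for this model, using only the standard library. This is a sketch under assumptions: `best_time` is a hypothetical helper written for this example, and `W1`, `X`, and `H` are random stand-ins with the shapes implied by the post (the tanh layer outputs a 50×484 array, so only 24,200 tanh calls per forward pass).

```julia
using LinearAlgebra

# Hypothetical helper: best of `reps` timings of a zero-argument function.
function best_time(f; reps = 20)
    f()                      # warm-up call so compilation is excluded
    best = Inf
    for _ in 1:reps
        t = @elapsed f()
        best = min(best, t)
    end
    return best
end

W1 = rand(Float32, 500, 2250)   # stand-in for the first layer's weights
X  = rand(Float32, 2250, 484)   # stand-in for the input batch
H  = randn(Float32, 50, 484)    # stand-in for the tanh layer's pre-activations

t_matmul = best_time(() -> W1 * X)
t_tanh   = best_time(() -> tanh.(H))

println("first-layer matmul: ", round(t_matmul * 1e6; digits = 1), " μs")
println("elementwise tanh:   ", round(t_tanh * 1e6; digits = 1), " μs")
```

Comparing the two printed times on the machine in question shows which operation dominates the forward pass.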
