Flux cpu() type stability

While reviewing a loss function I wrote, I noticed that Flux's cpu() function introduces a type instability:

julia> using Flux

julia> @code_warntype cpu(rand(10))
MethodInstance for Flux.cpu(::Vector{Float64})
  from cpu(x) in Flux at ~/.julia/packages/Flux/qAdFM/src/functor.jl:146
Arguments
  #self#::Core.Const(Flux.cpu)
  x::Vector{Float64}
Locals
  #136::Flux.var"#136#137"
Body::Any
1 ─      (#136 = %new(Flux.:(var"#136#137")))
│   %2 = #136::Core.Const(Flux.var"#136#137"())
│   %3 = Flux.fmap(%2, x)::Any
└──      return %3

This Any propagated throughout the loss function, which probably isn't good, though I haven't benchmarked it yet. Is this a known issue? For reference, I'm on Flux v0.12.9.

Thanks in advance for any insight!


Did a quick check, also on Flux 0.12.9, and the instability does seem to have a measurable effect when the calculation cost is small; for larger matrices the difference diminishes. I couldn't find anything with a quick search of the issues and PRs on GitHub, so it may be worth opening one there.

julia> a = randn(5, 5);

julia> b = randn(5, 5);

julia> f1(a, b) = a * b
f1 (generic function with 2 methods)

julia> f2(a, b) = a * cpu(b)
f2 (generic function with 2 methods)

julia> @code_warntype f1(a, b)
MethodInstance for f1(::Matrix{Float64}, ::Matrix{Float64})
  from f1(a, b) in Main at REPL[39]:1
Arguments
  #self#::Core.Const(f1)
  a::Matrix{Float64}
  b::Matrix{Float64}
Body::Matrix{Float64}
1 ─ %1 = (a * b)::Matrix{Float64}
└──      return %1


julia> @code_warntype f2(a, b)
MethodInstance for f2(::Matrix{Float64}, ::Matrix{Float64})
  from f2(a, b) in Main at REPL[40]:1
Arguments
  #self#::Core.Const(f2)
  a::Matrix{Float64}
  b::Matrix{Float64}
Body::Any
1 ─ %1 = Main.cpu(b)::Any
│   %2 = (a * %1)::Any
└──      return %2


julia> @btime f1($a, $b)
  190.318 ns (1 allocation: 256 bytes)
5×5 Matrix{Float64}:
  0.473592   0.450893  -2.49009   1.48155     4.01978
 -0.636979   0.493621  -1.71473   0.0855036   3.45803
  0.806991   1.10065   -3.76954   2.63519     4.33381
  1.01176    0.280512  -1.57056   2.21027    -0.41367
 -1.80214   -1.33488    4.65632  -3.00634    -1.24369

julia> @btime f2($a, $b)
  298.087 ns (3 allocations: 592 bytes)
5×5 Matrix{Float64}:
  0.473592   0.450893  -2.49009   1.48155     4.01978
 -0.636979   0.493621  -1.71473   0.0855036   3.45803
  0.806991   1.10065   -3.76954   2.63519     4.33381
  1.01176    0.280512  -1.57056   2.21027    -0.41367
 -1.80214   -1.33488    4.65632  -3.00634    -1.24369

Worth checking against an older version (v0.11, v0.12.3) and master as well. In most cases cpu shouldn't need to be called very frequently (especially in code that needs to be differentiated). It may be that for small enough arrays the instability adds inference cost at runtime. For the GPU path to be worthwhile, the operations would then need to remain runtime-competitive even when the smaller array is on the GPU, I suppose.
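
To make the "not called frequently" point concrete, here is a minimal sketch (mine, not from Flux) of hoisting the unstable call behind a function barrier, so the dynamic dispatch is paid once per call rather than polluting every downstream operation:

```julia
using Flux

# Inside kernel(), a and b have concrete types, so everything infers.
kernel(a, b) = a * b

function run_once(a, b)
    b_cpu = cpu(b)           # type-unstable: inferred as Any
    return kernel(a, b_cpu)  # barrier: Julia dispatches once on the
                             # runtime type, then runs specialized code
end
```
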


I checked v0.12.3 and got the same result. It's honestly not a big deal, because I can hard-code the type to Float32 for now. In most cases I think you're right that cpu() isn't called often, but I was following this example about variational autoencoders, where cpu() or gpu() is called every time the loss function is called, so it might have some impact there. I'll raise an issue on their GitHub just as an FYI. Thank you for the help!
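
For anyone else who hits this, a minimal sketch of the hard-coded-type workaround, reusing the f2 example from above. The ::Matrix{Float64} assertion is an assumption about what cpu returns for this input; with it, @code_warntype infers Body::Matrix{Float64} instead of Any:

```julia
using Flux

a = randn(5, 5);
b = randn(5, 5);

# Asserting the concrete return type of cpu() stops the Any from
# propagating through the rest of the function.
f2_stable(a, b) = a * (cpu(b)::Matrix{Float64})
```
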