Any suggestions for speeding this simple scalar function up?

lmiq · August 20, 2021, 9:04pm

You seem to have tried some avx variant, but is it actually in the code you posted? Here @turbo makes a huge difference:

julia> using LoopVectorization, BenchmarkTools

julia> function EOS1(rho;c0=100,gamma=7,rho0=1000)
           b=(c0^2*rho0)/gamma
           P = b*((rho/rho0).^gamma .- 1)
       end
EOS1 (generic function with 1 method)

julia> function EOS1_turbo(rho;c0=100,gamma=7,rho0=1000)
           b=(c0^2*rho0)/gamma
           P = @turbo b*((rho/rho0).^gamma .- 1)
       end
EOS1_turbo (generic function with 1 method)

julia> EOS1(rho) ≈ EOS1_turbo(rho)
true

julia> @btime EOS1($rho);
  68.043 μs (3 allocations: 23.81 KiB)

julia> @btime EOS1_turbo($rho);
  1.809 μs (3 allocations: 23.81 KiB)

If this will be called from within a hot loop, you probably want to preallocate the output vector P and fill it in place (b is just a scalar, so it costs nothing).
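A minimal sketch of what preallocating could look like. The name `EOS1!` and the test values for `rho` are made up for illustration, and plain `@.` is used so the snippet runs without extra packages; prefixing the broadcast with `@turbo` as elsewhere in this thread is the natural next step, assuming LoopVectorization is loaded.

```julia
# In-place variant: the caller owns the output buffer `P`, so the result
# vector is not reallocated on every call.
# `EOS1!` is a hypothetical name, not something from the original post.
function EOS1!(P, rho; c0=100, gamma=7, rho0=1000)
    b = (c0^2 * rho0) / gamma            # scalar prefactor, computed once
    @. P = b * ((rho / rho0)^gamma - 1)  # fused broadcast written into P
    return P
end

rho = collect(range(1000.0, 1010.0; length=1000))  # made-up test densities
P = similar(rho)                                   # allocate once, outside the hot loop

for _ in 1:10
    EOS1!(P, rho)                                  # reuses the same buffer each time
end
```

The trailing `!` follows the usual Julia convention for functions that mutate their first argument.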

edit:

If you use @. so that no dot gets forgotten and the whole expression fuses into a single loop, it gets even better:

julia> function EOS1_turbo(rho;c0=100,gamma=7,rho0=1000)
           b=(c0^2*rho0)/gamma
           P = @turbo @. b*((rho/rho0)^gamma - 1)
       end
EOS1_turbo (generic function with 1 method)

julia> @btime EOS1_turbo($rho);
  701.667 ns (1 allocation: 7.94 KiB)

(The hint that this was needed is that the previous version made 3 allocations, whereas only P should be newly allocated. So some dots were missing and intermediate vectors were being created. The previous version should have been: P = @turbo b .* ((rho ./ rho0).^gamma .- 1).)
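To see the same effect without any packages, here is a rough sketch that compares the bytes allocated by the partially dotted expression and the fully fused one using `@allocated` from Base. The function names and the test values are assumptions for illustration only, and the exact byte counts will depend on the Julia version and the input length.

```julia
# Partially dotted: `rho / rho0` and `b * (...)` are whole-array operations,
# so each creates its own temporary vector in addition to the fused
# `.^ gamma .- 1` step.
eos_unfused(rho, b, rho0, gamma) = b * ((rho / rho0) .^ gamma .- 1)

# Fully dotted: `@.` dots every call, so only the result vector is allocated.
eos_fused(rho, b, rho0, gamma) = @. b * ((rho / rho0)^gamma - 1)

rho = collect(range(1000.0, 1010.0; length=1000))  # made-up test densities
b, rho0, gamma = 1.0e7 / 7, 1000, 7

# Call each once first so compilation is not counted in the measurement.
eos_unfused(rho, b, rho0, gamma)
eos_fused(rho, b, rho0, gamma)

bytes_unfused = @allocated eos_unfused(rho, b, rho0, gamma)
bytes_fused   = @allocated eos_fused(rho, b, rho0, gamma)
println("unfused: ", bytes_unfused, " bytes, fused: ", bytes_fused, " bytes")
```

If the unfused number comes out at roughly three times the fused one, that matches the 3-versus-1 allocation counts reported by @btime above.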
