Should @FastPow not make this faster?

Hello everyone!

Suppose I have a set of parameters as:

ρ₀ = 1000; c₀ = 20; γ = 7; ρ = ρ₀ .+ ρ₀*rand(10^6)*0.1;

And I wish to evaluate the function:

function Pressure(ρ,c₀,γ,ρ₀)
    return ((c₀^2*ρ₀)/γ) * ((ρ/ρ₀)^γ - 1)
end

Benchmarking the code as follows gives:

@benchmark Pressure.($ρ,$Cb,$γ,$ρ₀)
BenchmarkTools.Trial: 707 samples with 1 evaluation.
 Range (min … max):  6.260 ms … 12.147 ms  ┊ GC (min … max): 0.00% …  0.00%
 Time  (median):     6.604 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.070 ms ±  1.258 ms  ┊ GC (mean ± σ):  5.35% ± 11.09%

  ▄▆██▇▅▄▃▁▁▁
  █████████████▇▅▄▆▆▆▆▅▆▁▄▁▄▄▁▁▁▁▁▁▁▄▁▁▁▁▁▁▄▁▄▁▆▇███▇▇▅▇▁▆▁▆ █
  6.26 ms      Histogram: log(frequency) by time     11.5 ms <

And using @fastpow:

@benchmark @fastpow Pressure.($ρ,$Cb,$γ,$ρ₀)
BenchmarkTools.Trial: 706 samples with 1 evaluation.
 Range (min … max):  6.279 ms … 13.528 ms  ┊ GC (min … max): 0.00% … 45.26%
 Time  (median):     6.621 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.083 ms ±  1.266 ms  ┊ GC (mean ± σ):  5.34% ± 11.02%

  ▁▆▇█▇▅▄▃▁▁                                        ▁
  ██████████▇▅▆█▆▇▆▄▆▆▄▄▄▄▁▁▁▁▄▁▁▁▁▁▄▁▁▁▁▁▁▄▁▁▁▁▄▇▇██▇█▇▆▄▆▄ █
  6.28 ms      Histogram: log(frequency) by time     11.4 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.

Which seems to give no difference in results. The exponentation happens since γ = 7.

Any apparent way to speed up this calculation?

Kind regards

That’s not how macros work. Macros rewrite unevaluated expressions into some other expressions. In particular, the @fastpow macro would rewrite expressions which contain calls to ^ into something else. But the raw expression Pressure.($ρ,$Cb,$γ,$ρ₀) doesn’t have any such calls, so there is no transformation to perform to start with. You can use the @macroexpand macro to see the effect of applying another macro.

However the README.MD file of the package shows that you can apply the macro to function definitions which contain calls to the power operator, which is likely what you want to do here.

1 Like

Thank you for explanation

I do as they state in documentation:

using FastPow
@fastpow function Pressure(ρ,c₀,γ,ρ₀)
    return ((c₀^2*ρ₀)/γ) * ((ρ/ρ₀)^γ - 1)
end

Then benchmark the code:

@benchmark Pressure.($ρ,$c₀,$γ,$ρ₀)
BenchmarkTools.Trial: 649 samples with 1 evaluation.
 Range (min … max):  6.287 ms … 16.706 ms  ┊ GC (min … max): 0.00% … 31.21%
 Time  (median):     7.064 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.696 ms ±  1.789 ms  ┊ GC (mean ± σ):  5.97% ± 11.83%

    █▄▄▁▄▁   
  ▅███████▇▆▄▃▂▃▂▃▂▂▃▂▂▁▂▂▂▂▁▂▂▂▁▂▂▂▂▃▃▃▃▃▃▃▃▂▃▂▂▂▂▃▂▂▁▂▁▂▁▂ ▃
  6.29 ms        Histogram: frequency by time        14.1 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.

Which still gives around 6 ms and no improvement. Am I missing something else I need to do for it to work?

Kind regards

I’m away from keyboard so I can’t test myself, but the README of the FastPow.jl package talks about literal integer powers, your exponent is a variable, not a literal integer. Again, I encourage you to use the @macroexpand macro to see the effect of applying macros.

3 Likes

Thank you!

That was the step I had misunderstood, I thought one just had to ensure it was integer input. By putting in “7” directly instead of gamma I get:

@benchmark Pressure.($ρ,$c₀,$γ,$ρ₀)
BenchmarkTools.Trial: 2510 samples with 1 evaluation.
 Range (min … max):  1.128 ms … 10.429 ms  ┊ GC (min … max):  0.00% …  0.00%
 Time  (median):     1.516 ms              ┊ GC (median):     0.00%
 Time  (mean ± σ):   1.987 ms ±  1.516 ms  ┊ GC (mean ± σ):  21.78% ± 20.62%

  ▁█▄▁▁▃   
  ██████▆▅▄▃▃▂▂▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▂▃▃▃▃▃▃▂▂▂ ▃
  1.13 ms        Histogram: frequency by time         7.1 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.

Which is 6 times faster and what I wanted :slight_smile: Sometimes gamma can vary (but still be Int) so I will see if I can get that to work some way

Kind regards

It is conceivable that @fastpow could work if the exponent was passed as a value-type? Or that could be combined with generated functions perhaps?

It does not work as it is:

julia> using FastPow

julia> f(x) = 1/x^13 - 1/x^7
f (generic function with 2 methods)

julia> f2(x) = @fastpow 1/x^13 - 1/x^7
f2 (generic function with 1 method)

julia> f3(x,::Val{N}) where N = @fastpow 1/x^(2N - 1) - 1/x^N
f3 (generic function with 1 method)

julia> v = rand(1000);

julia> f.(v) ≈ f2.(v) ≈ f3.(v,Val(7))
true

julia> @btime f.($rand(1000));
  13.027 μs (2 allocations: 15.88 KiB)

julia> @btime f2.($rand(1000)); # @fastpow doing its job
  2.223 μs (2 allocations: 15.88 KiB)

julia> @btime f3.($rand(1000),$Val(7));
  13.593 μs (5 allocations: 15.95 KiB)

Again, macros work on unevaluated expressions, it doesn’t make any difference if N is a variable with an integer value or a value type: to a macro that’s always the symbol N, not a literal integer:

julia> Meta.@dump 1/x^(2N - 1) - 1/x^N
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol -
    2: Expr
      head: Symbol call
      args: Array{Any}((3,))
        1: Symbol /
        2: Int64 1
        3: Expr
          head: Symbol call
          args: Array{Any}((3,))
            1: Symbol ^
            2: Symbol x
            3: Expr
              head: Symbol call
              args: Array{Any}((3,))
                1: Symbol -
                2: Expr
                3: Int64 1
    3: Expr
      head: Symbol call
      args: Array{Any}((3,))
        1: Symbol /
        2: Int64 1
        3: Expr
          head: Symbol call
          args: Array{Any}((3,))
            1: Symbol ^
            2: Symbol x
            3: Symbol N

You can make @fastpow work with variables by complicating its internals which would probably slow it down, which I presume is the reason why it wasn’t done.

1 Like

It works if you are willing to do a @generated function:

julia> @generated function f4(x,::Val{N}) where N 
           return quote
               @fastpow 1/x^(2*$N -1) - 1/x^$N
           end
       end

julia> @generated function f5(x,::Val{N1}, ::Val{N2}) where {N1,N2}
           return quote
               @fastpow 1/x^($N1) - 1/x^$N2
           end
       end

julia> v7 = Val(7)

julia> v13 = Val(13)

julia> @btime f4.($rand(1000),$v7);
  7.394 μs (2 allocations: 15.88 KiB)

julia> @btime f5.($rand(1000),$v13, $v7);
  1.898 μs (2 allocations: 15.88 KiB)
2 Likes