Specifying a set of algorithms

Suppose I have a function f and I need to specify which algorithms to use inside f (e.g. different optimization methods to combine).
What is the best (i.e. most performant) way to do that?

MWE:

A(x) = 2x
B(x) = 3x
C(x) = 4x

function f(algorithms, x)
    for alg in algorithms
        x = alg(x)
    end

    return x
end

f((A, B, C, A, B), 1.5)
2 Likes

I’ve been thinking about the same thing recently.

If you know that each alg has an in-place implementation, you might want to preallocate your result before the loop.

using BenchmarkTools
using LinearAlgebra

# Original
A(x) = 2x
B(x) = 3x
C(x) = 4x

function f(algorithms, x)
    for alg in algorithms
        x = alg(x)
    end

    return x
end

println("Original:")
@btime f($(A, B, C, A, B), $[1.5, 3.0])


# In-place
A!(x) = lmul!(2,x)
B!(x) = lmul!(3,x)
C!(x) = lmul!(4,x)

function finplace(algorithms, x)
    # Allocates once
    xwork = copy(x)
    
    for alg in algorithms
        alg(xwork)
    end

    return xwork
end

println("In-place:")
@btime finplace($(A!, B!, C!, A!, B!), $[1.5, 3.0])

This prints out

Original:
373.892 ns (5 allocations: 480 bytes)
In-place:
193.564 ns (1 allocation: 96 bytes)

Perhaps you have a MWE where this assumption fails?

1 Like

If you provide the exact number of functions, that is much, much faster and might
not allocate anything if the outputs of the functions are immutable:

julia> function f2(f1,f2,f3,f4,f5,x)
         return f5(f4(f3(f2(f1(x)))))
       end

julia> @btime f2(A,B,C,A,B,1.5)
  0.024 ns (0 allocations: 0 bytes)
216.0

# original here:

julia> @btime f((A,B,C,A,B),1.5)
  135.034 ns (6 allocations: 96 bytes)
216.0

I am sure there might be a smart way to write a macro for that, but I am not smart :neutral_face:
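One macro-free alternative is to recurse over the tuple of functions: since the tuple's length and element types are known at compile time, the compiler unrolls the chain for you. A minimal sketch (the helper name `applyall` is my own):

```julia
# Recursively apply each function in turn; the base case returns the
# accumulated value. The compiler unrolls this because the number of
# remaining functions is encoded in the tuple type.
applyall(x) = x
applyall(x, f, fs...) = applyall(f(x), fs...)

A(x) = 2x
B(x) = 3x
C(x) = 4x

applyall(1.5, A, B, C, A, B)  # == 216.0, same as the hand-written chain
```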

3 Likes

Preallocation also benefits greatly from @lmiq’s suggestion of passing the exact number of functions:

# Known number of functions

function f2(f1,f2,f3,f4,f5,x)
    return f5(f4(f3(f2(f1(x)))))
end

@btime f2($A,$B,$C,$A,$B,$[1.5, 3.0])

function f2inplace(f1,f2,f3,f4,f5,x)
    xwork = copy(x)
    f5(f4(f3(f2(f1(xwork)))))
    return xwork
end

@btime f2inplace($A!, $B!, $C!, $A!, $B!, $[1.5, 3.0])

which prints:

182.691 ns (5 allocations: 480 bytes)
42.684 ns (1 allocation: 96 bytes)

Isn’t this just function composition though?

Maybe something like this would be better:

julia> function f3(algs, x)
           g = reduce(∘, reverse(algs))
           return g(x)
       end
f3 (generic function with 1 method)

julia> @benchmark f((A,B,C,A,B), 1.5)
BenchmarkTools.Trial: 
  memory estimate:  96 bytes
  allocs estimate:  6
  --------------
  minimum time:     162.050 ns (0.00% GC)
  median time:      169.211 ns (0.00% GC)
  mean time:        180.256 ns (0.62% GC)
  maximum time:     1.107 μs (74.05% GC)
  --------------
  samples:          10000
  evals/sample:     755

julia> @benchmark f3((A,B,C,A,B), 1.5)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.490 ns (0.00% GC)
  median time:      1.497 ns (0.00% GC)
  mean time:        1.534 ns (0.00% GC)
  maximum time:     22.320 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

not quite as fast as @lmiq’s suggestion, but pretty fast if you don’t want to write a macro to compose your chain of functions

Though admittedly, Julia might be optimizing the compositions into a single multiplication, given how simple the input functions are.
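One way to check that suspicion (a sketch; I haven’t inspected the output myself):

```julia
using InteractiveUtils  # provides @code_llvm outside the REPL

A(x) = 2x; B(x) = 3x; C(x) = 4x
g = reduce(∘, reverse((A, B, C, A, B)))

# If the compositions have been folded into a single operation, the
# emitted IR should contain one fmul against a constant rather than five.
@code_llvm g(1.5)
```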

4 Likes

In this case it’s just function composition, but the question was a general one about passing in an unknown number of functions.

2 Likes

For example, calculate the minimum of f(x) for each f in the list of functions.

1 Like

Good point. Effectively, with slightly more complicated functions, they perform the same:

julia> A(x) = sin(x) ; B(x) = x^3 ; C(x) = log(x)
C (generic function with 1 method)

julia> @btime f2(A,B,C,A,B,1.5)
  9.885 ns (0 allocations: 0 bytes)
-4.260054977425002e-7

julia> @btime f3((A,B,C,A,B),1.5)
  9.874 ns (0 allocations: 0 bytes)
-4.260054977425002e-7


3 Likes

Why would composition not work for this? I am interested in the problem, and I may not be understanding the case exactly.

If evaluating each f is very costly, like solving an optimization problem, I guess that any overhead of the calling sequence will be irrelevant, no?

2 Likes

Something like

julia> A(x) = sin(x) ; B(x) = x^3 ; C(x) = log(x)
C (generic function with 1 method)

julia> function my_calculation(algorithms, x)
           return minimum(f(x) for f in algorithms)
       end
my_calculation (generic function with 1 method)

julia> my_calculation( (A, B, C), 10)
-0.5440211108893698

In my application the functions are relatively simple, but will be called many times.
You may be right that the overhead is probably not relevant. I guess I was mainly worried about type stability.

Yes, I see that. The composition option appears to be type stable.

julia> @code_warntype f3((A,B,C,A,B),1.5)
Variables
  #self#::Core.Compiler.Const(f3, false)
  algs::Core.Compiler.Const((A, B, C, A, B), false)
  x::Float64
  g::Base.var"#62#63"{Base.var"#62#63"{Base.var"#62#63"{Base.var"#62#63"{typeof(B),typeof(A)},typeof(C)},typeof(B)},typeof(A)}

Body::Float64
1 ─ %1 = Main.reverse(algs)::Core.Compiler.Const((B, A, C, B, A), false)
│        (g = Main.reduce(Main.:∘, %1))
│   %3 = (g)(x)::Float64
└──      return %3


If you’re passing in an N-tuple of functions, you could likely write a generated function that unrolls the loop over N, though I personally don’t understand generated functions well enough to be able to help write something like that.
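For what it’s worth, a sketch of what such a generated function could look like (the name `apply_chain` is my own, and this is just one way to write it):

```julia
# A @generated function that builds the unrolled call chain from the
# tuple length N, which is available in the type domain, so the loop
# below runs at compile time and the expanded body has no loop at all.
@generated function apply_chain(fs::Tuple{Vararg{Any,N}}, x) where {N}
    expr = :x
    for i in 1:N
        expr = :(fs[$i]($expr))
    end
    return expr  # e.g. for N = 5: fs[5](fs[4](fs[3](fs[2](fs[1](x)))))
end

A(x) = 2x; B(x) = 3x; C(x) = 4x
apply_chain((A, B, C, A, B), 1.5)  # == 216.0, same as f2(A, B, C, A, B, 1.5)
```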

2 Likes

Is x always a Number in your application?

Somewhat off topic, but if you are optimizing a function which assumes the minimum value of a set of functions at each point of the domain, you might be interested in https://link.springer.com/article/10.1007/s10898-008-9280-3

1 Like

If you use tuples and the approach above, type stability will depend on a
cutoff determined by the length of the tuple. For me, this cutoff is at
four elements in algorithms.

julia> using BenchmarkTools

julia> A(x) = sin(x) ; B(x) = x^3 ; C(x) = log(x); D(x) = 2x; E(x) = exp(x);

julia> function my_calculation(algorithms, x)
           return minimum(f(x) for f in algorithms)
       end

julia> @benchmark my_calculation( $(Ref((A, B, C)))[], $(Ref(10))[] )
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     22.688 ns (0.00% GC)
  median time:      22.724 ns (0.00% GC)
  mean time:        22.989 ns (0.00% GC)
  maximum time:     65.425 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     996

julia> @benchmark my_calculation( $(Ref((A, B, C, D, E)))[], $(Ref(10))[] )
BenchmarkTools.Trial:
  memory estimate:  288 bytes
  allocs estimate:  18
  --------------
  minimum time:     571.109 ns (0.00% GC)
  median time:      580.388 ns (0.00% GC)
  mean time:        609.541 ns (1.02% GC)
  maximum time:     7.012 μs (89.64% GC)
  --------------
  samples:          10000
  evals/sample:     183

To get type stability for more elements, you can use “lispy tuple programming”
similar to https://stackoverflow.com/a/55849398.

julia> function my_calculation_lispy(algorithms::NTuple{N,Any}, x) where {N}
           alg = first(algorithms)
           remaining_algorithms = Base.tail(algorithms)
           return min(alg(x), my_calculation_lispy(remaining_algorithms, x))
       end

julia> function my_calculation_lispy(algorithms::Tuple{Any}, x)
           return first(algorithms)(x)
       end

julia> my_calculation_lispy( (A, B, C, D, E), 10) ≈ my_calculation( (A, B, C, D, E), 10)
true

julia> @benchmark my_calculation_lispy( $(Ref((A, B, C)))[], $(Ref(10))[] )
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     22.466 ns (0.00% GC)
  median time:      22.520 ns (0.00% GC)
  mean time:        22.991 ns (0.00% GC)
  maximum time:     56.554 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     996

julia> @benchmark my_calculation_lispy( $(Ref((A, B, C, D, E)))[], $(Ref(10))[] )
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     42.918 ns (0.00% GC)
  median time:      43.007 ns (0.00% GC)
  mean time:        44.077 ns (0.00% GC)
  maximum time:     123.421 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     990
4 Likes