Programmatically generate a function akin to Python closure

Hello all,

I am implementing a step function. Currently I use the following function to generate an if/elseif expression, which I then splice into the body of a generated function.

struct TimePoint
    t
    v
end

function step(x, pts::Vector{TimePoint})
    pt_first = popfirst!(pts)   # note: mutates `pts`
    pt_last = pop!(pts)

    # Build a nested if/elseif chain: x < t_i  =>  return v_i
    orig_expr = Expr(:if, :($x < $(pt_first.t)), :(return $(pt_first.v)))
    expr = orig_expr
    for pt in pts
        push!(expr.args, Expr(:elseif, :($x < $(pt.t)), :(return $(pt.v))))
        expr = expr.args[end]   # descend into the innermost branch
    end
    push!(expr.args, :(return $(pt_last.v)))   # final else branch
    return orig_expr
end
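For reference, one way to turn such an expression into an actual callable is to splice it into an anonymous function and `eval` it. This is only a sketch, not from the original post: `make_step_fn` is a hypothetical helper name, and the `TimePoint`/`step` definitions are repeated so the snippet is self-contained.

```julia
struct TimePoint
    t
    v
end

# `step` as defined above: builds a nested if/elseif expression over `pts`
function step(x, pts::Vector{TimePoint})
    pt_first = popfirst!(pts)
    pt_last = pop!(pts)
    orig_expr = Expr(:if, :($x < $(pt_first.t)), :(return $(pt_first.v)))
    expr = orig_expr
    for pt in pts
        push!(expr.args, Expr(:elseif, :($x < $(pt.t)), :(return $(pt.v))))
        expr = expr.args[end]
    end
    push!(expr.args, :(return $(pt_last.v)))
    return orig_expr
end

# Hypothetical helper: splice the generated expression into an
# anonymous function and eval it into existence.
function make_step_fn(pts::Vector{TimePoint})
    body = step(:x, copy(pts))   # copy, because `step` mutates its argument
    return eval(:(x -> $body))
end

pts = [TimePoint(1, 10), TimePoint(2, 20), TimePoint(4, 30)]
f = make_step_fn(pts)
f(0), f(1.5), f(5)   # → (10, 20, 30)
```

Because `eval` defines the function at global scope, world-age rules apply: the result can only be called from a later top-level expression (or via `Base.invokelatest`).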

I would like to have a function/macro that behaves akin to a Python closure, so that I can parameterize the generated function for different step sizes and values. How should I proceed?

Is this what you want?

function makestep(ts, vs)
    # Should probably raise an error here if ts is not sorted.
    t -> let idx = findfirst(t .< ts)
        if isnothing(idx)
            idx = lastindex(vs)
        end
        vs[idx]
    end
end
using GLMakie
ts = cumsum(rand(10))
vs = rand(10)
step = makestep(ts, vs)
lines(step.(1:0.01:5))


That looks great! I ran a benchmark comparing the @generated version conc against conc2 = makestep(...). The results are the following:

julia> @btime conc(10)
  0.023 ns (0 allocations: 0 bytes)
2

julia> @btime conc2(10)
  73.117 ns (2 allocations: 128 bytes)
2

It looks like the expression expansion outperforms the function version. Would there be any way to achieve it with Expr?

Side comment: note that the fields of your TimePoint struct are of type Any (the default). This is a common performance gotcha which you should avoid: either specify concrete types or use type parameters:

struct TimePoint{T, V}
    t::T
    v::V
end
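A quick check (the values are just for illustration) that the parametric version gives concretely typed fields the compiler can specialize on:

```julia
# Parametric definition from above
struct TimePoint{T, V}
    t::T
    v::V
end

p = TimePoint(1.0, 2)
typeof(p)                                 # TimePoint{Float64, Int64} on a 64-bit system
isconcretetype(fieldtype(typeof(p), :t))  # true
```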

If your benchmark tells you that your function runs in less than a nanosecond, something has gone wrong. Unless you have a 40 GHz processor that somehow runs your function in a single cycle, or the compiler is able to fully evaluate your function at compile time, the number is clearly not right.

You can look into using the Ref trick for BenchmarkTools:

Sometimes, interpolating variables into very simple expressions can give the compiler more information than you intended, causing it to β€œcheat” the benchmark by hoisting the calculation out of the benchmark code
julia> a = 1; b = 2
2

julia> @btime $a + $b
0.024 ns (0 allocations: 0 bytes)
3
As a rule of thumb, if a benchmark reports that it took less than a nanosecond to perform, this hoisting probably occurred. You can avoid this by referencing and dereferencing the interpolated variables

julia> @btime $(Ref(a))[] + $(Ref(b))[]
1.277 ns (0 allocations: 0 bytes)
3


With increased sample sizes, the time is more reasonable now.

julia> @benchmark conc(data) setup=(data=rand(UInt8))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.572 ns … 19.083 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     1.622 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   1.719 ns Β±  0.593 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–ˆ                                                           
  β–ˆβ–β–β–β–β– ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▁
  1.68 ns        Histogram: frequency by time         1.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark conc2(data) setup=(data=rand(UInt8))
BenchmarkTools.Trial: 10000 samples with 965 evaluations.
 Range (min … max):   81.361 ns …  10.622 ΞΌs  β”Š GC (min … max):  0.00% … 98.22%
 Time  (median):      94.457 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   118.672 ns Β± 341.463 ns  β”Š GC (mean Β± Οƒ):  10.27% Β±  3.55%

  β–ˆ                                                              
  β–ˆβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– ▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁▁▁▁▁▁  ▁▁▁ ▁
  117 ns           Histogram: frequency by time         83.8 ns <

 Memory estimate: 128 bytes, allocs estimate: 2.

I’m not much good at performance tuning, but I suspect the problem is those 2 allocations. They are probably the result of `t .< ts` being materialized (there are tools for digging into this), so switching to a simple loop might be faster, something like

function makestep(ts, vs)
    function step(t)
        for idx in eachindex(ts)
            if t < ts[idx]
                return vs[idx]
            end
        end
        return vs[end]
    end
end

On my machine (Julia 1.6.2 on macOS 11) an explicit return statement is needed:

function makestep(ts::Vector{T}, vs::Vector{V}) where {T, V}
    return function step(t)
        for idx in eachindex(ts)
            if t < ts[idx]
                return vs[idx]
            end
        end
        return vs[end]
    end
end
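A quick usage check of this closure version (the example values are hypothetical; the definition is repeated so the snippet is self-contained):

```julia
function makestep(ts::Vector{T}, vs::Vector{V}) where {T, V}
    return function step(t)
        for idx in eachindex(ts)
            if t < ts[idx]
                return vs[idx]   # value of the first interval with t < ts[idx]
            end
        end
        return vs[end]           # t is past the last breakpoint
    end
end

f = makestep([1.0, 2.0, 4.0], [10, 20, 30])
f(0.5), f(1.5), f(5.0)   # → (10, 20, 30)
```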

The performance has indeed improved:

julia> @benchmark conc3(data) setup=(data=rand(UInt8))
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
 Range (min … max):  18.114 ns … 57.385 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     19.534 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   19.806 ns Β±  1.956 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–ˆ                                                            
  β–ˆβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–  ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▁
  20.6 ns         Histogram: frequency by time        18.8 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

I ended up using ConstantInterpolation from DataInterpolations.jl.

The performance is comparable.

julia> conc4 = ConstantInterpolation([1, 2, 4, 8], [10, 20, 30, 40], dir=:right)
8-element ConstantInterpolation{Vector{Int64}, Vector{Int64}, Symbol, true, Int64}:
  1
  2
  4
  8
 10
 20
 30
 40

julia> @benchmark conc4(data) setup=(data=rand(UInt8))
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
 Range (min … max):  20.411 ns … 111.894 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     21.641 ns               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   23.268 ns Β±   5.258 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–…β–„β–ˆβ–‡β–…β–‚β–ƒβ–             ▁ ▁▂ ▂▁                                 β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–†β–†β–†β–ˆβ–‡β–†β–‡β–ˆβ–†β–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–…β–‡β–‡β–‡β–†β–‡β–‡β–‡β–†β–‡β–‡β–‡β–‡β–†β–‡β–†β–†β–‡β–‡β–†β–„β–β–…β–…β–„β–…β–…β–…β–„β–… β–ˆ
  20.4 ns       Histogram: log(frequency) by time      47.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.