Codegen woes

Nice, I hadn’t seen FunctionWrappers before. A quick google search leads to this package, last updated around a year ago. Is this the right place to get it?

Any comparisons with @ChrisRackauckas’s PR in terms of use cases, predictable performance, etc?


They both kinda do the same “trick” with cfunction and ccall. I haven’t done any real comparisons so can’t comment on that.


This has been a difficulty anyway. Two things have helped work around this:

  • I’m only using eval to build a function, so there’s some encapsulation of the scope anyway
  • Any user dependencies are required to be passed in as arguments. Unfortunately, there’s currently no possibility of closures. This is a little bit of an annoyance, but makes reasoning about things statically much easier.

I have sometimes used a trick of passing around a ctx (for “context”) mapping, usually implemented as a Dict or NamedTuple. This lets you fake global scope without having to do anything too fancy. I might at some point make this accessible to users, though I’m not sure yet what that interface would look like.
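As a rough sketch of that ctx trick (the names `body`, `scale`, `offset` here are made up for illustration, not from any actual package): all would-be captured state is bundled into one NamedTuple argument, so the eval-built function needs no closure.

```julia
# Hypothetical sketch of the ctx ("context") trick: instead of a closure,
# the eval-built function takes a single NamedTuple argument carrying
# everything it would otherwise have captured from surrounding scope.
body = :(ctx.scale * ctx.x + ctx.offset)
f = eval(:(ctx -> $body))

ctx = (scale = 2.0, x = 3.0, offset = 1.0)

# invokelatest sidesteps world-age issues when calling an eval'd function
Base.invokelatest(f, ctx)  # 7.0
```

Because the NamedTuple's field types are part of its type, the call is still fully inferable despite the fake "global" scope.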

@ChrisRackauckas told me at JuliaCon that he doesn’t like FunctionWrapper, but it wasn’t clear to me what his issue with it was. Please elaborate; since I am going to implement something similar for Leibniz.jl, this is an issue that affects me as well.


FunctionWrappers requires that you put a type wrapper around the inputs and outputs. It is fine, but a bit difficult to use in some cases, and it has a noticeable performance overhead vs. a standard call (though this is much less in 1.0+ than it was in v0.6). I think it’s easier to just invoke in a way where you say what the return type is than to wrap every function in a type. I wonder if there is an overhead difference.
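For concreteness, the type wrapper looks something like this (a minimal sketch using FunctionWrappers.jl; the `Integrator` struct is a made-up example, not from the package):

```julia
using FunctionWrappers: FunctionWrapper

# The wrapper is parameterized by the return type and a Tuple of argument
# types, so the compiler sees one fixed signature no matter which function
# is stored inside.
fw = FunctionWrapper{Float64,Tuple{Float64}}(sin)
fw(1.0) == sin(1.0)  # true

# This lets you store interchangeable functions in a concretely-typed
# field without type instability:
struct Integrator
    step::FunctionWrapper{Float64,Tuple{Float64}}
end
Integrator(FunctionWrapper{Float64,Tuple{Float64}}(cos)).step(0.0)  # 1.0
```

The cost is the cfunction/ccall indirection on every call, which is the overhead being discussed above.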


I’d be very interested in a benchmark on this.


I think @thautwarm is very close to solving this problem once and for all with GG.jl.

Here’s an example in Soss which uses GG:

julia> m = @model begin
           μ ~ Normal()
           x ~ Normal(μ, 1)
       end
@model begin
        μ ~ Normal()
        x ~ Normal(μ, 1)
    end

julia> @btime rand(m)
  24.240 ns (1 allocation: 32 bytes)
(μ = 1.5394913244089232, x = 0.53374410189993)

It’s even faster if the model is const (still figuring out how to avoid the need for this):

julia> const m2 = m
@model begin
        μ ~ Normal()
        x ~ Normal(μ, 1)
    end

julia> @btime rand(m2)
  10.610 ns (0 allocations: 0 bytes)
(μ = 2.0368121003101187, x = 0.06231985426467612)

Here’s the hand-written version for comparison:

julia> function f()
           μ = rand(Normal())
           x = rand(Normal(μ, 1.0))
           (μ = μ, x = x)
       end
f (generic function with 1 method)

julia> @btime f()
  10.439 ns (0 allocations: 0 bytes)
(μ = -0.5998600215661466, x = -1.0367247568633182)

After some setup, the implementation can be very short:

export rand
@generated function rand(m::Model{T} where T)
    type2model(m) |> sourceRand
end

where sourceRand generates what you’d expect,

julia> sourceRand(m)
quote
    μ = rand(Normal())
    x = rand(Normal(μ, 1))
    (μ = μ, x = x)
end
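Stripped of the Soss specifics, the pattern is roughly this (`Model`, `modelcode`, and `draw` are illustrative stand-ins, not the real Soss API): the model structure lives in a type parameter, so a `@generated` function can rebuild source code from it at compile time.

```julia
# Minimal sketch: encode the model in the *type parameter*, then let a
# @generated function splice in code built from that type, so there is
# no runtime interpretation cost.
struct Model{T} end

# stands in for type2model/sourceRand: map the type to an expression
modelcode(::Type{Model{:demo}}) = quote
    μ = rand()
    x = μ + rand()
    (μ = μ, x = x)
end

@generated function draw(m::Model{T}) where {T}
    modelcode(m)   # inside @generated, m is the *type*, available at compile time
end

draw(Model{:demo}())  # returns a NamedTuple (μ = ..., x = ...)
```

After the first compilation for a given `Model{T}`, calling `draw` is as cheap as a hand-written function with the same body, which is what the benchmarks above are showing.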

I’d guess that the const issue is just a benchmarking problem - does

@btime rand($m)

give you the faster timings?

You may be right. I’m in the middle of making some other changes now, but IIRC that does fix it. But having a model defined at the global level is a common use case, and generated functions like this will be in inner loops, so I need to be sure I understand what’s going on and that there won’t be a real performance hit.

I don’t think you need to worry about things like that; you take a hit on the first function call because of the dynamic dispatch (the compiler doesn’t know what type the global variable is), but after that it doesn’t matter. Moreover, the dynamic dispatch is (largely?) a constant overhead, so even though your example shows a 2x slowdown, it’s really a constant cost rather than a multiplier. This can have a big impact when benchmarking small functions, but that’s largely it.

A quick example

julia> a = 0.5
0.5

julia> g(a) = cos(a)
g (generic function with 1 method)

julia> h(a) = (cos(a), cos(2a))  # twice the work
h (generic function with 1 method)

julia> @btime g(a)
  23.247 ns (1 allocation: 16 bytes)
0.8775825618903728

julia> @btime g($a)
  7.216 ns (0 allocations: 0 bytes)
0.8775825618903728

julia> @btime h(a)  # twice the work but not twice the time; more like time of g(a) + g($a)
  31.720 ns (1 allocation: 32 bytes)
(0.8775825618903728, 0.5403023058681398)

This is a great example, and really helpful. Thank you!