Performance optimizations with runtime dispatch

aramend · May 30, 2024, 8:01pm

I studied your code some more and wrote a way to get the polish notation into the type system as well. The difference is that I can also maintain the output of every node. The overall weakness of these approaches seems to be a quickly growing compile time, 16 nodes already take 70ms on my machine just for the compilation, which mostly offsets the crazy speeds you can get later.

The code:

struct ProgEnd end
struct Term{next,value,ix} end
struct Variable{next,name,ix} end
struct Unary{next,op,ix,childix} end
struct Binary{next,op,ix,lchildix,rchildix} end


# binary tree:     exp(x) + y
# polish notation: + exp x y
pend = ProgEnd()
p = Binary{pend,+,1,2,4}()
e = Unary{p,exp,2,3}()
v2 = Variable{e,:y,4}()
v1 = Variable{v2,:x,3}()

function evalexpr!(::Vector{Float32}, ::ProgEnd, subs)
end

function evalexpr!(fwds::Vector{Float32}, ::Binary{next,op,ix,lix,rix}, subs) where {next,op,ix,lix,rix}
    @inbounds fwds[ix] = op(fwds[lix], fwds[rix])
    evalexpr!(fwds, next, subs)
end

function evalexpr!(fwds::Vector{Float32}, ::Unary{next,op,ix,lix}, subs) where {next,op,ix,lix}
    @inbounds fwds[ix] = op(fwds[lix])
    evalexpr!(fwds, next, subs)
end

function evalexpr!(fwds::Vector{Float32}, ::Term{next,value,ix}, subs) where {next,value,ix,}
    @inbounds fwds[ix] = value
    evalexpr!(fwds, next, subs)
end

function evalexpr!(fwds::Vector{Float32}, ::Variable{next,name,ix}, subs::NT) where {next,name,ix,NT<:NamedTuple}
    @inbounds fwds[ix] = subs[name]
    evalexpr!(fwds, next, subs)
end

fwds = zeros(Float32, 4)
data = (; x=2, y=3)
evalexpr!(fwds, v1, data)
# on my machine, 3ms to compile, 10ns to evaluate

It’s easy to get AD in there as well, but that adds another 100ms to compilation times! For now, I think this is as good as it gets. I’d love to know more julia internals, the compiled output for math expressions is so simple that I feel I should be able to jump in at some levels lower than julia code, and in that manner avoid some of the compilation overhead.

Topic		Replies	Views
Why is this code run-time dispatch/slow? Performance	16	889	September 25, 2020
Performance discrepancy for different ways of piping/composing functions for handling runtime dispatch Performance function , runtime-dispatch	9	196	June 13, 2025
How slow is runtime dispatch, anyway? Benchmark attempts Performance	8	330	January 15, 2025
Dangers of abusing multiple dispatch: missing optimization? General Usage	1	612	April 14, 2017
Dispatch on argument values rather than (just) types? Performance	5	2910	December 26, 2017

Performance optimizations with runtime dispatch

Related topics