Invokelatest() makes code run slower


When I upgraded Fatou from julia 0.5 to 0.6, I needed to work around the new world-age counter in the julia language. I was able to make Fatou functional in 0.6 by calling the new invokelatest() function wherever newly generated code has to be run.

However, this change roughly doubles the computing time.

This can be easily demonstrated by running the Pkg.test("Fatou") command:

In julia 0.6

julia> Pkg.test("Fatou")
INFO: Testing Fatou
Fatou detected 4 julia threads.
  0.680028 seconds (1.27 M allocations: 45.425 MiB, 4.22% gc time)
  0.325977 seconds (1.17 M allocations: 34.818 MiB, 2.54% gc time)
  0.348296 seconds (1.17 M allocations: 35.009 MiB, 4.45% gc time)
INFO: Fatou tests passed

However, in julia 0.5.2 the code is able to run faster:

julia> Pkg.test("Fatou")
INFO: Testing Fatou
Fatou detected 4 julia threads.
  0.313702 seconds (566.66 k allocations: 15.841 MB, 3.38% gc time)
  0.165542 seconds (913.34 k allocations: 24.549 MB, 4.70% gc time)
  0.264221 seconds (629.58 k allocations: 17.135 MB, 3.39% gc time)
INFO: Fatou tests passed

Is this a natural consequence of the world-age counter and invokelatest? Is this something whose performance can still be improved within the julia language? Is there something I am overlooking on my end that would help me speed it up again?




So there is no way that the julia language could (theoretically) accept more specific information about the types in the new function, in order to speed up such a function call? The programmer might know this information; the julia language would just have to accept it in order to anticipate the types correctly.


You could manually add type-assertions that allow Julia to make stronger assumptions about what the new versions of the function may return.


What source files in the julia code base would I need to look at if I wanted to try to implement this?


Your own code?


Also why do you claim it’s the invokelatest that’s causing the issue?


As @mbauman pointed out, using the invokelatest feature in this way does cause the code to slow down, since the type information cannot be inferred. I’d like to fix this in my own code, but as far as I know, the julia language does not yet let me provide the required type information when calling the invokelatest function. So if julia can’t provide this feature yet, someone needs to implement it, right?


Well, it’ll slow down compared to if you didn’t do runtime code generation. It won’t be slower than what you would otherwise get on <=0.5

It does by


Have a look here at my source code: src/Fatou.jl

(sym2fun(invokelatest(K.Q,Sym(:a),Sym(:b)),:(Complex{Float64})) |> eval)::Function

I definitely provide the type information there, Complex{Float64}, which is an argument to my sym2fun function that generates the code that gets evaluated.

Using sym2fun, defined in src/internals.jl, I build the function expression that I need to run. It accepts a SymPy.Sym and a type as arguments.

sym2fun(expr,typ) = Expr(:function, Expr(:call, gensym(),
        map(s->Expr(:(::),s,typ),sort!(Symbol.(free_symbols(expr))))..., Expr(:(...),:zargs)),

The argument I feed into this is invokelatest(K.Q,Sym(:a),Sym(:b)), which plugs SymPy symbols into the arguments of K.Q so that the julia expression can be constructed using the correct type information.
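For readers without SymPy installed, the same idea can be sketched in isolation. This is only an illustration under assumptions: make_fun and build_and_call are hypothetical names, and a plain julia expression stands in for the output of K.Q, so it is not the actual Fatou code.

```julia
# Hypothetical, SymPy-free analogue of sym2fun: wrap the expression `body`
# in a gensym-named function whose arguments `args` are all annotated `typ`.
make_fun(body, args::Vector{Symbol}, typ) =
    Expr(:function,
         Expr(:call, gensym(), map(s -> Expr(:(::), s, typ), sort(args))...),
         body)

# A method created by eval inside a running function is newer than the
# caller's world, so it has to be reached through invokelatest.
function build_and_call(z::Complex{Float64}, c::Complex{Float64})
    h = eval(make_fun(:(a^2 + b), [:b, :a], :(Complex{Float64})))::Function
    # The argument annotations inside `h` do not tell the caller what
    # invokelatest returns, hence the assertion on the call itself.
    return Base.invokelatest(h, z, c)::Complex{Float64}
end

println(build_and_call(2.0 + 0.0im, 1.0 + 0.0im))
```

The annotations on a and b only constrain which arguments the generated method accepts; they say nothing to the caller about the type coming back out of invokelatest.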

The function h defined like this is then used in another function called nf

function nf(z0::Complex{Float64})::Tuple{UInt8,Complex{Float64}}
        K.mandel ? (z = K.seed) : (z = z0); zn = 0x00
        while (K.newt ? (h(z,z0)::Float64>K.ϵ)::Bool : (h(z,z0)::Float64<K.ϵ))::Bool && K.N>zn
            z = f(z,z0)::Complex{Float64}; zn+=0x01
        end
        # return the iteration count and the final value of z
        return (zn::UInt8,z::Complex{Float64})::Tuple{UInt8,Complex{Float64}}
end

Then I evaluate this function and use it in the main loop

@time @threads for j = 1:length(y); for k = 1:length(x);
    (matU[j,k],matF[j,k]) = invokelatest(nf,Z[j,k]); end; end

As you can see, I have provided the necessary type information, which resulted in fast code in julia 0.5; however, the code became slower in julia 0.6 merely by introducing invokelatest.

How am I supposed to provide this type information in julia 0.6 then, if it is possible?


I know this is unintuitive, but the type information you are providing is useless. The compiler can figure those types out perfectly fine (assuming the code is properly generated). Even if the compiler couldn’t figure out this info by itself, these uses of type assertions are as useful in 0.6+ with invokelatest as they were in <=0.5.

The type info that the compiler can’t figure out (and it can’t in either case) is the return type of the invokelatest call. You just need a type assertion there. There should be no other difference.

That’s why I asked if you know the issue is caused by invokelatest since that shouldn’t be the obvious issue.


Yes, that makes sense, thanks for clarifying. Previously, having the type assertion at the return value of the function was sufficient to provide the output type. Now it is necessary to assert it at the function call as well:

(matU[j,k],matF[j,k]) = invokelatest(nf,Z[j,k])::Tuple{UInt8,Complex{Float64}}

and this change brings the performance back to roughly the 0.5 level.

julia> Pkg.test("Fatou")
INFO: Testing Fatou
Fatou detected 4 julia threads.
  0.397973 seconds (873.64 k allocations: 26.022 MiB, 2.88% gc time)
  0.180062 seconds (1.01 M allocations: 27.105 MiB, 4.29% gc time)
  0.185982 seconds (984.80 k allocations: 26.750 MiB, 4.13% gc time)
INFO: Fatou tests passed
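The effect of the call-site assertion can also be checked in isolation by asking inference what it knows about a wrapper with and without it. The following is a minimal sketch, not Fatou code: g, unasserted and asserted are hypothetical stand-ins, and the exact printed form may differ between julia versions.

```julia
# Hypothetical stand-in for a function that was generated at runtime.
g(z::Complex{Float64}) = (0x01, z * z)

# No assertion: callers learn nothing about the result of invokelatest.
unasserted(z) = Base.invokelatest(g, z)

# Assertion at the call site: callers can rely on a concrete tuple type.
asserted(z) = Base.invokelatest(g, z)::Tuple{UInt8,Complex{Float64}}

argtypes = Tuple{Complex{Float64}}
println(Base.return_types(unasserted, argtypes))
println(Base.return_types(asserted, argtypes))
```

The second line of output should report the asserted tuple type, while the first is expected to be wider (typically Any), which is what forces dynamic dispatch on everything done with the result inside the loop.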

Also, I realize that some of my extra type assertions are useless, but they help me as a programmer to think precisely through the data flow of my program, so it doesn’t hurt to have them in there.

Well, that does solve that issue then.


It’s mostly personal taste, but I should say that extra type assertions sometimes make the code hard to read simply because they are distracting.
That said, you know how to write code that you can most easily read, and yes, those don’t hurt as far as runtime performance is concerned (the compiler will need to optimize them out, but that’s usually a pretty cheap transformation).


Gotcha, yeah, it’s not my preferred taste either for most of the programming I typically do. Could you elaborate on why the compiler needs to optimize the unnecessary type assertions out?


All constructs in the code have a defined behavior, so the compiler/runtime needs to do exactly that. Of course, if the compiler can predict what a construct does, it can maybe replace it with something simpler. Basically, the more complicated the code you write, the more work the compiler has to do to reduce it, nothing more complicated than that. It’s not really relevant, and I’m just saying that 0 runtime cost != 0 cost.