Minimal example for highlighting type instability's impact on performance

Hey all,

I am updating some old material from a Julia course I’ve given in the past (Zero2Hero intensive Julia workshop - YouTube) . Due to the huge progress of Julia release after release, I am struggling to find a simple example that would highlight type instability and the impact it has on performance.

I used to use this function:

@noinline function mymax(init, x)
    s = init
    for xi in x
        s = s < xi ? xi : s
    end
    return s
end

# benchmark it
using BenchmarkTools: @btime
r = rand(Int, 1000);
@btime mymax(-Inf32, $r); #   4.486 μs (0 allocations: 0 bytes)
@btime mymax(typemin(Int), $r);  #   55.635 ns (0 allocations: 0 bytes)

however, @code_warntype no longer includes red text, only yellow. Additionally, I have the suspicion that the performance difference does not come from type instability, but from conversions between 32 and 64 bits. Because, this

@btime mymax(-Inf, $r);    781.731 ns (0 allocations: 0 bytes)

has better performance.

So, the question is, does anyone have a very simple example highlighting the problems of type isntability, that can be showcased to people just starting with Julia and only know the most basics about the Type system? Ideally something where code_warntype has red text?

julia> function mycollect(r)
           v = []
           for i in r
               push!(v, i)
           end
           v
       end
mycollect (generic function with 1 method)

julia> function mycollect2(r)
           v = eltype(r)[]
           for i in r
               push!(v, i)
           end
           v
       end
mycollect2 (generic function with 1 method)

julia> @btime mycollect(1:10_000);
  148.492 μs (9498 allocations: 474.80 KiB)

julia> @btime mycollect2(1:10_000);
  75.642 μs (9 allocations: 326.55 KiB)

Although, admittedly, the instability is a bit hidden here.

But this example has much less of a performance difference between the stable and “unstable” version of the functions that are in my example code, and the @code_warntype macro does not display any instability. I was hoping to find a minimal example that is nevertheless truly “dramatic”.

Another example:

julia> rectifier(x) = x > 1 ? x : 0
rectifier (generic function with 1 method)

julia> v = Float64[-1000:1000;];

julia> @btime map(rectifier, $v);
  12.395 μs (1001 allocations: 47.23 KiB)

julia> rectifier2(x) = x > 1 ? x : zero(x)
rectifier2 (generic function with 1 method)

julia> @btime map(rectifier2, $v);
  1.315 μs (1 allocation: 15.81 KiB)
3 Likes

Perhaps a little convoluted, but has a big performance impact:

F = [sin,cos]
function my_fun_1(x,F)
    F[1](x) + F[2](x)
end

function my_fun_2(x)
    sin(x) + cos(x)
end

@btime my_fun_1(1.0,$F) # 62.004 ns (5 allocations: 80 bytes)
@btime my_fun_2(1.0) # 1.080 ns (0 allocations: 0 bytes)
2 Likes

thanks, this is also very good!

That’s the first example that has red text in the code warntype! Seems great! But also a bit tricky because a beginner will take some time to realize that in in Julia “functions are their own types”…

1 Like

I used this example in a company-internal presentation:

function f(x)
    s = 0
    for i in eachindex(x)
        s += x[i]
    end
    return s
end

and compared feeding it with

x1 = convert(Vector{Any}, rand(Int, 1000000));
x2 = rand(Int, 1000000);

This showed two orders of magnitude difference in @btime and instructive red lines in @code_warntype f(x1).

And yes, x1 is slightly contrived, but not unreasonable.

3 Likes

Thanks. This is a good case. One way a beginner may initialize a vector of Anys that only has integers is by doing the pythonic

x = []
push!(x, number)
push!(x, number)
...
2 Likes

Just a note, but if you have a Julia course, be sure to mention that type instability doesn’t always harm performance, it can, in fact, also be used for improving performance. If you have a demanding task it might make sense to move some of the infrequently-changing data into the type domain.

An example: say you’ve got an arithmetic expression of arbitrary structure, and you know you’ll want to evaluate it millions of times while changing some numeric values, but keeping the expression tree structure constant. It could make sense to move the tree structure into the type domain, at least partially, in such a way as to allow Julia to compile efficient code for evaluating the expression. The performance difference is basically the difference between interpretation and executing machine code, which is obviously a huge win, as long as the cost of the type instability, run-time dispatch and compilation is amortized.