Type instability in a simple function

I try as much as possible to remove type instabilities, as suggested in Julia’s Performance Tips.
I wasn’t able to find the trick on this function, although it’s quite simple.

function parameters()
    TARGET = Int[88, 77, 65, 83]
    asc, desc, static = 4:7, 4:-1:1, fill(4, 4)
    partial_indices = Iterators.take(Iterators.product((asc, desc, static), (asc, desc, static)), 8)
    INDICES = [CartesianIndex.(collect(zip(x, y))) for (x, y) in partial_indices]
    return INDICES
end

I believe the culprit is around the zip function, I’ve tried a few alternatives with no avail.

Probably a Julia wizard will spot the issue much faster than myself :slight_smile:

If I remember correctly, these are all different types, so Iterators.product might end up having a weirdly difficult time here.

1 Like

They are of different types:

julia> typeof(4:7)
UnitRange{Int64}

julia> typeof(fill(4,4))
Vector{Int64} (alias for Array{Int64, 1})

julia> typeof(4:-1:1)
StepRange{Int64, Int64}

However, if I remove the last line and only collect Iterators.product, there is no longer any type instabilities. So it seems the compiler is able to solve this issue.

Try to replace zip by a map with multiple collections:

map(f, x, y)

It tends to have better type inference properties.

1 Like

I think any loop which handles these different-type vectors in succession will be type-unstable. You can still speed it up by being more explicit, and IMO it’s much more readable to write for loops than this nested stack of iterators & broadcasting.

But if the loop is over exactly 8 things, you can also just write them all out. Again much more readable, hardly more code. This is type-stable and much quicker. (You could generate the same code via metaprogramming, e.g. with Base.Cartesian.@nexprs or some @unroll package, but for 8 things it’s unlikely your reader will thank you.)

function loop_param()  # Not type-stable, but quicker and much more readable
    asc = 4:7
    desc = 4:-1:1
    static = fill(4, 4)
    # out = [Tuple{Int,Int}[] for _ in 1:8]  # option 1
    out = [[(0,0) for _ in 1:4] for _ in 1:8]  # option 2
    i = 1
    for coll1 in (asc, desc, static), coll2 in (asc, desc, static)
        # append!(out[i], zip(coll1, coll2))  # option 1
        copyto!(out[i], zip(coll1, coll2))  # option 2
        i += 1
        i > length(out) && break
    end
    out
end

function noloop_param()  # This is type-stable, and much quicker
    asc = 4:7
    desc = 4:-1:1
    static = fill(4, 4)
    [
      collect(zip(asc, asc)),
      collect(zip(asc, desc)),
      collect(zip(asc, static)),
      collect(zip(desc, asc)),
      collect(zip(desc, desc)),
      collect(zip(desc, static)),
      collect(zip(static, asc)),
      collect(zip(static, desc)),  # only 8 things. I wasn't careful about the order
    ]
end
1 Like

The compiler (and type inference calls) actually manages to infer the concrete types of everything prior to INDICES = and the 1st element’s type. If that’s all collect(itr:Generator) needed to start filling a vector, then we’d have type stability. However, it also branches for empty vectors and more poorly inferrable comprehensions’ element type-widening, and those end up with different inferred vector types. The compiler may infer the same type or statically avoid those branches if partial_indices wasn’t mixing a bunch of different Int vector types; try asc, desc, static = collect.((4:7, 4:-1:1, fill(4, 4) )) to see type stability.

The 2nd (x,y) onwards is inferred as the abstract Tuple{AbstractVector{Int64}, AbstractVector{Int64}}, and an abstract input with concrete element types is ironically not enough to infer the elements of zip(x,y) are Tuple{Int64, Int64}. I can’t rule out the possibility of that changing with some zip helper methods (start here) matching static parameters to those concrete element types, but I also don’t know if it’s strictly valid or feasible to assume iterate-ing AbstractVector{T} provides elements of T or eltype(...). A counterexample is trivial to implement, and code_warntype(iterate, (AbstractVector{Int},)) shows that assumption is not implemented.

Technically you don’t need type stability here for good performance. parameters() has no inputs and appears not to use any global state, so you can just call it once and cache it, whether directly in the source code or in a file. A properly typed global variable would confer downstream type stability. In the more general case with inputs, function barriers or manual type assertions can isolate the type instability from performance-critical code (note that simply specifying the output array’s type does not evade the type inference problems described above). If the comprehension element expression is part of the critical code, that’s where you’ll want barriers or assertions; I also expect loopier refactoring could help isolate the zip step and cut down on allocations.

2 Likes