Allocations when manipulating an iterator


I distilled a significant performance issue in JuMP 0.19-alpha ( to the following standalone example:

using BenchmarkTools

struct FlippedDictIterator{K,V}

reorder_iterator(::Nothing) = nothing
reorder_iterator(p::Pair, state::Int) = ((p.second, p.first), state)
reorder_iterator(x) = reorder_iterator(x...)

Base.iterate(fi::FlippedDictIterator) = reorder_iterator(iterate(fi.d))
Base.iterate(fi::FlippedDictIterator, state) = reorder_iterator(iterate(fi.d, state))

# The behavior isn't affected by these 4 lines, but I thought they could help.
Base.IteratorSize(fi::FlippedDictIterator) = Base.IteratorSize(fi.d)
Base.length(fi::FlippedDictIterator) = length(fi.d)
Base.IteratorEltype(fi::FlippedDictIterator) = Base.IteratorEltype(fi.d)
Base.eltype(fi::FlippedDictIterator{K,V}) where {K,V} = Tuple{V, K} 

function loop1(d)
    s = 0
    for (key, val) in d
        s += key * length(val)
    return s

function loop2(d)
    s = 0
    for (val, key) in FlippedDictIterator(d)
        s += key * length(val)
    return s

d = Dict{Int, String}()
for i in 1:1000
    d[i] = "test"

@show loop1(d)
@show loop2(d)

@btime loop1($d)
@btime loop2($d)

Yields (on Julia 1.0.1):

loop1(d) = 2002000
loop2(d) = 2002000
  14.323 μs (0 allocations: 0 bytes)
  20.913 μs (2000 allocations: 62.50 KiB)

Why are there two allocations per call to iterate(fi::FlippedDictIterator, state)?


The optimiser has to chase nothing through too many levels: try

julia> Base.iterate(fi::FlippedDictIterator, state) = let ϕ = iterate(fi.d, state); if !( ϕ === nothing ) return reverse(ϕ[1]), ϕ[2] else return nothing end end

julia> Base.iterate(fi::FlippedDictIterator) = let ϕ = iterate(fi.d); if !( ϕ === nothing ) return reverse(ϕ[1]), ϕ[2] else return nothing end end


That worked. Thanks!


Can you please describe how you found this out? It would be useful to learn to do this systematically.



Uh, unfortunately that was more intuition than science, but the gist is to bail early when an iterator returns nothing and not to delay handling of nothings to later instances.


Currently, my way of dealing with these things is to make small functions, stare at @code_warntype, and try patterns I have seen in Base; I was hoping I would learn something more organized :wink: