How can you dispatch on Iterator eltype including generators?


#1

How can you dispatch on Iterator eltype including generators? I ask in an attempt to fix the following interface problem.

Right now I am working on the get function of an interface that has many different types to dispatch on, and potentially a lot of initial overhead for each get. Something like:

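struct InterfaceIndirect
    index::Int64
end

# Assumed placeholder so the methods below can be defined; the real type is not shown.
struct AnotherInterfaceIndirect
    index::Int64
end

struct ApplicationIndirect
    ref::InterfaceIndirect
end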

function interface_get(indirect::InterfaceIndirect)
    #lots of prework which makes it impractical to broadcast to this function
    prework=rand(100_000_000) 
    return prework[indirect.index]
end

function interface_get(indirect::AnotherInterfaceIndirect)
    #same kind of thing as above but different
end

function interface_get(v_indirect::Vector{InterfaceIndirect})
    #lots of prework
    prework=rand(100_000_000)
    return [ prework[indirect.index] for indirect in v_indirect ]
end

function interface_get(v_indirect::Vector{AnotherInterfaceIndirect})
    #ditto
end

function application_get(app_indirect::ApplicationIndirect)
    interface_get(app_indirect.ref)
end

function application_get(v_app_indirect::Vector{ApplicationIndirect})
    ###This temporary array sucks####
    v_indirect = [app_indirect.ref for app_indirect in v_app_indirect]
    interface_get(v_indirect)
end

The temporary array for the application vector indirection seems sub-optimal.

Now that iteration and generators are so fast, it would be great if I could instead do something like:

interface_get(iter::IT) where IT = _interface_get(eltype(IT), iter)

function _interface_get(::Type{T}, iter) where T <: InterfaceIndirect
    #lots of prework
    prework=rand(100_000_000)
    return [ prework[indirect.index] for indirect in iter ]
end

function application_get(v_app_indirect::Vector{ApplicationIndirect})
    #No more temporary array
    interface_get(app_indirect.ref for app_indirect in v_app_indirect)
end

But unfortunately Generators do not report a useful eltype (eltype of a Generator is just Any), so that implementation cannot be used.
I recently asked about this in the thread “Should Generators finally be given eltype”.

So I ask: what is the implementation pattern that gets around this? Or is there an alternative way to achieve this?

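
To make the problem concrete, here is a small check of what Julia reports for a generator's element type. The variable names are illustrative only; `eltype` and `Base.IteratorEltype` are the Base functions.

```julia
# A vector knows its element type; a generator does not.
xs  = [1, 2, 3]
gen = (x^2 for x in xs)

xs_eltype  = eltype(xs)                # Int
gen_eltype = eltype(gen)               # falls back to Any
gen_trait  = Base.IteratorEltype(gen)  # Base.EltypeUnknown()

println("eltype(xs)          = ", xs_eltype)
println("eltype(gen)         = ", gen_eltype)
println("IteratorEltype(gen) = ", gen_trait)
```

Because `eltype(IT)` is `Any` for a generator, a call such as `_interface_get(Any, iter)` matches no method with `T <: InterfaceIndirect` and throws a `MethodError`.
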
#2

Good question! First, I noticed that the prework calculation and the indexing are duplicated in the scalar and vector methods in your example. If this logic is indeed repeated in your actual code, a good first step would be to remove the duplication by creating the following methods:

function prework()
    # prework goes here
    rand(100_000)
end

function lookup(pre, idx)
    # get logic goes here (assuming it's more than just an index lookup)
    pre[idx]
end

Now on to the dispatch question. I would recommend creating generic map versions of the scalar and vector getters that accept a function that generates the lookup index (or whatever is needed to access the prework):

function map_get(idx_getter::Function, e)
    pre = prework()
    lookup(pre, idx_getter(e))
end

function map_get(idx_getter::Function, v::AbstractVector)
    pre = prework()
    [lookup(pre, idx_getter(e)) for e in v]
end

With this created, it’s simply a matter of implementing an index getter and forwarding to map_get in order to create the get functionality for each type:

interface_get(indirect) = map_get(e -> e.index, indirect)

application_get(app_indirect) = map_get(e -> e.ref.index, app_indirect)

Note in particular that you no longer need to create separate methods for scalars and vectors; that is handled by map_get. And if you add other containers, you’ll only need to add a single map version for that container.

(Note: I don’t know what your actual code looks like, but if the prework is not always the same, perhaps you don’t want to generate it in the map functions, but in the callers, and pass it to map_get. Or pass it as a function, like idx_getter.)
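
The note above can be sketched roughly as follows: a minimal variant, assuming the prework differs per caller, in which the function that produces the prework is passed in alongside idx_getter. The names `map_get_with`, `make_pre`, and `small_prework` are hypothetical and not part of the code above. The collection argument is left untyped here so that any iterable, including a generator, is accepted.

```julia
struct InterfaceIndirect
    index::Int64
end

struct ApplicationIndirect
    ref::InterfaceIndirect
end

# Hypothetical stand-in for an expensive, caller-specific prework step.
small_prework() = collect(0.5:0.5:5.0)   # 10 values: 0.5, 1.0, ..., 5.0

lookup(pre, idx) = pre[idx]

# Assumed variant of map_get: `make_pre` is called exactly once per call,
# and `itr` may be any iterable (vector, generator, ...).
function map_get_with(make_pre, idx_getter, itr)
    pre = make_pre()
    return [lookup(pre, idx_getter(e)) for e in itr]
end

apps = [ApplicationIndirect(InterfaceIndirect(i)) for i in 1:3]

from_vector    = map_get_with(small_prework, e -> e.ref.index, apps)
from_generator = map_get_with(small_prework, e -> e.index, (a.ref for a in apps))
println(from_vector)
println(from_generator)
```

Passing the prework as a function keeps the "compute once, look up many times" structure while letting each caller decide what the prework is.
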

All of this should be type stable and fairly performant:

julia> code_warntype(interface_get, (InterfaceIndirect,))
Body::Float64

julia> code_warntype(interface_get, (Vector{InterfaceIndirect},))
Body::Array{Float64,1}

julia> code_warntype(application_get, (ApplicationIndirect,))
Body::Float64

julia> code_warntype(application_get, (Vector{ApplicationIndirect},))
Body::Array{Float64,1}
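
The same property can also be checked in a test. This is a minimal sketch using `Test.@inferred`, which throws if the inferred return type differs from the actual one; the definitions are repeated so the snippet runs on its own, and the smaller prework size is only an assumption to keep it fast.

```julia
using Test

struct InterfaceIndirect
    index::Int64
end

prework() = rand(1_000)
lookup(pre, idx) = pre[idx]

map_get(idx_getter::Function, e) = lookup(prework(), idx_getter(e))

function map_get(idx_getter::Function, v::AbstractVector)
    pre = prework()
    return [lookup(pre, idx_getter(e)) for e in v]
end

interface_get(indirect) = map_get(e -> e.index, indirect)

# @inferred returns the call's value, or throws if inference was imprecise.
scalar_result = @inferred interface_get(InterfaceIndirect(1))
vector_result = @inferred interface_get([InterfaceIndirect(i) for i in 1:10])
```
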
Click here for full code, plus a test.
using BenchmarkTools

struct InterfaceIndirect index::Int64 end
struct ApplicationIndirect ref::InterfaceIndirect end

function prework()
    # prework goes here
    rand(100_000)
end

function lookup(pre, idx)
    # get logic goes here (assuming it's more than just an index lookup)
    pre[idx]
end

function map_get(idx_getter::Function, e)
    pre = prework()
    lookup(pre, idx_getter(e))
end

function map_get(idx_getter::Function, v::AbstractVector)
    pre = prework()
    [lookup(pre, idx_getter(e)) for e in v]
end

interface_get(indirect) = map_get(e -> e.index, indirect)

application_get(app_indirect) = map_get(e -> e.ref.index, app_indirect)

function timeit()
    v = 1:10000 .|> n -> InterfaceIndirect(n)
    @btime interface_get($v[1])
    @btime interface_get($v)
    v = 1:10000 .|> n -> ApplicationIndirect(InterfaceIndirect(n))
    @btime application_get($v[1])
    @btime application_get($v)
    nothing
end

timeit()

#3

(How does one get the “unfold for full code” to work?)


#4

Just click on that line! Edited to clarify.


#5

I mean what did you do so that happens to appear in discourse and displays as it does?


#6

When you compose a post, there’s a little cogwheel in the menu. Click on that, then click “Hide Details”.


#7

If you need to precompute something once, but hide that from the interface, you may find

useful.


#8

One aspect missing from that approach, due to a requirement I may not have articulated, is that the interface_get(...) method needs to be able to dispatch on multiple types in its own module, and the prework that is done depends on the type.

So in the example that won’t work because of the lack of eltype on Generators, you would see:

interface_get(iter::IT) where IT = _interface_get(eltype(IT), iter)

function _interface_get(::Type{InterfaceIndirect}, iter)
    # pass the type itself; `::Type{...}` is only valid in a method signature
    prework = preworker(InterfaceIndirect)
    return [ prework[indirect.index] for indirect in iter ]
end

function _interface_get(::Type{AnotherInterfaceIndirect}, iter)
    prework = preworker(AnotherInterfaceIndirect)
    return [ prework[indirect.index] for indirect in iter ]
end

#9

Why can’t you just take the first element (or its type) and dispatch on that?

function interface_get(iter)
    prework = preworker(first(iter))
    return [prework[indirect.index] for indirect in iter]
end

If this is not sufficient, perhaps you could post a more complete MWE that compiles and illustrates your problem?
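
One way to package this suggestion, sketched under assumptions: peek at the first element with `iterate`, then dispatch an inner method on its type. The `preworker` methods and the struct layouts below are hypothetical stand-ins for the real prework, and the empty-input result (`Float64[]`) is an assumption about the desired return type. This relies on the iterator being restartable from the beginning, which holds for a generator over an array, and it assumes all elements share the type of the first one.

```julia
struct InterfaceIndirect
    index::Int64
end

struct AnotherInterfaceIndirect
    index::Int64
end

struct ApplicationIndirect
    ref::InterfaceIndirect
end

# Hypothetical type-dependent prework; real code would do the expensive part here.
preworker(::Type{InterfaceIndirect}) = [10.0, 20.0, 30.0]
preworker(::Type{AnotherInterfaceIndirect}) = [-1.0, -2.0, -3.0]

function interface_get(iter)
    next = iterate(iter)
    next === nothing && return Float64[]   # assumed result for an empty input
    first_elem, _ = next
    # Function barrier: the inner call is specialized on the element's type.
    return _interface_get(typeof(first_elem), iter)
end

function _interface_get(::Type{T}, iter) where {T}
    pre = preworker(T)   # computed once per call, selected by dispatch on T
    return [pre[indirect.index] for indirect in iter]
end

apps = [ApplicationIndirect(InterfaceIndirect(i)) for i in (3, 1)]
vals = interface_get(app.ref for app in apps)   # no temporary array of refs
println(vals)
```

The outer function only inspects one element; the type-specific work happens behind the barrier, so adding a new indirect type means adding one `preworker` method.
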


#10

Yes, that would work; it just feels unsatisfying.

Is the problem that I am trying to do this in an un-Julian way? Would broadcasting be a better approach?
Let’s say I have:

index(obj::ApplicationIndirect) = obj.ref

How could I get this to work:

function application_get(v_app_indirect::Vector{ApplicationIndirect})
    interface_get.(index.(v_app_indirect))
end

#11

I don’t see how broadcasting would be better. Your example will end up calling interface_get with single elements, so you will be calculating prework over and over again. One way to solve that is using caching / lazy initialization like in Tamas’ link above (see the Lazy struct I proposed). Then you could push the calculation of prework all the way to the lookup method (where the prework is accessed). Whether that’s better or not is hard to tell – in my experience lazy variables can make the code slightly more complex and increase the risk of errors, and the lazy indirection will also add a small overhead to the lookup (few nanoseconds if cached; probably acceptable unless the lookup is trivial like in the example above).
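
To illustrate the caching idea, here is a minimal sketch, assuming the prework depends only on the element type. It is not the Lazy struct from the linked thread: `PREWORK_CACHE`, `PREWORK_CALLS`, `compute_prework`, and `cached_prework` are hypothetical names, and a plain `Dict` keyed by type stands in for lazy initialization. It is not thread-safe as written.

```julia
struct InterfaceIndirect
    index::Int64
end

struct ApplicationIndirect
    ref::InterfaceIndirect
end

# Counts how often the expensive step actually runs (for demonstration only).
const PREWORK_CALLS = Ref(0)

# Hypothetical expensive prework.
function compute_prework(::Type{InterfaceIndirect})
    PREWORK_CALLS[] += 1
    return [10.0, 20.0, 30.0]
end

# Cache keyed by element type; filled on first use.
const PREWORK_CACHE = Dict{DataType,Vector{Float64}}()

cached_prework(::Type{T}) where {T} =
    get!(() -> compute_prework(T), PREWORK_CACHE, T)

# Scalar getter: cheap after the first call, so broadcasting becomes viable.
interface_get(indirect::InterfaceIndirect) =
    cached_prework(InterfaceIndirect)[indirect.index]

index(obj::ApplicationIndirect) = obj.ref

application_get(v::Vector{ApplicationIndirect}) = interface_get.(index.(v))

apps = [ApplicationIndirect(InterfaceIndirect(i)) for i in 1:3]
result = application_get(apps)
println(result)
println("prework computed ", PREWORK_CALLS[], " time(s)")
```

The two dotted calls fuse into a single broadcast, so `index.(v)` does not allocate an intermediate array here. The cost is a dictionary lookup on every scalar get, which matches the small per-lookup overhead mentioned above.
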