Summing matrices of different dimensions in a type stable way?


#1

I want to have a type (or maybe another container) that contains matrices of different dimensions and be able to do operations (like summing them), but I’m having trouble doing this in a type stable way. Something like:

    type Container
        mat#::?
    end

    function mysum(c::Container)
        s = 0.0
        for m in c.mat
            s += sum(m)
        end
        s
    end

    c = Container([rand(10,10),rand(10)])
    @code_warntype mysum(c)

Doesn’t work because [rand(10,10),rand(10)] is not homogeneous (can’t declare it as Array{Array{Float64,N},1} for a fixed N).

Using a recursive definition of mysum improves things a bit but the top function remains unstable:

    mysum2(c::Container) = mysum2((c.mat...))
    mysum2(args) = mysum2(0.0,args...)
    mysum2(out::Float64,mat::Array{Float64},rest...) = out + mysum2(sum(mat),rest...)
    mysum2(out::Float64,mat::Array{Float64}) = out + sum(mat)

    @code_warntype mysum2( (c.mat...) ) #type stable, but
    @code_warntype mysum2( c ) #type unstable

I could generate a collection of types of the sort:

    type Container2{N1,N2}
        m1::Array{Float64,N1} 
        m2::Array{Float64,N2}
    end

But that’s not very elegant.

Am I missing something obvious here ? Is there a pattern to solve this kind of issues ?


#2

Have you tried using a tuple, possibly combined with the recursion?


#3

A tuple works fine; no need for recursion:

julia> m = (rand(3), rand(3,3), rand(3,3,3))
([0.175138,0.460267,0.13993],
[0.751967 0.93118 0.102488; 0.967268 0.6361 0.0838402; 0.674022 0.343599 0.899127],

[0.478299 0.752114 0.657615; 0.172727 0.6258 0.781415; 0.202425 0.200734 0.69294]

[0.39174 0.776263 0.504116; 0.309226 0.536818 0.383052; 0.0128251 0.867464 0.521238]

[0.755321 0.856696 0.95617; 0.762188 0.849155 0.57642; 0.611566 0.281467 0.462158])

julia> f(m) = sum(sum, m)
f (generic function with 1 method)

julia> @code_warntype f(m)
Variables:
  #self#::#f
  m::Tuple{Array{Float64,1},Array{Float64,2},Array{Float64,3}}

Body:
  begin 
      return $(Expr(:invoke, LambdaInfo for mapfoldl(::Base.#sum, ::Function, ::Tuple{Array{Float64,1},Array{Float64,2},Array{Float64,3}}), :(Base.mapfoldl), :(Main.sum), :(Base.+), :(m)))
  end::Float64

#4

Ok, so the trick is keep the tuple all the way up. I was still trying to put it in a container.

I have to say having to work with big tuples instead of types that show nicely is not very elegant either. Is there a better solution on the roadmap ?


#5

You can just define your own show method for your type to display however you would like.


#6

Yes but we are speaking about using tuples instead of custom types here. I mean you can overwrite show for some tuples of arrays, but that’s not exactly ideal.

When you try to write something a bit more involved than sum it also quickly gets complicated (admittedly I’m not used to work with tuples, maybe it gets easier with practice).


#7

You can wrap the tuple in a type like so:

immutable MyType{T<:Tuple}
    tup::T
end

(note that the type parameter is needed.) You can then define a custom show method on the MyType (and for instance also implement the array interface)


#8

Yeah, the T<:Tuple was what I was missing, I think that will solve most of my issues. Thanks.


#9

If your real-world data is going to have dimensions larger than 10 by 10, I would ignore type stability for summing over the container and just use a Vector{Array{Float64}}, then assert the result type if I know it to prevent type instability from propagating. For large data, the cost of dynamic dispatch to vectorized container operations is probably negligible.