Concatenate and then unconcatenate

question

#1

Is there a nice way to vertically concatenate arrays into one matrix, and then “unconcatenate” a similar matrix into arrays of the same sizes as the ones we started with?

Something like this:

small_matrices = [rand(1,2), rand(3,2), rand(5, 2)]
large_matrix = vcat(small_matrices...)
similar_matrix = sin.(large_matrix)
small_matrices2 = magic(similar_matrix)
@assert small_matrices2 == map.(sin, small_matrices)

Background: I have a Dict{String, Matrix{Float64}} (all the arrays are n × 2) of pixel coordinates of a tracked animal. I need to calibrate these coordinates to real-world values. The camera that recorded the animals’ trajectories was calibrated à la the calibration toolbox in MATLAB (waving a checkerboard in front of the camera).

Currently there is no equivalent auto-detection of the checkerboards in Julia, so I’m forced to use MATLAB. And because it’s MATLAB, I’d rather work with a single large matrix (not multiple small matrices) for speed. Once MATLAB is done converting the coordinates from pixels to real-world values, I want to redistribute the calibrated coordinates back to the Dict. So I need to split the large matrix I get back from MATLAB into small matrices of the same shapes I had before. The way I’m doing this now is by saving a vector of tuples with three entries: the key to the dictionary, the row index in the small matrix, and the row index in the large matrix.


#2

Are small_matrices of the same size?

I have been experimenting with NestedViews, e.g.

using NestedViews # add from repo, not registered
combine(small_matrices)

Also, there is


#3

No. They can have any number of rows (but always 2 columns).

Looks promising, I’ll try that out!

Thanks @Tamas_Papp!


#4

Hmmm, in a way, I think I’m looking for something similar to what MATLAB’s mat2cell does…
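
For reference, a mat2cell-like split along the first dimension takes only a few lines of Base Julia. This is a minimal sketch, not a function from any package: the name `unconcatenate` and its `rowcounts` argument are made up for illustration, and it assumes the row counts sum to the number of rows of the large matrix.

```julia
# Hypothetical helper (not from any package): split `large` row-wise into
# blocks with the given numbers of rows, similar to mat2cell along dim 1.
function unconcatenate(large::AbstractMatrix, rowcounts)
    sum(rowcounts) == size(large, 1) ||
        throw(DimensionMismatch("row counts must sum to size(large, 1)"))
    stops = cumsum(rowcounts)          # last row of each block
    starts = stops .- rowcounts .+ 1   # first row of each block
    # Indexing with a range copies; view(large, a:b, :) would avoid the copy.
    return [large[a:b, :] for (a, b) in zip(starts, stops)]
end

small_matrices = [rand(1, 2), rand(3, 2), rand(5, 2)]
large_matrix = vcat(small_matrices...)
similar_matrix = sin.(large_matrix)

# The only bookkeeping needed is the number of rows of each small matrix.
small_matrices2 = unconcatenate(similar_matrix, size.(small_matrices, 1))
@assert small_matrices2 == map.(sin, small_matrices)
```

So the only thing to save is the row count of each small matrix, in the order they were concatenated.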


#5

How about the below… It’s perhaps similar to the approach you’re currently using. I wouldn’t expect there to be a much easier solution, since the presence of a dictionary makes your problem quite specific.

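function package(dict)
    rows = 1
    # For each key, record (key, number of rows, first row in the large matrix)
    order = [(s = size(v,1); (k, s, (rows+=s)-s)) for (k,v) = dict]
    # Allocate the large matrix with the same element type as the small ones
    data = similar(first(values(dict)), rows-1, 2)
    for (k,s,r) = order
        data[r:r+s-1,:] .= dict[k]
    end
    order,data
end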

function update!(dict, order, data)
    for (k,s,r) = order
        dict[k] .= data[r:r+s-1,:]
    end
end

function convert!(data)
    data .= sin.(data)
end

dict = Dict("a" => rand(3,2), "b" => rand(1,2), "c" => rand(5, 2))
display(dict)

order,data = package(dict)

convert!(data)

update!(dict, order, data)
display(dict)

Output:

Dict{String,Array{Float64,2}} with 3 entries:
  "c" => [0.824289 0.957146; 0.732318 0.649147; … ; 0.974205 0.127819; 0.930694 0.259031]
  "b" => [0.571219 0.913474]
  "a" => [0.213684 0.511909; 0.295163 0.996942; 0.574854 0.688305]
Dict{String,Array{Float64,2}} with 3 entries:
  "c" => [0.734065 0.817551; 0.668595 0.604507; … ; 0.827255 0.127471; 0.802035 0.256144]
  "b" => [0.540658 0.791631]
  "a" => [0.212061 0.489842; 0.290896 0.839815; 0.543713 0.635229]

#6

Yep, that’s more or less what I’m doing now. OK, I kind of hoped there would be a super-duper awesome way to do this. No worries, I got it working, but I can’t do it without saving a collection of indices that help convert the large matrix back to the small ones.


#7

Instead of splatting, you can use a more optimized reduce(vcat, small_matrices).

julia> small_matrices = [rand(1,2), rand(3,2), rand(5, 2)];

julia> @btime vcat($small_matrices...)
  115.738 ns (2 allocations: 256 bytes)
9×2 Array{Float64,2}:
 0.297374  0.448472
 0.746159  0.662536
 0.402667  0.0379419
 0.556125  0.0179574
 0.978084  0.0109701
 0.846586  0.347173
 0.336596  0.601298
 0.685396  0.159486
 0.409517  0.553826

julia> @btime reduce(vcat, $small_matrices)
  78.344 ns (1 allocation: 224 bytes)
9×2 Array{Float64,2}:
 0.297374  0.448472
 0.746159  0.662536
 0.402667  0.0379419
 0.556125  0.0179574
 0.978084  0.0109701
 0.846586  0.347173
 0.336596  0.601298
 0.685396  0.159486
 0.409517  0.553826

#8

Or a regular for loop with a pre-allocated array for much better performance (at the cost of more code):

function fast_vcat(M)
    n = mapreduce(x -> size(x,1), +, M)
    V = similar(M[1], n, 2)
    r = 1
    for m = M
        s = size(m,1)
        @inbounds for j=1:2, i=1:s
            V[r+i-1,j] = m[i,j]
        end
        r += s
    end
    V
end

(This version is hard-coded to two columns, which gives a small performance boost.) Test:

julia> @btime vcat($M...);
  140.050 ns (2 allocations: 256 bytes)

julia> @btime reduce(vcat, $M);
  105.261 ns (1 allocation: 224 bytes)

julia> @btime fast_vcat($M);
  53.180 ns (1 allocation: 224 bytes)

julia> vcat(M...) == reduce(vcat, M) == fast_vcat(M)
true

However, I have a feeling that chasing nanoseconds is pointless in this case, since the data will be passed back and forth to MATLAB and processed.


#9

Very nice. Yeah, while speed is always welcome, I was hoping there would be some macro or package out there that does the unconcatenation. I can’t see how SplitApplyCombine.jl would do the trick, because there is no way for the combine part to know which indices the parts belong to (the same kind of information that dim1Dist,...,dimNDist convey in MATLAB’s mat2cell).


#10

I think CatViews.jl aims at solving this problem?
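
A related idea that needs only Base Julia goes in the opposite direction: keep the data in one large matrix and make each small matrix a `view` into it. To be clear, this is not the CatViews.jl API, just a sketch of the view approach. It assumes a non-empty Dict and that it is acceptable to hold `SubArray`s instead of plain matrices; `pack_as_views` is a made-up name.

```julia
# Hypothetical helper: concatenate the matrices in `dict` into one large
# matrix and return it together with a Dict of views into its row blocks.
function pack_as_views(dict::AbstractDict)
    ks = collect(keys(dict))                        # fix an iteration order
    large = reduce(vcat, [dict[k] for k in ks])
    stops = cumsum([size(dict[k], 1) for k in ks])  # last row of each block
    starts = [1; stops[1:end-1] .+ 1]               # first row of each block
    views = Dict(k => view(large, a:b, :) for (k, a, b) in zip(ks, starts, stops))
    return large, views
end

dict = Dict("a" => rand(3, 2), "b" => rand(1, 2), "c" => rand(5, 2))
expected = Dict(k => sin.(v) for (k, v) in dict)

large, views = pack_as_views(dict)
large .= sin.(large)   # in-place update; stands in for the external conversion
@assert all(views[k] == expected[k] for k in keys(dict))
```

The views only see the new values if the large matrix is updated in place. If the conversion returns a new matrix, copying it back with `large .= result` keeps the views valid.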


#11

Works! Thanks!