# Concatenating in comprehension

Hello,

```julia
[rand(10,10) rand(10,10)
 rand(10,10) rand(10,10)]
```

makes a 20×20 `Matrix`. But

```julia
[rand(10,10) for i in 1:2, j in 1:2]
```

makes a `2×2 Array{Array{Float64,2},2}`.

I would like to know if there exists a comprehension syntax that would output the same as the first example. That way I can cat as many matrices as I want. Of course I use random matrices for the MWE, but using `rand(20,20)` is not what I’m looking for.

Many thanks.


You can write `hvcat(ntuple(_->2, 2), (rand(10,10) for i in 1:2, j in 1:2)...)` but that’s not so pretty. `reduce(hcat, [rand(10,10) for i in 1:2, j in 1:2])` makes a 10×40 Matrix. I’d like `reduce(cat, [rand(10,10) for i in 1:2, j in 1:2])` to make a 10×10×2×2 Array, but it doesn’t yet.
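Until then, one workaround (a sketch of mine, not anything in Base) is to `reshape` the `hcat`-reduced result, relying on Julia’s column-major layout:

```julia
# Sketch: recover a 10×10×2×2 array from a 2×2 grid of 10×10 blocks.
# reduce(hcat, ...) walks the grid in column-major order, so a plain
# reshape puts each block back as a slice A[:, :, i, j].
blocks = [rand(10, 10) for i in 1:2, j in 1:2]
A = reshape(reduce(hcat, blocks), 10, 10, 2, 2)
A[:, :, 2, 1] == blocks[2, 1]   # true: each slice is one block
```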


You can do

```julia
reduce(vcat, reduce(hcat, rand(10,10) for i=1:2) for j=1:2)
```

This allocates intermediate results, which you may be able to avoid with lazy `hcat` from `LazyArrays.jl` if it matters.


Since this kind of flexibility is why I am adopting Julia, and one is never sure how clear this is for people coming from other backgrounds (I have that experience in particular with students coming from Python), I would like to point out that you could also write your own function. For example, this would do the trick:

```julia
julia> function m(gen,n,m,N)
           M = Matrix{Float64}(undef,N*n,N*m)
           for i in 1:N
               kn = (i-1)*n+1
               for j in 1:N
                   km = (j-1)*m+1
                   M[kn:kn+n-1,km:km+m-1] .= gen(m,n)  # note the gen(m,n) argument order
               end
           end
           M
       end
m (generic function with 1 method)
```

where `gen` is the function that generates each sub-matrix (in your MWE, `rand`), `n` and `m` are the dimensions of that sub-matrix, and `N` is the number of repetitions along each dimension. For `N` somewhat larger than 2 (I did not benchmark every case), this function is faster than the compact alternatives:

```julia
julia> @btime m(rand,10,10,100);
  4.568 ms (10002 allocations: 16.17 MiB)

julia> @btime reduce(vcat, reduce(hcat, rand(10,10) for i=1:100) for j=1:100);
  108.475 ms (28098 allocations: 779.97 MiB)
```

and, in addition, it can be easily parallelized:

```julia
julia> function m(gen,n,m,N)
           M = Matrix{Float64}(undef,N*n,N*m)
           Threads.@threads for i in 1:N
               kn = (i-1)*n+1
               for j in 1:N
                   km = (j-1)*m+1
                   M[kn:kn+n-1,km:km+m-1] .= gen(m,n)
               end
           end
           M
       end

julia> @btime m(rand,10,10,100);
  1.500 ms (10023 allocations: 16.18 MiB)

julia> Threads.nthreads()
4
```
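A quick, self-contained usage sketch of the same block-filling idea (the name `fillblocks` is made up here to avoid clashing with `m` above; square blocks only, matching the `gen(m,n)` call):

```julia
# Hypothetical helper, same structure as m() above, restricted to
# square sub-matrices of size n×n in an N×N grid.
function fillblocks(gen, n, N)
    M = Matrix{Float64}(undef, N*n, N*n)
    for i in 1:N, j in 1:N
        kn = (i-1)*n + 1
        km = (j-1)*n + 1
        M[kn:kn+n-1, km:km+n-1] .= gen(n, n)
    end
    M
end

M = fillblocks(ones, 5, 3)   # 3×3 grid of 5×5 blocks
size(M)                      # (15, 15)
all(==(1.0), M)              # true
```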

Good point. You can also use `reshape` to avoid having to do index calculations:

```julia
julia> function m3(gen,n,m,N)
           M = Matrix{Float64}(undef,N*n,N*m)
           M4 = reshape(M, (n,N,m,N))
           for j in 1:N
               for i in 1:N
                   M4[:,i,:,j] .= gen(n,m)   # n,m reversed from m()
               end
           end
           M
       end
m3 (generic function with 1 method)

julia> m3(ones,2,3,4) == m(ones,2,3,4) # not a strong test!
true
```

Why do we need a comprehension if the `hvcat` syntax is already so compact?

```julia
M1 = rand(3,5); M2 = rand(3,5); M3 = rand(2,5); M4 = rand(2,5);
MM = hvcat((2), M1, M2, M3, M4)  # concatenate two in each block row
```

makes a `5×10 Array{Float64,2}`.


This version is also easier to understand for a novice, like a student coming from Matlab, Python, or C. That’s a big deal in the education business.

Using @rafael.guerra's suggestion, an answer to the original question can be

```julia
hvcat((2), [rand(5, 5) for _ in 1:4]...)
```

And it is as fast as the manual version and allocates the same amount:

```julia
julia> @btime hvcat((100), [rand(10, 10) for _ in 1:10000]...);
  4.290 ms (10020 allocations: 16.79 MiB)
```

I guess this is the best answer (is it possible to parallelize this?)

`hvcat((2), [rand(5, 5) for _ in 1:4]...)`

Could you parse it and explain each term, please?

Thank you all for your suggestions. I did not know `hvcat`, but it seems to be the cleanest syntax while being efficient. Not that it matters a lot, but it’s good to know how to do things efficiently.

@lmiq
I doubt that it can be parallelized, since it’s an internal `Base` function. On the other hand, I also doubt that it would be useful, since this function mostly does memory operations, and memory operations rarely profit from parallelization (memory is much slower than the CPU). As an example, compare these two implementations (they are not very efficient, but that is not important, since both versions use the same operations):

```julia
using BenchmarkTools

x = [rand(10000) for _ in 1:100]
y = Vector{Float64}(undef, 1000000)

function f1!(y, x)
    for i in axes(x, 1)
        for j in axes(x[i], 1)
            y[(i - 1)*length(x[i]) + j] = x[i][j]
        end
    end

    return y
end

function f2!(y, x)
    Threads.@threads for i in axes(x, 1)
        for j in axes(x[i], 1)
            y[(i - 1)*length(x[i]) + j] = x[i][j]
        end
    end

    return y
end
```

and the result is

```julia
julia> Threads.nthreads()
8

julia> @btime f1!($y, $x);
  1.305 ms (0 allocations: 0 bytes)

julia> @btime f2!($y, $x);
  1.201 ms (41 allocations: 5.89 KiB)
```

The speedup that you observe was mainly due to parallelizing the generation of the matrices, which uses `rand`, a rather costly operation.
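If the expensive part is the generation itself, one option (a sketch; `blockgrid` is a made-up name) is to generate the blocks in parallel and leave the memory-bound assembly to `hvcat`:

```julia
# Sketch: parallel generation of the blocks, serial hvcat assembly.
function blockgrid(gen, n, N)
    blocks = Vector{Matrix{Float64}}(undef, N * N)
    Threads.@threads for k in 1:N*N
        blocks[k] = gen(n, n)   # embarrassingly parallel
    end
    hvcat(N, blocks...)         # memory-bound, kept serial
end

size(blockgrid(rand, 10, 4))    # (40, 40)
```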

@Juan
From the `hvcat` documentation:

> Horizontal and vertical concatenation in one call. This function is called for block matrix syntax. The first argument specifies the number of arguments to concatenate in each block row.
>
> If the first argument is a single integer n, then all block rows are assumed to have n block columns.

So, the first argument to `hvcat` in our example is `(2)`, and it means that we want two blocks in each block row. Consider for example

```julia
julia> hvcat((2), 1, 2, 3, 4, 5, 6)
3×2 Array{Int64,2}:
 1  2
 3  4
 5  6

julia> hvcat((3), 1, 2, 3, 4, 5, 6)
2×3 Array{Int64,2}:
 1  2  3
 4  5  6
```

Now, `hvcat` does not accept a vector of matrices as input, so otherwise I would have had to write

```julia
hvcat(2, rand(5, 5), rand(5, 5), rand(5, 5), rand(5, 5))
```

which is of course not scalable and inconvenient if we have a large number of matrices. But `hvcat` is a so-called vararg function, and I used the splatting syntax, where you add an ellipsis after the last argument and Julia splits it into the proper number of positional arguments. For easier understanding, it can be written in two lines:

```julia
rand_matrices = [rand(5, 5) for _ in 1:4]
hvcat(2, rand_matrices...)  # equivalent to hvcat(2, rand_matrices[1], rand_matrices[2], rand_matrices[3], rand_matrices[4])
```

So, as a conclusion, I generated a vector of four 5×5 matrices and told `hvcat` to arrange them in block rows of two.
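For completeness (my example, not from the docs quoted above): the first argument can also be a tuple giving a different number of blocks in each block row:

```julia
# Tuple form: 2 blocks in the first block row, 1 block in the second
A = hvcat((2, 1), rand(2, 2), rand(2, 3), rand(3, 5))
size(A)   # (5, 5): top block row is 2×(2+3), bottom is 3×5
```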

@HenriDeh
All kudos should go to @rafael.guerra.

Actually in this case it is generating the sub-matrices of dimensions `(m,n)` multiple times, and that can be parallelized (this is what happens in the custom function I provided before, which parallelizes very well, because it is embarrassingly parallel).

In your example you moved the generation of the random matrices out of the benchmark, so that does not correspond exactly to the `hvcat` solution.

More variations on the same theme: is the following a proper way of collecting all the matrices `Mi` before `hvcat`?

```julia
M1 = rand(3,5); M2 = rand(3,5); M3 = rand(2,5); M4 = rand(2,5);
mlist = [M1, M2, M3, M4]
MM = hvcat((2), mlist...)  # concatenates two in each block row
```

It is, because `mlist` only stores references to the matrices, i.e.:

```julia
julia> x = rand(2) ; y = rand(2);

julia> m = [ x, y ]
2-element Array{Array{Float64,1},1}:
 [0.2383336264485605, 0.8321815241645099]
 [0.5923411990097951, 0.9487571368376246]

julia> m[1]
2-element Array{Float64,1}:
 0.2383336264485605
 0.8321815241645099

julia> m[1][1] = 0.
0.0

julia> x
2-element Array{Float64,1}:
 0.0
 0.8321815241645099
```
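If you want `mlist` to own independent data instead of aliasing the originals, `copy` each matrix when collecting it (a small sketch):

```julia
x = [1.0, 2.0]
mlist = [copy(x)]       # copy decouples the storage
x[1] = 0.0
(x[1], mlist[1][1])     # (0.0, 1.0): the copy is unaffected
```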