Array manipulation

OK… this is probably a silly question… I have a vector of vectors, say:

> a = [[1,2], [2,3], [3,4]]

and I want to “reshape” it into a matrix of dimension either (in this case) 2×3 or 3×2. One way to do this (for the 2×3 case) is the following:

> m = zeros(2,0)
> for e in a
       global m = hcat(m,e)
   end

But there must be a simpler way to do this??

reduce(hcat, a) should work, I think.

Or

hcat(a...)

Thanks to Kristoffer and Adrien! Both methods work!
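
For reference, here is a minimal sketch covering both orientations asked about in the original question. It uses only Base functions. The `stack` calls assume Julia 1.9 or later, so drop those two lines on older versions.

```julia
a = [[1, 2], [2, 3], [3, 4]]

# Each inner vector becomes a column: 2×3
m23 = reduce(hcat, a)

# Each inner vector becomes a row: 3×2
m32 = permutedims(m23)

# Assumption: Julia 1.9+, where Base provides `stack`.
s23 = stack(a)           # 2×3, same as m23
s32 = stack(a; dims=1)   # 3×2, same as m32
```

Here `permutedims` is used rather than `'` or `transpose` so that the result is a plain `Matrix` instead of a lazy wrapper.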

My problem was unrelated to Flux, but I also found that Flux.batch provides the same functionality… almost. In fact, the above suggestions seem to be more general than Flux.batch:

julia> a = [rand([0,1,2,3],3) for i in 1:4]
4-element Array{Array{Int64,1},1}:
 [2, 0, 2]
 [2, 0, 3]
 [1, 0, 0]
 [1, 1, 0]

julia> hcat(a...)
3×4 Array{Int64,2}:
 2  2  1  1
 0  0  0  1
 2  3  0  0

julia> using Flux

julia> Flux.batch(a)
3×4 Array{Int64,2}:
 2  2  1  1
 0  0  0  1
 2  3  0  0

julia> b = [rand([0,1,2,3],1) for i in 1:4]
4-element Array{Array{Int64,1},1}:
 [3]
 [0]
 [3]
 [3]

julia> hcat(b...)
1×4 Array{Int64,2}:
 3  0  3  3

julia> Flux.batch(b)
1×4 Array{Int64,2}:
 3  0  3  3

julia> c = rand([0,1,2,3],4)
4-element Array{Int64,1}:
 3
 2
 2
 1

julia> hcat(c...)
1×4 Array{Int64,2}:
 3  2  2  1

julia> Flux.batch(c)
4-element Array{Int64,1}:
 3
 2
 2
 1

My “problem” arose from calling a function, say f(x,y), which returns [v,w], where v and w are scalars. With X a vector, f.(X,y) then returns a vector of two-element vectors of the form [[v1,w1], [v2,w2], ...], whereas I wanted to plot V = [v1,v2,...] as a function of X.
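
A minimal sketch of that situation, with a made-up `f` standing in for the real function (the body of `f`, and the values of `X` and `y`, are assumptions for illustration only):

```julia
# Hypothetical stand-in for the real f: returns two scalars as a vector.
f(x, y) = [x + y, x * y]

X = [1.0, 2.0, 3.0]
y = 10.0

R = f.(X, y)            # vector of two-element vectors [[v1,w1], [v2,w2], ...]

# Pull out the components directly, without building a matrix:
V = first.(R)           # [v1, v2, ...]
W = last.(R)            # [w1, w2, ...]

# Equivalent route via the matrix form discussed above:
M = reduce(hcat, R)     # 2×3, one column per element of X
V2 = M[1, :]            # first row holds the v values
```

For plotting only V against X, `first.(R)` is probably enough. The matrix form is more convenient when both rows are needed.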

Anyway, thanks!

Minor follow-up: I benchmarked the three mentioned methods:

julia> using BenchmarkTools, Flux
julia> a = [rand(4) for i in 1:20];
julia> @benchmark Flux.batch(a)
BenchmarkTools.Trial:
  memory estimate:  736 bytes
  allocs estimate:  1
  --------------
  minimum time:     343.684 ns (0.00% GC)
  median time:      490.051 ns (0.00% GC)
  mean time:        594.442 ns (3.81% GC)
  maximum time:     6.372 μs (78.94% GC)
  --------------
  samples:          10000
  evals/sample:     206

julia> @benchmark hcat(a...)
BenchmarkTools.Trial:
  memory estimate:  1.00 KiB
  allocs estimate:  5
  --------------
  minimum time:     1.140 μs (0.00% GC)
  median time:      1.220 μs (0.00% GC)
  mean time:        1.460 μs (4.27% GC)
  maximum time:     165.370 μs (97.58% GC)
  --------------
  samples:          10000
  evals/sample:     10

julia> @benchmark reduce(hcat,a)
BenchmarkTools.Trial:
  memory estimate:  736 bytes
  allocs estimate:  1
  --------------
  minimum time:     376.617 ns (0.00% GC)
  median time:      400.498 ns (0.00% GC)
  mean time:        474.698 ns (4.64% GC)
  maximum time:     6.445 μs (86.40% GC)
  --------------
  samples:          10000
  evals/sample:     201

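So… for casual use, hcat(a...) seems simplest, but reduce(hcat,a) is considerably more efficient: in this benchmark its median time is roughly a third of that of hcat(a...) (about 0.40 μs vs 1.22 μs), with one allocation instead of five. And it seems like Flux.batch() uses reduce(hcat,a) or a similarly efficient approach.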

I think Flux.batch is explicitly copying into a new array, here’s the source. But indeed reduce(hcat,a) got nicely optimised at some point for this purpose. Splatting a vector rather than a tuple, as in hcat(a...), will tend to be slow, since the number of arguments is not known to the compiler.
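
To illustrate what “explicitly copying into a new array” could look like, here is a rough sketch. The helper name `columns_to_matrix` is made up for this example, and this is not Flux's actual implementation, just the general preallocate-and-copy idea.

```julia
# Hypothetical helper, NOT Flux's source: allocate the result once,
# then copy each inner vector into its own column.
function columns_to_matrix(vs::AbstractVector{<:AbstractVector{T}}) where {T}
    isempty(vs) && return Matrix{T}(undef, 0, 0)
    n = length(first(vs))
    out = Matrix{T}(undef, n, length(vs))
    for (j, v) in enumerate(vs)
        length(v) == n || throw(DimensionMismatch("all vectors must have the same length"))
        out[:, j] .= v   # copy v into column j
    end
    return out
end

a = [[1, 2], [2, 3], [3, 4]]
columns_to_matrix(a) == reduce(hcat, a)   # true
```

The single up-front allocation is consistent with the “allocs estimate: 1” seen in the benchmarks above for both Flux.batch and reduce(hcat, a).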
