Repeat and reshape vector into a matrix

Hi,

this is some Python code I'm translating to Julia:

import numpy as np

val = [1,2,3,4,5,6]
val_length = len(val)
rpt = 3

np.repeat(val, rpt).reshape((val_length, rpt))

It gives:

>>> val = [1,2,3,4,5,6]
>>> val_length = len(val)
>>> 
>>> rpt = 3
>>> np.repeat(val, rpt).reshape((val_length, rpt))
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4],
       [5, 5, 5],
       [6, 6, 6]])

I used Julia’s repeat for the first part, but the reshape part is driving me crazy. I tried many combinations of reshape, vcat, reduce(vcat, …), and so on, and I never got the result I wanted.
I’m a bit ashamed, and I can’t fall back on pushing into empty arrays as I usually do when I’m stuck, because I have a great need for performance as this block of code will run gazillions of times.

Any help is welcome.

You mean like this?

julia> [0+i for i ∈ 1:6, j ∈ 1:3]
6×3 Matrix{Int64}:
 1  1  1
 2  2  2
 3  3  3
 4  4  4
 5  5  5
 6  6  6

although when you say

I have a great need for performance as this block of code will run gazillions of times.

this sounds incompatible with materialising a matrix with lots of (potentially?) redundant entries.
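For instance (just a sketch, not part of the reply above), a generator expression yields the same repeated entries lazily, so nothing gets materialised unless a consumer explicitly asks for it:

val = [1, 2, 3, 4, 5, 6]

# Lazy 6×3 "matrix" of repeated entries: no allocation happens here.
lazy = (val[i] for i in eachindex(val), j in 1:3)

sum(lazy)        # consumers can iterate it directly
collect(lazy)    # only this call actually builds the 6×3 Matrix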

Since you’re not in Python anymore [sigh of relief], you can also just preallocate a matrix of the correct size and fill its entries in a for loop. This is usually more explicit and not necessarily slower than the vectorized approach.
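A minimal sketch of that preallocate-and-fill pattern (the name fill_repeated! is made up for this example, not from the thread):

function fill_repeated!(out::AbstractMatrix, val::AbstractVector)
    @assert size(out, 1) == length(val)
    for j in axes(out, 2), i in axes(out, 1)   # fill in column-major order
        out[i, j] = val[i]
    end
    return out
end

val = [1, 2, 3, 4, 5, 6]
out = Matrix{Int}(undef, length(val), 3)   # preallocate once, outside the hot loop
fill_repeated!(out, val)                   # then reuse `out` on every call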

The translation you were trying to do with standard functions would be:

julia> reshape(repeat(1:6, 3), 6, 3)
6×3 Matrix{Int64}:
 1  1  1
 2  2  2
 3  3  3
 4  4  4
 5  5  5
 6  6  6

(Note that np.repeat repeats each element in place while Julia’s repeat tiles the whole vector, but because reshape is column-major in Julia and row-major in NumPy, the two differences cancel out and both snippets produce the same matrix.) It’s not memory efficient, but neither is the NumPy version. If you really do need a Matrix, the most you can do to improve things is save an allocation or two.

If the matrix you want is:

6×3 Matrix{Int64}:
 1  1  1
 2  2  2
 3  3  3
 4  4  4
 5  5  5
 6  6  6

then here are three ways:

using BenchmarkTools  # provides @benchmark

val = [1:6;]
@benchmark repeat(val, 1, 3)
# 128.698 ns Memory estimate: 304 bytes, allocs estimate: 3.
@benchmark val .* ones(Int, 1, 3)
# 117.355 ns Memory estimate: 320 bytes, allocs estimate: 3.
buffer = Matrix{Int}(undef, 6, 3)
@benchmark buffer .= val 
# 53.311 ns Memory estimate: 16 bytes, allocs estimate: 1.

Use in-place broadcasting and the @inbounds macro, which is safe here because the output is pre-allocated to the correct size (length(val), rpt).

repeat_reshape!(output, val, rpt) = (@inbounds output .= reshape(val, :, 1); output)

val = [1:6;]
rpt = 3
output = Matrix{Int}(undef, length(val), rpt)  # pre-allocate output matrix
result = repeat_reshape!(output, val, rpt)

Thanks everyone.
Your answers made me realize that the Python code was too complex for this simple task, and because I was looking for a 1:1 translation of the algorithm, I got lost.

julia> val = [1,2,3,4,5,6]
6-element Vector{Int64}:
 1
 2
 3
 4
 5
 6

julia> rpt = 3
3

julia> repeat(val, 1, rpt)
6×3 Matrix{Int64}:
 1  1  1
 2  2  2
 3  3  3
 4  4  4
 5  5  5
 6  6  6

💯 If you care about the performance of this operation, you’re probably doing something wrong. How are you planning to use this matrix?

You’re totally right: this part of the Python code (not written by me) is quite ugly.
But for now I’m doing the 1:1 part of the translation because I want to have operational code running ASAP. I’ll optimize later.

If you’re optimizing later, why are you worried about the performance of this piece of the computation? You’re almost certainly going to eliminate it completely later.

(There’s a good chance that line-by-line translation of Pythonic code will get slower in Julia. Don’t be discouraged by this — idiomatic Julia code has lots of ways to speed things up that are unavailable in Python, because loops in Julia are fast. A line-by-line translation may be useful for correctness checking, but eventually the code is likely to look completely different since you don’t have to bend over backwards to “vectorize” everything.)
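As a sketch of what that might look like (the downstream computation isn’t shown in the thread, so this assumes, purely for illustration, that the repeated matrix would later be weighted column-wise and summed): a plain loop over val and the weights gives the same answer with no 6×3 temporary at all.

# Hypothetical downstream computation, assumed only for this example:
# sum(repeat(val, 1, length(w)) .* w')  -- weight each column, then total.
function weighted_total(val, w)
    acc = zero(eltype(val)) * zero(eltype(w))  # accumulator in the promoted type
    for v in val, wj in w                      # plain nested loop; fast in Julia
        acc += v * wj
    end
    return acc
end

val = [1, 2, 3, 4, 5, 6]
w = [0.2, 0.3, 0.5]
weighted_total(val, w) ≈ sum(repeat(val, 1, 3) .* w')   # true, with zero temporaries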