Silly question: how do I add a row to an array?

Very simple question. Please help me.

a = [2 3; 5 6]
push!(a, [1 2]) - doesn’t work
append!(a, [1 2]) - doesn’t work

thanks

vcat (hcat for columns)
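
For reference, a minimal sketch of what that looks like with the matrix from the question (the extra column values 7 and 8 are made up for illustration). Note that each call returns a new array and leaves `a` untouched:

```julia
a = [2 3; 5 6]          # 2×2 matrix from the question

b = vcat(a, [1 2])      # new 3×2 matrix with the row appended
c = [a; [1 2]]          # same thing, written with concatenation syntax
d = hcat(a, [7, 8])     # new 2×3 matrix with a column appended
```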

Thanks.
Why don’t push! and append! work?
Why are there so many different function names?

Not sure, but I think it’s related to memory layout: you would need to allocate additional space. vcat copies all the data into a new array. Search the docs for “concatenation”.

But I don’t want to create a new array; that would be very inefficient!

Unfortunately, there is no vcat!

Does anybody know how to add a row in place, rather than create a new array?

thanks

I really understand your pain:

If you need to add a new row and still have an Array, not a lazy iterator, the existing data has to be moved to make room for the new bytes. Allocating a new array may be perfectly fine: just copy the data, and the GC will free the old one.

I’m not an expert, but when I used MATLAB, dynamically adding rows was a bad idea. The recommended strategy in MATLAB is to know the size of the object in advance, and then just insert elements in the allocated space. In other words, in MATLAB:

x = []
for i = 1:10
   x = [x;i]
end

is slow, because you constantly need to shuffle around x in memory, and reallocate space.
The faster solution is (apologies if I have forgotten the correct MATLAB syntax):

x = zeros(10, 1);
for i = 1:10
   x(i) = i;
end

I would assume that similar ideas are valid in Julia. In other words, if you know the number of rows in your array, you allocate the memory at the outset, and fill in the values. Efficiency may also be gained by filling in column-wise, etc., I assume.
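
A rough Julia sketch of the same preallocation idea, assuming the number of rows and columns is known up front. The `rows` collection here is a hypothetical stand-in for wherever the data actually comes from:

```julia
nrows, ncols = 3, 2
a = Matrix{Int}(undef, nrows, ncols)   # allocated once; contents are uninitialized

rows = [[2, 3], [5, 6], [1, 2]]        # hypothetical source of row data
for (i, r) in enumerate(rows)
    a[i, :] .= r                       # write row i in place, no reallocation
end
```

After the loop, `a` holds the three rows and the matrix was never resized or copied.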

I’m sure some Julia experts can correct me, and point to the best ways to do this.

[1:10;] or [i for i in 1:10] will do the same, much shorter. @BMval, could you give more context on what you are trying to do? Why do you need concatenation?

Dear @Jakub_Wronowski, I think @BLI is right: it’s better to allocate the memory at the beginning, since I know exactly how many rows I need.

Thanks, @BLI, @Jakub_Wronowski, and others.

A 2d Array is inherently a bad data structure if you want to add rows dynamically. Julia arrays are stored column-major (contiguous columns), which means that even if you resized the memory to accommodate a new row, you would need to insert a new element into each column by moving every column in memory.
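
To make the layout concrete, here is a small sketch using the matrix from the question. `vec` exposes the elements in memory order, so comparing the flat order before and after adding a row shows that the new elements land in the middle of the buffer rather than at the end:

```julia
a = [2 3; 5 6]
flat_before = vec(a)      # [2, 5, 3, 6]: column 1, then column 2

b = vcat(a, [1 2])        # the matrix with the extra row (a new array)
flat_after = vec(b)       # [2, 5, 1, 3, 6, 2]: the 1 sits between the old columns
```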

One possibility here is to use a vector of arrays, e.g.

julia> a = [[2,3], [5,6]]
2-element Array{Array{Int64,1},1}:
 [2, 3]
 [5, 6]

julia> push!(a, [1,2])
3-element Array{Array{Int64,1},1}:
 [2, 3]
 [5, 6]
 [1, 2]
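
If an ordinary Matrix is needed once all rows have been collected, one way to build it (a sketch, assuming every inner vector has the same length) is to concatenate the vectors as columns and then swap the dimensions:

```julia
rows = [[2, 3], [5, 6]]
push!(rows, [1, 2])                    # cheap: only the outer vector grows

cols = reduce(hcat, rows)              # 2×3 matrix, each inner vector is a column
m = permutedims(cols)                  # 3×2 matrix, each inner vector is a row
```

This does one copy at the end instead of one copy per added row.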

It would also be possible to define your own AbstractMatrix subtype that allowed you to append rows efficiently to contiguous memory, by implementing a 2d “view” into an underlying 1d array interpreted in row-major order, e.g.

mutable struct MyMatrix{T} <: AbstractMatrix{T}
    m::Int
    n::Int
    data::Vector{T}
end
Base.size(a::MyMatrix) = (a.m, a.n)
Base.getindex(a::MyMatrix, i::Integer, j::Integer) = a.data[(Int(i)-1)*a.n + Int(j)] # row-major
Base.setindex!(a::MyMatrix, v, i::Integer, j::Integer) = setindex!(a.data, v, (Int(i)-1)*a.n + Int(j))
MyMatrix{T}(::UndefInitializer, m::Integer, n::Integer) where {T} = MyMatrix{T}(m, n, Array{T}(undef, m*n))
MyMatrix(a::AbstractMatrix{T}) where {T} = copyto!(MyMatrix{T}(undef, size(a)...), a)

function Base.push!(a::MyMatrix, row::AbstractVector)
    a.n == length(row) || throw(DimensionMismatch("row size must match matrix"))
    resize!(a.data, length(a) + a.n)
    a.data[length(a)+1:length(a.data)] = row
    a.m += 1
    return a
end

at which point you can (fairly efficiently) do:

julia> a = MyMatrix([2 3; 5 6])
2×2 MyMatrix{Int64}:
 2  3
 5  6

julia> push!(a, [1,2])
3×2 MyMatrix{Int64}:
 2  3
 5  6
 1  2

(You will probably need to define more array methods for MyMatrix depending on what you want to do with it, however. The good news is that, assuming you know what you are doing, you can implement your own array type like this and make it just as fast and flexible as the “built in” Matrix type, but specialized to allow appended rows.)

The basic question here is what you want to do with the resulting array. That will help determine what data structure you want to use, but you haven’t explained anything about your application yet.

Yes, if you know the size of your data in advance, it is nearly always better to preallocate.

Never underestimate the power of a language with (near-)zero cost abstractions, parametric types, and multiple dispatch :sunglasses: :

That’s why I said “a 2d Array” and not “a 2d AbstractArray”. An Array is a specific data structure. An AbstractArray is an interface, not a data structure — as I pointed out with the MyMatrix example, you can certainly provide efficient appendable rows within the array interface, but not with the 2d Array data structure because the latter is column-major.

The ElasticArrays package is a nice example, but they only allow the last dimension to grow or shrink in-place (you can append columns), because they are also built on top of column-major storage. If you want the first dimension to grow or shrink, you need to transpose to row-major storage as in my example.

(If you wanted to grow or shrink any dimension in-place, you would need to switch to a different data structure entirely, e.g. a vector of vectors or a column-major matrix with padding allocated in each column, but you could still use the AbstractArray interface.)
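
For a quick version of the row-major idea that uses only Base and no custom type, something like the following sketch may be enough. It assumes a fixed number of columns (`ncols` and the sample rows are illustrative) and that the 2d interpretation is created once, after the last row has been appended:

```julia
ncols = 2
data = Int[]                           # flat buffer; rows are stored one after another

for row in ([2, 3], [5, 6], [1, 2])
    length(row) == ncols || throw(DimensionMismatch("row length must equal ncols"))
    append!(data, row)                 # grows the 1d buffer at its end
end

# Interpret the buffer as an ncols × nrows column-major matrix,
# then wrap it in a lazy transpose to get the nrows × ncols view.
a = transpose(reshape(data, ncols, :))
```

Neither `reshape` nor `transpose` copies the data here, so the only allocation cost is the growth of `data` itself.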

Good point — I missed the `s around the Array. Thanks for clarifying.