Silly question. How to add a row to array?

BMval · January 20, 2020, 9:49pm

Very simple question. Please help me.

a = [2 3; 5 6]
push!(a, [1 2])    # doesn’t work
append!(a, [1 2])  # doesn’t work

thanks

Jakub_Wronowski · January 20, 2020, 9:51pm

vcat (hcat for columns)
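
For reference, a minimal sketch of what that looks like with the matrix from the question. The names `b` and `c` are only for illustration and are not from the thread:

```julia
a = [2 3; 5 6]

# vcat returns a new 3×2 matrix; `a` itself is left unchanged
b = vcat(a, [1 2])

# the literal syntax [a; [1 2]] is equivalent to the vcat call
same = (b == [a; [1 2]])

# hcat appends a column instead; the new column must have 2 rows
c = hcat(a, [7, 8])
```

Both calls allocate a fresh array, which is the cost discussed further down the thread.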

BMval · January 20, 2020, 9:53pm

thanks,
Why don’t push! and append! work?
Why are there so many different functions?

Jakub_Wronowski · January 20, 2020, 10:01pm

Not sure. I think it’s related to memory layout: you would need to allocate additional space. vcat copies all the data into a new array. Search the docs for “concatenation”.

BMval · January 20, 2020, 10:09pm

But I don’t want to create a new array; that would be very inefficient!

Unfortunately, there is no in-place vcat! function.

Does anybody know how to add a row in place, without creating a new array?

thanks

Jakub_Wronowski · January 20, 2020, 10:14pm

I really understand your pain:

github.com/JuliaLang/julia

Extrema, flattened iterator and too much memory alloation

opened 12:27PM - 15 Jan 20 UTC · closed 07:29AM - 05 Jul 20 UTC · jakubwro · fold

Hi,

The issue I am reporting is related to discourse conversation: https://di…scourse.julialang.org/t/concatenating-iterables-without-allocating-memory/33282

Also I commented some details to other issue: https://github.com/JuliaLang/julia/issues/31442#issuecomment-573939429

I encountered it using `extrema` function with flattened iterator, but here I am providing more distilled example. First of all I need 2 big arrays which we are going to concatenate with a lazy iterator.

```
const signal1 = rand(10000000)
const signal2 = rand(10000000)
const flat = Iterators.flatten((signal1, signal2))
```

Now I am going to define a functions that will iterate this flattened data. They are based on `extrema` implementation.

```julia
function works_ok(itr)
    y = iterate(itr)
    (v, s) = y
    y = iterate(itr, s)
    while y !== nothing
        (v, s) = y
        y = iterate(itr, s)
    end
    return v
end

function gives_strange_result(itr)
    y = iterate(itr)
    (v, s) = y
    while y !== nothing
        y = iterate(itr, s)
        y === nothing && break
        (v, s) = y
    end
    return v
end
```

The second one has strange timings and memory consumption.

```
julia> @time works_ok(flat)
  0.024651 seconds (7 allocations: 240 bytes)
0.9342115147070622

julia> @time gives_strange_result(flat)
  0.252765 seconds (20.00 M allocations: 610.352 MiB, 22.57% gc time)
0.9342115147070622
```

There is no memory leak, I tried to run in a loop 1000 times.

If you need to add a new row and still have an Array rather than a lazy iterator, the existing data has to be moved to make room for the new elements. Allocating a new array may be perfectly fine: just copy the data, and the GC will free the old one.

BLI · January 20, 2020, 10:28pm

I’m not an expert, but when I used MATLAB, dynamically adding rows was a bad idea. The recommended strategy in MATLAB is to know the size of the object in advance, and then just insert elements in the allocated space. In other words, in MATLAB:

x = []
for i = 1:10
   x = [x;i]
end

is slow, because you constantly need to shuffle around x in memory, and reallocate space.
The faster solution is (apologies if I have forgotten the correct MATLAB syntax):

x = zeros(10, 1);
for i = 1:10
   x(i) = i;
end

I would assume that similar ideas are valid in Julia. In other words, if you know the number of rows in your array, you allocate the memory at the outset, and fill in the values. Efficiency may also be gained by filling in column-wise, etc., I assume.
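
A rough Julia counterpart of the preallocation idea, as a sketch only. The sizes and the fill value `i * j` are made up for illustration:

```julia
nrows, ncols = 10, 2

# allocate the full matrix once; contents are uninitialized until filled
x = Matrix{Int}(undef, nrows, ncols)

# j is the outer loop and i the inner one, so each column is filled
# top to bottom, matching Julia's column-major storage
for j in 1:ncols, i in 1:nrows
    x[i, j] = i * j
end
```

No array is reallocated inside the loop, which is the point of preallocating.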

I’m sure some Julia experts can correct me, and point to the best ways to do this.

Jakub_Wronowski · January 20, 2020, 10:32pm

[1:10;] or [i for i in 1:10] will do the same, and is much shorter. @BMval, could you give more context about what you are trying to do? What is the reason you need concatenation?

BMval · January 20, 2020, 10:34pm

Dear @Jakub_Wronowski, I think @BLI is right: it’s better to allocate the memory at the beginning, since I know exactly how many rows I need.

Thanks @BLI, @Jakub_Wronowski, and others.

stevengj · January 20, 2020, 10:46pm

A 2d Array is inherently a bad data structure if you want to add rows dynamically. Julia arrays are stored column-major (contiguous columns), which means that even if you resized the memory to accommodate a new row, you would need to insert a new element into each column by moving every column in memory.
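
To make the column-major point concrete, here is a small sketch using the matrix from the question. `vec` shows the elements in storage order; the names `flat` and `flat_b` are illustrative:

```julia
a = [2 3; 5 6]

# storage order is column 1 followed by column 2
flat = vec(a)            # [2, 5, 3, 6]

# after adding the row [1 2] the layout must become [2, 5, 1, 3, 6, 2],
# so the elements of the second column have to shift to make room
b = vcat(a, [1 2])
flat_b = vec(b)          # [2, 5, 1, 3, 6, 2]
```

The new row's elements end up interleaved with the old data rather than appended at the end, which is why a plain push! cannot work on a Matrix.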

One possibility here is to use a vector of arrays, e.g.

julia> a = [[2,3], [5,6]]
2-element Array{Array{Int64,1},1}:
 [2, 3]
 [5, 6]

julia> push!(a, [1,2])
3-element Array{Array{Int64,1},1}:
 [2, 3]
 [5, 6]
 [1, 2]
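
If an ordinary Matrix is needed once all rows have been collected, one possible conversion (a sketch, not the only way) is to concatenate the inner vectors as columns and then transpose. The name `m` is illustrative:

```julia
a = [[2, 3], [5, 6]]
push!(a, [1, 2])

# hcat places each inner vector as a column (2×3);
# permutedims then flips the result to 3×2, one row per inner vector
m = permutedims(reduce(hcat, a))
```

This copies the data once at the end instead of once per added row.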

It would also be possible to define your own AbstractMatrix subtype that allows you to append rows efficiently to contiguous memory, by implementing a 2d “view” into an underlying 1d array interpreted in row-major order, e.g.

mutable struct MyMatrix{T} <: AbstractMatrix{T}
    m::Int
    n::Int
    data::Vector{T}
end
Base.size(a::MyMatrix) = (a.m, a.n)
Base.getindex(a::MyMatrix, i::Integer, j::Integer) = a.data[(Int(i)-1)*a.n + Int(j)] # row-major
Base.setindex!(a::MyMatrix, v, i::Integer, j::Integer) = setindex!(a.data, v, (Int(i)-1)*a.n + Int(j))
MyMatrix{T}(::UndefInitializer, m::Integer, n::Integer) where {T} = MyMatrix{T}(m, n, Array{T}(undef, m*n))
MyMatrix(a::AbstractMatrix{T}) where {T} = copyto!(MyMatrix{T}(undef, size(a)...), a)

function Base.push!(a::MyMatrix, row::AbstractVector)
    a.n == length(row) || throw(DimensionMismatch("row size must match matrix"))
    resize!(a.data, length(a) + a.n)
    a.data[length(a)+1:length(a.data)] = row
    a.m += 1
    return a
end

at which point you can (fairly efficiently) do:

julia> a = MyMatrix([2 3; 5 6])
2×2 MyMatrix{Int64}:
 2  3
 5  6

julia> push!(a, [1,2])
3×2 MyMatrix{Int64}:
 2  3
 5  6
 1  2

(You will probably need to define more array methods for MyMatrix depending on what you want to do with it, however. The good news is that, assuming you know what you are doing, you can implement your own array type like this and make it just as fast and flexible as the “built in” Matrix type, but specialized to allow appended rows.)

The basic question here is what you want to do with the resulting array. That will help determine what data structure you want to use, but you haven’t explained anything about your application yet.

Yes, if you know the size of your data in advance, it is nearly always better to preallocate.

Tamas_Papp · January 21, 2020, 7:34am

Never underestimate the power of a language with (near-)zero cost abstractions, parametric types, and multiple dispatch:

stevengj · January 21, 2020, 12:20pm

That’s why I said “a 2d Array” and not “a 2d AbstractArray”. An Array is a specific data structure. An AbstractArray is an interface, not a data structure — as I pointed out with the MyMatrix example, you can certainly provide efficient appendable rows within the array interface, but not with the 2d Array data structure because the latter is column-major.

The ElasticArrays package is a nice example, but they only allow the last dimension to grow or shrink in-place (you can append columns), because they are also built on top of column-major storage. If you want the first dimension to grow or shrink, you need to transpose to row-major storage as in my example.
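
The reason the last dimension is the cheap one can be sketched with Base alone (this is not the ElasticArrays API, just an illustration of the storage idea; `data` and `m` are made-up names). With column-major storage, a new column is simply more elements at the end of a flat buffer:

```julia
nrows = 2
data = Int[]                 # flat buffer in column-major order

# each new column is just nrows more elements appended at the end
for col in ([2, 5], [3, 6], [1, 2])
    append!(data, col)
end

# interpret the buffer as a matrix with nrows rows
m = reshape(data, nrows, :)
```

In this sketch all appends happen before the reshape, on the assumption that a vector whose memory is shared with a reshaped array should not be resized afterwards.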

(If you wanted to grow or shrink any dimension in-place, you would need to switch to a different data structure entirely, e.g. a vector of vectors or a column-major matrix with padding allocated in each column, but you could still use the AbstractArray interface.)

Tamas_Papp · January 21, 2020, 12:32pm

Good point — I missed the `s around the Array. Thanks for clarifying.
