Is there a Julia equivalent of Matlab's Array(i) = Value for undefined indicies?

As the title suggest I was curious if Julia base or LinearAlgebra had some existing equivalent to Matlab’s implementation of assigning values to a non-exisiting index:

>> Array = [];
Array(3) = 3; Array(5) = 5; Array(3,6:7) = [6, 7];
Array
>> Array =
     0     0     3     0     5    0     0
     0     0     0     0     0    0     0
     0     0     0     0     0    6     7

As you can see, it fills or preallocates the previous values with zeros. I was hoping that there would be already some optimized and general implementation of this out there for Julia. I already tried making my own function (see below), but ran into problems with append! only seeming to accept vectors. I believe I still need a lot of checks in order to generalize it, since I would want it to be able to do something like “Array(3,6:7) = [6, 7]” in the example above.

function writeI!(Array,Index,Values)
    for i = 1:length(Index)
        j = Index[i]
        if j <= length(Array)
            Array[j] = Values[i]
        else
            append!(Array,zeros(j-1-length(Array)),Values[i])
        end
    end
end

Edit: I said base or LinearAlgebra, but if there are already other packages that do this I would like to know as well

I don’t think so. afaik the closest is

julia> a = zeros(Int, 3,6)
3×6 Matrix{Int64}:
 0  0  0  0  0  0
 0  0  0  0  0  0
 0  0  0  0  0  0

julia> a[3] = 3
3

julia> a
3×6 Matrix{Int64}:
 0  0  0  0  0  0
 0  0  0  0  0  0
 3  0  0  0  0  0

It would be better to preallocate the array before processing (I suppose that input arrays are vector):

function writeI(arr1::Vector,index::Vector,values::Vector)
    # define arr2
    max_idx = maximum(index)
    if max_idx > length(arr1)
        arr2 = similar(arr1, max_idx)
        for (idx, val) = zip(eachindex(arr1),arr1)
            arr2[idx] = val
        end
    else
        arr2 = copy(arr1)
    end
    # insert values
    for (idx, val) = zip(index, values)
        arr2[idx] = val
    end
    arr2
end

Actually, Matlab also recommends to preallocate array before processing, as you have seen warnings while writing the Matlab script. Computer need to reallocate memory if you change its size in the middle of execution. This is a burden to both Matlab and Julia. Julia does not support this feature bacause it is undesired habit.

Operations inducing the (possible) change of allocated size is not inplace operation. So your function won’t get benefits from fixing the variable name (or worse).

I also recommend to change your variable names. Array is predefined. Official style guide will be helpful.

4 Likes

Thank you both. It is a bit unfortunate that there is no existing function for this yet, since it makes it a bit more difficult to port some of my Matlab scripts to Julia, but I suppose I will preallocate the sizes for now until I have some more time to make this work with matricies as well.

Even in Matlab, growing arrays this way is inefficient. I would not use it for anything but a crude and quick hack… :slight_smile:

It is trivial to implement something like this (note: I didn’t optimize or test this code thoroughly, just a proof of concept), eg

mutable struct ExpandingArray{T,N} <: AbstractArray{T,N}
    contents::Array{T,N}
    filler::T
    function ExpandingArray{N}(filler::T) where {T,N}
        @assert N ≥ 1
        contents = Array{T}(undef, ntuple(_ -> 0, Val(N)))
        new{T,N}(contents, filler)
    end
end

Base.size(a::ExpandingArray) = size(a.contents)

Base.getindex(a::ExpandingArray, i...) = a.contents[i...]

function Base.setindex!(a::ExpandingArray{T,N}, x, i...) where {T,N}
    @assert length(i) == N
    D = size(a.contents)
    if any(i .> D)
        D′ = max.(D, i)
        c′ = fill(a.filler, D′)
        c′[axes(a.contents)...] .= a.contents
        a.contents = c′
    end
    a.contents[i...] = x
end

Eg

julia> e = ExpandingArray{2}(0.0)
0×0 ExpandingArray{Float64, 2}

julia> e[1,3] = 3.0
3.0

julia> e[4,2] = 5.0
5.0

julia> e
4×3 ExpandingArray{Float64, 2}:
 0.0  0.0  3.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  5.0  0.0

that can tide you over while you are porting your code, but as others have remarked, it is much more efficient to preallocate the correct size.

11 Likes

SparseArrays offer an alternative:

julia> using SparseArrays
julia> row_ix = [1, 1, 3, 3] # row indices of vals
julia> col_ix = [3, 5, 6, 7] # col indices of vals
julia> vals = [3, 5, 6, 7]
julia> sparse(row_ix, col_ix, vals)
3×7 SparseMatrixCSC{Int64, Int64} with 4 stored entries:
 ⋅  ⋅  3  ⋅  5  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  6  7
5 Likes

Is this actually default behaviour for arrays in Matlab? I.e. setindex! can never be out of bounds because it just changes the size of my array from under me? In addition to the performance pitfalls that seems like a massive potential correctness footgun?

6 Likes

Yes but: Making Matlab throw an error on index out of bounds - MATLAB Answers - MATLAB Central (mathworks.com)

Doesn’t resize after instantiation, however.

I think the only way the performance could be improved is if contents was a Vector or whatever that upcoming lower-level memory type, so that the ExpandingArray could be resized without allocating a whole new array for contents, though extending a vector still allocates sometimes. This is ElasticArray.jl’s approach. However, resizing a matrix to more rows isn’t a simple extension, best case scenario is starting from the last column and shifting columns one by one, which moves all existing elements like ExpandingArray does. ElasticArrays.jl doesn’t bother implementing that for higher dimensions, it only alters the last dimension. I remember 3D arrays in MATLAB being indexing-resizable, never tried for higher dimensions.

1 Like

I think that the Matlab strategy for expanding arrays when setting elements beyond the current bounds is similar to Julia’s push! method, at least for Vectors. So push! could be used in Julia when the index is just one after the current length, which is a common use case. The performance would be not so bad. Also an ExpandingVector type seems possible.

I think you misunderstand: push! just appends one element at the end.

Yes, I thought about this too, but wanted to do something quick & dirty. Also, ideally, when dimensions are large anyway, resizing would become increasingly rare as the algorithm progresses, with random access if the matrix is otherwise dense-ish.

Another option would be overallocating along each dimension, like push! does currently under the hood.

I fully agree that making this the default is not a good choice, but having it as an option is great, eg when the final size is not known or tricky to calculate. One thing I like about Julia is how easy it is to prototype something simple, test it out, and then optimize if it is worth it.

2 Likes

Especially as a distinct type. As terrible the performance of indexing-resizing arrays is, the overhaul is semantically more palatable in MATLAB because of the different variables model. For example:

A = [1 2; 3 4]
B = A
A(3, 3) = 10    % A is 3x3
B               % B is still 2x2

Whereas the naive “equivalent” Julia using your ExpandingArray would keep A and B assigned to the same instance. Since copying is needed to keep A and B separate anyway, having separate interconverting types can better tune performance, e.g. B::Matrix if it’s never resized.

1 Like

Yes, agree. But it also pre-allocates for more than one additional element, typically doubles the space. Subsequent push!es then don’t allocate and copy, they just expand the array into preallocated space. And Matlab does automatically do similarly. The here suggested ExpandingArray might have poor performance compared to Matlab’s “building a vector by indexing beyond the current length” for the common use pattern that a vector is filled by appending elements one-by-one. This use case should in Julia become push!.

Yeah, sure. As I said, it is a proof of concept coded up in five minutes, not an optimized implementation.

Not sure about overallocating though. Overallocating \varphi > 1 in every N dimension takes up \varphi^N extra space. For N = 1 it is fine, but for eg N = 3 it can be sizeable. (#40453 is the latest careful optimization for the one-dimensional case balancing memory and reallocation, I imagine that the multidimensional equivalent for random access is much more tricky.)

I don’t want to steer the discussion away from Julia but you actually never want to use this in Matlab (or, let’s say very, very rarely). As far as I understand this is not the best practice in Julia either.

For large matrices sparse array would be a good idea probably.

1 Like

I wanted to take a stab at an alternative solution using a dictionary:

struct DictArray{T<:Number,N} <: AbstractArray{T,N}
	data::Dict{NTuple{N, Int}, T}
	DictArray{T,N}() where {T,N} = new{T,N}(Dict{NTuple{N, Int}, T}())
end

function Base.size(d::DictArray{T,N}, dim) where {T,N}
	1 ≤ dim ≤ N || return 1
	return maximum(x -> x[dim], keys(d.data); init=0)
end

Base.size(d::DictArray{T,N}) where {T,N} = Tuple(size(d, i) for i in 1:N)

function Base.setindex!(d::DictArray{T,N}, v, inds::Vararg{Int,N}) where {T,N}
	all(>(0), inds) || throw(BoundsError(d, inds))
	d.data[inds] = convert(T, v)
end

function Base.getindex(d::DictArray{T,N}, inds::Vararg{Int,N}) where {T,N}
	all(>(0), inds) || throw(BoundsError(d, inds))
	return haskey(d.data, inds) ? d.data[inds] : zero(T)
end

and then you can use it like this:

D = DictArray{Real, 2}()

D[1,1] = 7 # size is now 1x1
D[3,8] = -1 # size is now 3x8

M = Matrix(D) # returns a Matrix{Real}

using SparseArrays
S = sparse(D) # returns a SparseMatrixCSC{Real, Int64}
3 Likes