Efficiently deleting a column or row in a matrix


#1

What is the most efficient way to delete a column or a row in a matrix without reallocating the whole matrix?


#2

You could perhaps keep elements in a vector, relocate them when deleting, resize!, and reshape into a matrix. But I doubt that the speed gain, if any, is worth the complication.
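A minimal sketch of that vector-backed idea, assuming the matrix is kept in column-major order in a plain Vector and that the row count m is tracked separately. The helper names delete_column! and delete_row! are made up for illustration.

```julia
# Remove column j of an m-row matrix stored column-major in `v`.
# A column is one contiguous block of m elements.
function delete_column!(v::Vector, m::Integer, j::Integer)
    deleteat!(v, ((j - 1) * m + 1):(j * m))
    return v
end

# Remove row i: its elements sit m apart in column-major order.
function delete_row!(v::Vector, m::Integer, i::Integer)
    deleteat!(v, i:m:length(v))
    return v
end

v = collect(1.0:12.0)      # a 3×4 matrix in column-major order
delete_column!(v, 3, 2)    # shrinks v in place
A = reshape(v, 3, :)       # 3×3 matrix sharing memory with v
```

On some Julia versions, resizing a vector after it has been reshaped raises an error about shared data, so this sketch does all deletions first and reshapes once at the end.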

If you are resizing repeatedly (in a loop) and allocation is a bottleneck, pre-allocate a buffer matrix for the result.
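The buffer approach might look roughly like this. The function name drop_row! is hypothetical, and the sketch assumes the buffer is at least as large as the input; it returns a view into the buffer rather than a new matrix.

```julia
# Copy every row of `a` except row i into a pre-allocated buffer.
# Returns a view of the filled part of `buf`; no matrix-sized allocation occurs.
function drop_row!(buf::Matrix, a::AbstractMatrix, i::Integer)
    m, n = size(a)
    out = view(buf, 1:m-1, 1:n)       # window into the buffer
    for c in 1:n
        for r in 1:i-1
            out[r, c] = a[r, c]       # rows above i keep their position
        end
        for r in i+1:m
            out[r-1, c] = a[r, c]     # rows below i shift up by one
        end
    end
    return out
end

a = reshape(collect(1.0:12.0), 3, 4)
buf = similar(a)                      # allocated once, reused on every call
b = drop_row!(buf, a, 2)
```

The same buffer can be reused across loop iterations, as long as the previous result is no longer needed when the next call overwrites it.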


#3

I see, so there is no straightforward way to do it.

The following comes close, but the array of indices is still allocated, and there seems to be no way to index with tuple generators instead, or is there?

a = @view a[[1:i-1; i+1:end], :]

Also, the result here is a SubArray, not an Array, which may cause type assertion problems in my use case. And it seems that converting it to an Array incurs a reallocation cost.
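To make the trade-off concrete, here is a small sketch on a made-up 5×4 matrix. It shows that the view is not an Array but is still an AbstractMatrix, that it shares memory with the parent, and that the copy only happens when a dense Array is requested.

```julia
a = reshape(collect(1.0:20.0), 5, 4)
i = 2
# Only the short index vector is allocated here, not a new matrix.
b = view(a, [1:i-1; i+1:size(a, 1)], :)   # rows 1, 3, 4, 5 of a

b isa Array            # false: it is a SubArray
b isa AbstractMatrix   # true
b[1, 1] = -1.0         # writes through to a[1, 1]
c = Array(b)           # dense copy; this is where the matrix-sized allocation happens
```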


#4

Maybe you should program against AbstractArray and not Array.


#5

Right, I thought about it, but my array shows up in:

mutable struct T1   # written as `type T1` before Julia 0.6
    a::Dict{String, Array}
end

and changing to Dict{String, AbstractArray} will cause dispatch problems when calling new inside a constructor, because type parameters in Julia are invariant: a Dict{String, Array}() is not a Dict{String, AbstractArray}.

If I use a type parameter T as follows, I commit to one concrete type when the object is constructed, and changing it later may raise an error or trigger a call to convert, which brings us back to allocations.

mutable struct T1{T<:AbstractArray}
    a::Dict{String, T}
end
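The invariance issue can be checked directly. The sketch below uses a hypothetical type named Holder and assumes the constructor is free to build the dictionary with the abstract value type itself, in which case Arrays and SubArrays can sit side by side without conversion.

```julia
# Type parameters are invariant, so this check is false:
Dict{String, Array}() isa Dict{String, AbstractArray}

# Hypothetical container that constructs the abstractly-typed Dict itself.
struct Holder
    a::Dict{String, AbstractArray}
    Holder() = new(Dict{String, AbstractArray}())
end

h = Holder()
h.a["full"] = rand(3, 3)                        # a plain Array
h.a["sliced"] = view(h.a["full"], [1, 3], :)    # a SubArray, stored without copying
```

The cost of this design is that values read back from the Dict have an abstract type, so code consuming them may need a function barrier to stay fast.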

#6

Create it as a properly-typed SubArray at the outset, just don’t leave out any rows/columns?
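One possible reading of this suggestion, sketched below: build the initial view with explicit index vectors that cover every row and column, so that a later view with a row removed has exactly the same concrete type. The helper without_row is hypothetical, and the sketch assumes a Julia version that provides parentindices.

```julia
a = rand(4, 3)

# A view of the whole matrix, indexed by vectors rather than by `:`.
full = view(a, collect(1:size(a, 1)), collect(1:size(a, 2)))

# Build a new view over the same parent with row i of the view left out.
function without_row(v::SubArray, i::Integer)
    rows, cols = parentindices(v)
    return view(parent(v), deleteat!(copy(rows), i), cols)
end

smaller = without_row(full, 2)
typeof(smaller) == typeof(full)   # both are indexed by a pair of index vectors
```

Indexing through vectors is generally slower than indexing through ranges or colons, so this trades some access speed for a stable type.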


#7

Right, that will work, but only until it doesn’t! Say I need to get the array at some point to pass it to a PyCall function, for example. Even in the following simple case, the allocations grow beyond reason.

julia> a = rand(5000,5000);

julia> b = @view a[:,:];

julia> @time Array(b);
  0.428287 seconds (6 allocations: 190.735 MB, 52.11% gc time)

#8

Throwing PyCall into the mix is rather a large expansion of your request here; if you stick with pure Julia code we have a good strategy in place, but making it always possible to call Python (which doesn’t support most Julia types) without allocating memory is beyond the scope of stuff that you should expect to Just Work.

Perhaps try the strategy advocated in the first reply you got.


#9

Right, I understand that I am asking for too much, kind of a side effect of being spoiled by Julia! But perhaps another way is to replace Dict{String, Array} with Vector{Tuple{String, AbstractArray}} and change the rest of the code accordingly.

a = rand(5,5);
b = @view a[:,:];
c = Tuple{String, AbstractArray}[("v1", a), ("v2", b)];

That way, if a matrix was never sliced, it won’t have to be reallocated when passing it as an array.
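A sketch of how that could work in practice, using a hypothetical helper as_array that relies on dispatch: a plain Array is returned as-is, and only other AbstractArray types pay for a copy.

```julia
# Hypothetical helper: return a dense Array, copying only when necessary.
as_array(x::Array) = x                   # already dense, no copy
as_array(x::AbstractArray) = Array(x)    # views and other types are materialized

a = rand(5, 5)
b = @view a[:, :]
c = Tuple{String, AbstractArray}[("v1", a), ("v2", b)]

# Only the second entry allocates a new matrix here.
dense = [as_array(x) for (_, x) in c]
```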