Dropping a column from a matrix

I seem to be hitting a wall deciding how to drop a column from a matrix. My background in R tells me I should use a negative index but that is not idiomatic Julia (or even valid Julia). The best I have been able to come up with that would be the equivalent of R’s

> (m <- matrix(rnorm(15), ncol=3))
           [,1]       [,2]       [,3]
[1,]  0.8478572  1.5250919 -0.6781256
[2,] -0.2832645 -1.4822946 -0.7714823
[3,] -0.5664354 -0.3484121  0.2053551
[4,]  1.0551698 -1.1714792 -0.3676856
[5,]  0.1992520  0.1079589  0.3709326
> m[,-2]
           [,1]       [,2]
[1,]  0.8478572 -0.6781256
[2,] -0.2832645 -0.7714823
[3,] -0.5664354  0.2053551
[4,]  1.0551698 -0.3676856
[5,]  0.1992520  0.3709326

is something like

m[:, deleteat!(collect(1:size(m, 2)), 2)]

which seems kind of clunky.

What am I missing?

My solution so far is

julia> dropcol(M::AbstractMatrix, j) = M[:, deleteat!(collect(axes(M, 2)), j)]
dropcol (generic function with 1 method)

julia> m = randn(5,3)
5×3 Array{Float64,2}:
 -0.0407878   0.784904  -0.419678
 -0.366309    2.51954   -0.25255
 -0.223342    0.97832   -0.159929
 -0.331398    1.01538    0.191279
 -0.231566   -0.833418  -0.00727222

julia> dropcol(m, 2)
5×2 Array{Float64,2}:
 -0.0407878  -0.419678
 -0.366309   -0.25255
 -0.223342   -0.159929
 -0.331398    0.191279
 -0.231566   -0.00727222

Not (from InvertedIndices.jl) is also re-exported by other packages, e.g. DataFrames.
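
For readers who have not met it, Not is the inverted-index type from the InvertedIndices.jl package. A minimal sketch of how it applies to the question, assuming that package is installed (the matrix here is made up for illustration):

```julia
using InvertedIndices   # third-party package, assumed to be installed

m = [1 2 3; 4 5 6]      # small illustrative matrix, not from the thread

no_second = m[:, Not(2)]      # every column except column 2
only_last = m[:, Not(1:2)]    # every column except columns 1 and 2
```

This is the closest analogue to R's m[, -2], since the exclusion is expressed in the index itself rather than by building the list of kept columns.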

I think the best approach is the one R uses: a negative index means the corresponding row or column is excluded.

My workaround is this:

julia> function drop(M::AbstractMatrix;
                     r=nothing,
                     c=nothing)
           s = size(M)
           dr = collect(1:s[1])
           dc = collect(1:s[2])
           isnothing(r) ? nothing : splice!(dr,r)
           isnothing(c) ? nothing : splice!(dc,c)
           M[dr,dc]
       end
julia> x = rand(5,5)
5×5 Array{Float64,2}:
 0.507316  0.765801  0.932161  0.910824  0.495703
 0.33204   0.878031  0.151499  0.924096  0.184833
 0.122875  0.354409  0.168172  0.095147  0.742384
 0.710832  0.32433   0.985624  0.598247  0.262898
 0.651887  0.362271  0.713403  0.700189  0.577536

julia> drop(x,r=1:3)
2×5 Array{Float64,2}:
 0.710832  0.32433   0.985624  0.598247  0.262898
 0.651887  0.362271  0.713403  0.700189  0.577536

julia> drop(x,c=1:3)
5×2 Array{Float64,2}:
 0.910824  0.495703
 0.924096  0.184833
 0.095147  0.742384
 0.598247  0.262898
 0.700189  0.577536

julia> drop(x,r=1:3,c=1:3)
2×2 Array{Float64,2}:
 0.598247  0.262898
 0.700189  0.577536

For learning purposes, I wrote the following naive drop_rc function, which allows the handy syntax c = (1:2, 4, 6:7) or r = (1, 3:4):

function drop_rc(x; r=nothing, c=nothing)
    h = eltype(x)[]
    nr, nc = size(x)
    if !isnothing(r)
        r = collect(Iterators.flatten(r))
        for i in 1:nr
            i ∉ r ? push!(h, x[i,:]...) : nothing
        end
        return reshape(h, (nc, nr - length(r)))'
    elseif !isnothing(c)
        c = collect(Iterators.flatten(c))
        for i in 1:nc 
            i ∉ c ? push!(h, x[:,i]...) : nothing
        end
        return reshape(h, (nr, nc - length(c),))
    else
        return x
    end
end
julia> x = rand(5,7)
5×7 Array{Float64,2}:
 0.0387938  0.741277  0.517733  0.00555292  0.596444  0.631753  0.0438753
 0.119268   0.235725  0.548981  0.304496    0.630517  0.641583  0.645576
 0.229881   0.549495  0.312879  0.550285    0.970344  0.589595  0.39158
 0.0460248  0.105836  0.984308  0.335502    0.649328  0.893807  0.244832
 0.751559   0.807176  0.7976    0.253244    0.77978   0.471476  0.62436

julia> drop_rc(x;c=(1:2,4,6:7))
5×2 Array{Float64,2}:
 0.517733  0.596444
 0.548981  0.630517
 0.312879  0.970344
 0.984308  0.649328
 0.7976    0.77978

julia> drop_rc(x;r=(1,3:4))
2×7 LinearAlgebra.Adjoint{Float64,Array{Float64,2}}:
 0.119268  0.235725  0.548981  0.304496  0.630517  0.641583  0.645576
 0.751559  0.807176  0.7976    0.253244  0.77978   0.471476  0.62436

Maybe the simplest approach is a set operation, setdiff (edit: more compact form using the splatting operator):

function drop_rc2(x; r=nothing, c=nothing)
    nr, nc = size(x)
    if !isnothing(r)
        return x[setdiff(1:nr, r...), :]
    elseif !isnothing(c)
        return x[:, setdiff(1:nc, c...)]
    else
        return x
    end
end
julia> x = rand(5,7)
5×7 Array{Float64,2}:
 0.336801   0.230365  0.285252  0.772628  0.357025  0.887883  0.197147
 0.737566   0.621813  0.621547  0.74819   0.525228  0.901342  0.359175
 0.847806   0.493836  0.690468  0.79979   0.87897   0.523288  0.0364366
 0.393608   0.727196  0.54383   0.370507  0.836713  0.618279  0.0198844
 0.0688112  0.531665  0.766093  0.821209  0.670786  0.933345  0.209199

julia> drop_rc2(x; c=(1:2,4,6:7))
5×2 Array{Float64,2}:
 0.285252  0.357025
 0.621547  0.525228
 0.690468  0.87897
 0.54383   0.836713
 0.766093  0.670786

julia> drop_rc2(x; r=(1,3:4))
2×7 Array{Float64,2}:
 0.737566   0.621813  0.621547  0.74819   0.525228  0.901342  0.359175
 0.0688112  0.531665  0.766093  0.821209  0.670786  0.933345  0.209199
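
Note that drop_rc2 as written returns as soon as r is given, so it cannot drop rows and columns in the same call. A possible variant that handles both is sketched below. The names keep_idx and drop_both are hypothetical, not from any package, and the sketch keeps the same splatting trick, so r and c may be an integer, a range, or a tuple of those.

```julia
# Hypothetical helper: indices in 1:n that are NOT listed in idx.
# idx may be nothing (keep everything), an integer, a range, or a tuple of those.
keep_idx(n, idx) = isnothing(idx) ? (1:n) : setdiff(1:n, idx...)

# Hypothetical variant of drop_rc2 that can drop rows and columns together.
function drop_both(x::AbstractMatrix; r=nothing, c=nothing)
    nr, nc = size(x)
    return x[keep_idx(nr, r), keep_idx(nc, c)]
end

x = reshape(1:20, 4, 5)                      # deterministic 4×5 matrix, columns 1:4, 5:8, ...
result = drop_both(x; r=(1, 3:4), c=2)       # keeps row 2 and columns 1, 3, 4, 5
```

Like the original, this assumes ordinary 1-based matrices; for arrays with custom axes, axes(x, 1) and axes(x, 2) would be the safer starting point.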

Perhaps this is something like what you want?


julia> a = rand(4,3)
4×3 Array{Float64,2}:
 0.478814  0.730819    0.790008
 0.406147  0.467439    0.674457
 0.107321  0.00590769  0.327777
 0.426467  0.619254    0.688245

julia> a[:, (1:end) .!= 2] # drop the second column
4×2 Array{Float64,2}:
 0.478814  0.790008
 0.406147  0.674457
 0.107321  0.327777
 0.426467  0.688245

julia> a[:, (1:end) .∉ ((2,3),)] # drop columns 2 and 3
4×1 Array{Float64,2}:
 0.4788135247086196
 0.4061471501915541
 0.10732125426868899
 0.4264673627479727
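
What makes this work is logical indexing: the broadcast comparison produces a Bool mask with one entry per column, and indexing with that mask keeps the columns marked true. A small sketch that spells the mask out, using a made-up matrix and axes(a, 2) in place of 1:end so the mask can be built outside the brackets:

```julia
a = reshape(1:12, 4, 3)              # illustrative matrix, columns are 1:4, 5:8, 9:12

mask = axes(a, 2) .!= 2              # Bool mask over the column indices: true, false, true
one_dropped = a[:, mask]             # keeps columns 1 and 3

# Ref makes [2, 3] act as a single value in the broadcast,
# playing the same role as the one-element tuple ((2,3),) above.
mask23 = axes(a, 2) .∉ Ref([2, 3])
two_dropped = a[:, mask23]           # keeps column 1 only
```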

@jishnub, very nice. For a more general list of columns, e.g. (1:3, 5, 7:9), do you know of a syntax shorter than:

A = rand(Int8,4,9)
c = (1:3,5,7:9)
A[:, (1:end) .∉ (Tuple(Iterators.flatten(c)),)]

julia> A = rand(Int8,4,9)
4×9 Array{Int8,2}:
 -38  -86     6  127   -95   77    41  -122  106
  37   59  -123  -65   125  -95  -118    20  100
  50  -26    18  106  -111  -82   120   -57   60
  -4   37    24  114  -110  -43   -44   -11  -33
julia> c = (1:3,5,7:9)
julia> A[:, (1:end) .∉ (Tuple(Iterators.flatten(c)),)]
4×2 Array{Int8,2}:
 127   77
 -65  -95
 106  -82
 114  -43

Do you want to mutate the original matrix, or do you just want a selection from it? For a selection, can’t you just do:

julia> m = rand(5,3)
5×3 Array{Float64,2}:
 0.833451   0.624387   0.673538
 0.138481   0.603381   0.00469182
 0.226015   0.811883   0.618896
 0.0161347  0.0640166  0.929956
 0.114532   0.197485   0.601021

julia> m[:,[1,3]]
5×2 Array{Float64,2}:
 0.833451   0.673538
 0.138481   0.00469182
 0.226015   0.618896
 0.0161347  0.929956
 0.114532   0.601021

Or a one-liner:

julia> m = rand(5,3)
5×3 Array{Float64,2}:
 0.6857     0.500239   0.494641
 0.0204583  0.735167   0.113708
 0.227035   0.701075   0.430557
 0.698452   0.0571221  0.949975
 0.664024   0.868415   0.147735

select_not(m,c) = m[:,setdiff(1:size(m,2),c)] 

julia> select_not(m,2)
5×2 Array{Float64,2}:
 0.6857     0.494641
 0.0204583  0.113708
 0.227035   0.430557
 0.698452   0.949975
 0.664024   0.147735
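
On the mutate-versus-select question: indexing with a vector of columns, as in both snippets above, allocates a new matrix and leaves the original untouched. If a non-copying selection is wanted instead, a view is one option. A minimal sketch with a made-up matrix:

```julia
m = [1.0 2.0 3.0; 4.0 5.0 6.0]   # illustrative matrix, not from the thread

sel = m[:, [1, 3]]               # indexing with a vector allocates a new matrix
sel[1, 1] = 100.0                # changes the copy only, m[1, 1] is still 1.0

v = @view m[:, [1, 3]]           # a view shares memory with m
v[1, 2] = -3.0                   # writes through to m[1, 3]
```

Neither form changes the size of m itself; a Matrix cannot have a column removed in place, so "dropping" always means selecting into a copy or a view.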

setdiff seems to be the best solution so far, since it allows cleaner syntax when c is a more general column selection:

A = rand(Int8,4,9)
c = (1:3,5,7:9)
A[:,setdiff(1:size(A,2), c...)] 
julia> A = rand(Int8,4,9)
4×9 Array{Int8,2}:
 -111     8    88  -64   -56   109   -84   112   51
   82   -98   -72  -64  -120   -25  -114    19  109
   58  -106   -52   98  -116    76    14   -30  115
  -24    11  -118  -49   -64  -100    16  -103  -28

julia> A[:,setdiff(1:size(A,2), c...)] 
4×2 Array{Int8,2}:
 -64   109
 -64   -25
  98    76
 -49  -100

This might be more concise:

julia> A = rand(Int8,4,9)
4×9 Array{Int8,2}:
  55   -81  -91  -89  -71  -57  -102  -58   75
  11   113   40  -74   64  125    43  -67    3
 -13   -85  -87  -63  -26   27   -65   54  126
 -43  -117   70  -16   14  -13    32  103   42

julia> A[:, (1:end) .∉ ([1:3; 5; 7:9],)]
4×2 Array{Int8,2}:
 -89  -57
 -74  125
 -63   27
 -16  -13
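
The conciseness comes from the semicolon syntax: [1:3; 5; 7:9] vertically concatenates the ranges and the scalar into one flat vector of column indices, so no explicit flattening is needed. A small sketch with a deterministic matrix (made up for illustration) so the result is easy to check:

```julia
cols = [1:3; 5; 7:9]             # vertical concatenation gives [1, 2, 3, 5, 7, 8, 9]

A = reshape(1:36, 4, 9)          # illustrative 4×9 matrix, column j holds 4j-3:4j
kept = A[:, (1:end) .∉ (cols,)]  # keeps columns 4 and 6
```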