How to remove rows containing missing values from a DataFrame?


#1

Hi, given this DataFrame:

A=DataFrame(a=[1:2;missing;4], b=1:4)
4×2 DataFrames.DataFrame
│ Row │ a       │ b │
├─────┼─────────┼───┤
│ 1   │ 1       │ 1 │
│ 2   │ 2       │ 2 │
│ 3   │ missing │ 3 │
│ 4   │ 4       │ 4 │

eltypes(A)
2-element Array{Type,1}:
 Union{Int64, Missings.Missing}
 Int64

How can I remove the rows containing missing values, so as to get:

3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 2 │ 2 │
│ 3   │ 4 │ 4 │

eltypes(ans)
2-element Array{Type,1}:
 Int64
 Int64

Thanks for the help.


#2

Use completecases to get a Boolean vector marking the rows that contain no missing values, then use it to index into A:

A[completecases(A), :]
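
To make the two steps visible, here is a minimal sketch that keeps the mask in its own variable before indexing. The names mask and B are just illustrative, and I am assuming a DataFrames version where completecases accepts a DataFrame directly, as in the line above:

```julia
using DataFrames

A = DataFrame(a = [1, 2, missing, 4], b = 1:4)

# One Bool per row: true where the row has no missing value in any column
mask = completecases(A)

# Logical row indexing keeps only the rows flagged true (all columns kept)
B = A[mask, :]
```

Row 3 is the only one flagged false here, so B ends up with three rows. The column types are not changed by this indexing.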

#3

Oh, and if you want the column types to no longer allow missing values, you will also need to call disallowmissing! on the result, because indexing keeps the original Union{Int64, Missing} element type. So:

julia> A = DataFrame(a = [1:2;missing;4], b = 1:4)
4×2 DataFrames.DataFrame
│ Row │ a       │ b │
├─────┼─────────┼───┤
│ 1   │ 1       │ 1 │
│ 2   │ 2       │ 2 │
│ 3   │ missing │ 3 │
│ 4   │ 4       │ 4 │

julia> eltypes(A)
2-element Array{Type,1}:
 Union{Int64, Missings.Missing}
 Int64                         

julia> B = A[completecases(A), :]
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 2 │ 2 │
│ 3   │ 4 │ 4 │

julia> eltypes(B)
2-element Array{Type,1}:
 Union{Int64, Missings.Missing}
 Int64                         

julia> disallowmissing!(B)
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 2 │ 2 │
│ 3   │ 4 │ 4 │

julia> eltypes(B)
2-element Array{Type,1}:
 Int64
 Int64

#4

Thank you!


#5

dropmissing and dropmissing! are slightly more convenient ways of doing this.
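
A minimal sketch of how that might look, assuming a reasonably recent DataFrames version. As far as I know, eltypes is no longer available in current releases, so the sketch uses eltype.(eachcol(...)) instead. Whether dropmissing narrows the column types by itself depends on the version, so the explicit disallowmissing! call is kept as a harmless safeguard:

```julia
using DataFrames

A = DataFrame(a = [1, 2, missing, 4], b = 1:4)

# dropmissing returns a new DataFrame without the rows that contain missing
B = dropmissing(A)

# Make sure the column types no longer allow missing
# (does nothing if dropmissing already narrowed them)
disallowmissing!(B)

# Element type of each column of B
types = eltype.(eachcol(B))

# dropmissing! removes the rows from A itself instead of making a copy
dropmissing!(A)
```

Use dropmissing when you want to keep the original DataFrame around, and dropmissing! when modifying A in place is fine.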