How to remove rows containing missing values from a DataFrame?

Hi, given this DataFrame:

A=DataFrame(a=[1:2;missing;4], b=1:4)
4×2 DataFrames.DataFrame
│ Row │ a       │ b │
├─────┼─────────┼───┤
│ 1   │ 1       │ 1 │
│ 2   │ 2       │ 2 │
│ 3   │ missing │ 3 │
│ 4   │ 4       │ 4 │

eltypes(A)
2-element Array{Type,1}:
 Union{Int64, Missings.Missing}
 Int64

How can I remove the rows with missing values so as to get:

3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 2 │ 2 │
│ 3   │ 4 │ 4 │

eltypes(ans)
2-element Array{Type,1}:
 Int64
 Int64

Thanks for your help.

Use completecases to get a Boolean mask of the rows that contain no missing values, then index into A with it:

A[completecases(A), :]

Oh, and indexing keeps the original column types, so if you want the columns to no longer allow missing values, you will need to call disallowmissing! on the result as well. So:

julia> A = DataFrame(a = [1:2;missing;4], b = 1:4)
4×2 DataFrames.DataFrame
│ Row │ a       │ b │
├─────┼─────────┼───┤
│ 1   │ 1       │ 1 │
│ 2   │ 2       │ 2 │
│ 3   │ missing │ 3 │
│ 4   │ 4       │ 4 │

julia> eltypes(A)
2-element Array{Type,1}:
 Union{Int64, Missings.Missing}
 Int64                         

julia> B = A[completecases(A), :]
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 2 │ 2 │
│ 3   │ 4 │ 4 │

julia> eltypes(B)
2-element Array{Type,1}:
 Union{Int64, Missings.Missing}
 Int64                         

julia> disallowmissing!(B)
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 2 │ 2 │
│ 3   │ 4 │ 4 │

julia> eltypes(B)
2-element Array{Type,1}:
 Int64
 Int64

Thank you!

dropmissing and dropmissing! are slightly more convenient ways of doing this.
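
For reference, here is a minimal sketch of how the two might be used on the data from the question. It assumes a reasonably recent DataFrames.jl; whether the column types are narrowed automatically depends on the version, so it is worth checking the element type afterwards.

```julia
using DataFrames

# Same data as in the question: column a holds one missing value.
A = DataFrame(a = [1, 2, missing, 4], b = [1, 2, 3, 4])

# dropmissing returns a new DataFrame without the incomplete rows;
# A itself is left unchanged.
B = dropmissing(A)

# Inspect the element type of column a in the result. Depending on the
# DataFrames.jl version, you may still need disallowmissing!(B) here.
eltype(B.a)

# dropmissing! removes the incomplete rows from A in place.
dropmissing!(A)
```

The non-mutating form is the safer default when the original table is still needed elsewhere.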

What’s the nicest way to filter missings out based on a single column? E.g. if the original example were

│ Row │ a       │ b       │
├─────┼─────────┼─────────┤
│ 1   │ 1       │ missing │
│ 2   │ 2       │ 2       │
│ 3   │ missing │ 3       │
│ 4   │ 4       │ 4       │

And I wanted to filter out the missings only in column a so that I got

│ Row │ a │ b       │
├─────┼───┼─────────┤
│ 1   │ 1 │ missing │
│ 2   │ 2 │ 2       │
│ 3   │ 4 │ 4       │

Found the answer in the docs. In this case, it would be:

dropmissing(df, :a)
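
A short sketch of that call on the two-column example above, in case it helps. The name df and the result name C are just placeholders for illustration.

```julia
using DataFrames

# The modified example: both columns contain a missing value.
df = DataFrame(a = [1, 2, missing, 4], b = [missing, 2, 3, 4])

# Only column a is checked, so the row where a is missing is dropped,
# while the row where only b is missing is kept.
C = dropmissing(df, :a)
```

Column b keeps its missing value in the first row, so only column a ends up free of missings.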
