Hi, given this DataFrame:
A=DataFrame(a=[1:2;missing;4], b=1:4)
4×2 DataFrames.DataFrame
│ Row │ a       │ b │
├─────┼─────────┼───┤
│ 1   │ 1       │ 1 │
│ 2   │ 2       │ 2 │
│ 3   │ missing │ 3 │
│ 4   │ 4       │ 4 │
eltypes(A)
2-element Array{Type,1}:
Union{Int64, Missings.Missing}
Int64
How can I remove the rows with missing values, so as to get:
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 3 │ 4 │ 4 │
eltypes(ans)
2-element Array{Type,1}:
Int64
Int64
Thanks for the help.
swt30
Use completecases to find the rows that contain no missing values, then index into A:
A[completecases(A), :]
swt30
Oh and then if you want the column types to not allow for missing values, you will need to call disallowmissing! on the result as well. So:
julia> A = DataFrame(a = [1:2;missing;4], b = 1:4)
4×2 DataFrames.DataFrame
│ Row │ a       │ b │
├─────┼─────────┼───┤
│ 1   │ 1       │ 1 │
│ 2   │ 2       │ 2 │
│ 3   │ missing │ 3 │
│ 4   │ 4       │ 4 │
julia> eltypes(A)
2-element Array{Type,1}:
Union{Int64, Missings.Missing}
Int64
julia> B = A[completecases(A), :]
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 3 │ 4 │ 4 │
julia> eltypes(B)
2-element Array{Type,1}:
Union{Int64, Missings.Missing}
Int64
julia> disallowmissing!(B)
3×2 DataFrames.DataFrame
│ Row │ a │ b │
├─────┼───┼───┤
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 3 │ 4 │ 4 │
julia> eltypes(B)
2-element Array{Type,1}:
Int64
Int64
dropmissing and dropmissing! are slightly more convenient ways of doing this.
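For anyone landing here later, here is a minimal sketch of what that looks like on the example from the first post. It assumes a recent DataFrames.jl (1.x), where dropmissing takes a disallowmissing keyword that defaults to true; older versions may still need the separate disallowmissing! call shown above. The name B is just for illustration.

```julia
using DataFrames

A = DataFrame(a = [1:2; missing; 4], b = 1:4)

# dropmissing returns a new DataFrame without the rows that contain a missing
B = dropmissing(A)

# Assumption: DataFrames.jl 1.x, where disallowmissing=true is the default,
# so column a is narrowed from Union{Missing, Int64} to Int64
eltype.(eachcol(B))

# dropmissing! removes the same rows from A in place
dropmissing!(A)
```

If you want to keep the Union{Missing, Int64} column type, passing disallowmissing=false should leave the element types untouched.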
What’s the nicest way to filter missings out based on a single column? E.g. if the original example were
│ Row │ a       │ b       │
├─────┼─────────┼─────────┤
│ 1   │ 1       │ missing │
│ 2   │ 2       │ 2       │
│ 3   │ missing │ 3       │
│ 4   │ 4       │ 4       │
And I wanted to filter out the missings only in column a so that I got
│ Row │ a │ b       │
├─────┼───┼─────────┤
│ 1   │ 1 │ missing │
│ 2   │ 2 │ 2       │
│ 3   │ 4 │ 4       │
Found the answer in the docs. In this case, it would be:
dropmissing(df, :a)
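A self-contained version of that, as a rough sketch: the names df and res are only illustrative, and the resulting column types assume DataFrames.jl 1.x defaults.

```julia
using DataFrames

df = DataFrame(a = [1, 2, missing, 4], b = [missing, 2, 3, 4])

# Drop only the rows where column a is missing; the missing in b is kept
res = dropmissing(df, :a)

# Several columns can be checked at once by passing a vector of names
both = dropmissing(df, [:a, :b])
```

Only the columns named in the second argument are checked, so b still allows missing in res while a does not.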