How do I drop only the rows in which every value is missing?
df = DataFrame(a = [missing, 2, 3, missing], b = [missing, 5, 9, missing], c = [missing, 2, 3, 4])
In this example, only row 1 should be removed.
I want to keep the other rows, even if they contain some missing values here and there.
The simplest way is:
julia> filter(row -> !all(ismissing, row), df)
3×3 DataFrame
 Row │ a        b        c
     │ Int64?   Int64?   Int64?
─────┼──────────────────────────
   1 │       2        5       2
   2 │       3        9       3
   3 │ missing  missing       4
It is not the fastest option, but it may be good enough for your use case.
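If speed does matter, one possible alternative is to work column by column instead of row by row. The sketch below is only an illustration and has not been benchmarked here; the names `allmissing` and `result` are made up for the example. It builds a Boolean mask per column with `ismissing.`, combines the masks with an elementwise AND, and keeps the rows where the combined mask is false.

```julia
using DataFrames

df = DataFrame(a = [missing, 2, 3, missing],
               b = [missing, 5, 9, missing],
               c = [missing, 2, 3, 4])

# One Boolean mask per column: true where that column's entry is missing.
masks = [ismissing.(col) for col in eachcol(df)]

# Elementwise AND across columns: true only for rows missing in every column.
allmissing = reduce((x, y) -> x .& y, masks)

# Keep the rows that are not entirely missing.
result = df[.!allmissing, :]
```

Whether this is actually faster than the `filter` version depends on the table size and column types, so it is worth timing both on your own data.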
With DataFrames.jl and DataFramesMeta.jl, respectively:
julia> subset(df, AsTable(:) => ByRow(t -> !all(ismissing, t)))
3×3 DataFrame
 Row │ a        b        c
     │ Int64?   Int64?   Int64?
─────┼──────────────────────────
   1 │       2        5       2
   2 │       3        9       3
   3 │ missing  missing       4
julia> @rsubset df !all(ismissing, AsTable(:))
3×3 DataFrame
 Row │ a        b        c
     │ Int64?   Int64?   Int64?
─────┼──────────────────────────
   1 │       2        5       2
   2 │       3        9       3
   3 │ missing  missing       4