How do I drop only rows that are fully filled with missing values?

using DataFrames
df = DataFrame(a = [missing, 2, 3, missing], b = [missing, 5, 9, missing], c = [missing, 2, 3, 4])

I want to remove only row 1, where every value is missing, and keep the other rows even if they contain some missing values.

The simplest approach is:

julia> filter(row -> !all(ismissing, row), df)
3×3 DataFrame
 Row │ a        b        c
     │ Int64?   Int64?   Int64?
─────┼──────────────────────────
   1 │       2        5       2
   2 │       3        9       3
   3 │ missing  missing       4

It is not the fastest option, but it may be good enough for your use case.
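If speed does matter, one possible alternative is to build a Boolean mask column by column instead of iterating over rows. This is only a sketch and I have not benchmarked it on your data; the variable names `keep` and `result` are my own, and it assumes the same `df` as in the question.

```julia
using DataFrames

df = DataFrame(a = [missing, 2, 3, missing],
               b = [missing, 5, 9, missing],
               c = [missing, 2, 3, 4])

# Start with "drop every row", then mark a row as kept as soon as
# any column holds a non-missing value in that row.
keep = falses(nrow(df))
for col in eachcol(df)
    keep .|= .!ismissing.(col)
end

# Index with the mask; this returns a new DataFrame and leaves df unchanged.
result = df[keep, :]
```

Working on whole columns avoids the per-row overhead of `filter`, which is why it may help on large tables.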


With DataFrames.jl and DataFramesMeta.jl, respectively:

julia> subset(df, AsTable(:) => ByRow(t -> !all(ismissing, t)))
3×3 DataFrame
 Row │ a        b        c      
     │ Int64?   Int64?   Int64? 
─────┼──────────────────────────
   1 │       2        5       2
   2 │       3        9       3
   3 │ missing  missing       4

julia> @rsubset df !all(ismissing, AsTable(:))
3×3 DataFrame
 Row │ a        b        c      
     │ Int64?   Int64?   Int64? 
─────┼──────────────────────────
   1 │       2        5       2
   2 │       3        9       3
   3 │ missing  missing       4

Thanks, this one works.