I’m probably doing something dumb here, but I have a DataFrame and I want to get all rows for which a particular column (which is called :Level and is type Union{Int, Missing}) is not missing. My first two tries (using DataFramesMeta.jl) haven’t worked:
julia> standards = @subset(df, filter(x->true,skipmissing(:Level)))
ERROR: ArgumentError: length 10 of vector returned from function #478 is different from number of rows 20 of the source data frame.
# Long stacktrace that doesn't seem relevant here
julia> standards = @subset(df, !.(ismissing.(:Level)))
ERROR: syntax: invalid identifier name "."
Stacktrace:
[1] top-level scope
@ none:1
What am I missing (pun intended)?
You can use @rsubset for row-wise operations:
julia> using DataFramesMeta
julia> df = DataFrame(Level = [1, 2, missing, 4]);
julia> @rsubset df !ismissing(:Level)
3×1 DataFrame
 Row │ Level
     │ Int64?
─────┼────────
   1 │      1
   2 │      2
   3 │      4
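The first one errors because the value returned from a @subset operation has to be a Boolean vector of the same length as the original data frame. skipmissing drops the missing entries, so the filter call returns only the 10 non-missing values instead of one value for each of the 20 rows.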
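To make the length mismatch concrete, here is a minimal sketch using a plain vector in place of the column. The 4-element vector `level` is a made-up stand-in for df.Level, not the asker's data.

```julia
# Hypothetical stand-in for the :Level column (not the original data)
level = [1, 2, missing, 4]

# skipmissing removes entries, so the result is shorter than the input
kept = collect(skipmissing(level))
println(length(kept), " values kept out of ", length(level))

# A row mask keeps exactly one Bool per element, which is what @subset expects
mask = .!ismissing.(level)
println(length(mask) == length(level))
```

The mask has the same length as the column, so it can say "keep" or "drop" for every row, while the skipmissing result has already lost that one-to-one correspondence.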
In the second one, the problem is the broadcasting syntax: !.(x) is not a valid way to broadcast a unary operator. Write (!).(x) or .!x instead, for example:
julia> @subset df (!).(ismissing.(:Level))
3×1 DataFrame
 Row │ Level
     │ Int64?
─────┼────────
   1 │      1
   2 │      2
   3 │      4
Another option is plain indexing:
subdf = df[.!ismissing.(df.Level), :]
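For reference, DataFrames.jl itself also offers a few ways to do this without the DataFramesMeta macros. The following is only a sketch, assuming the same 4-row df as in the @rsubset example above; the variable names a, b and c are just for illustration.

```julia
using DataFrames

# Same small example data frame as in the @rsubset answer
df = DataFrame(Level = [1, 2, missing, 4])

# filter with a column => predicate pair; !ismissing is a negated function
a = filter(:Level => !ismissing, df)

# subset with ByRow, the function-based counterpart of @rsubset
b = subset(df, :Level => ByRow(!ismissing))

# dropmissing removes rows with missing in the given column and,
# by default, also narrows the column type from Int64? to Int64
c = dropmissing(df, :Level)
```

All three keep the same three rows. dropmissing differs in that the resulting column no longer allows missing, which may or may not be what you want downstream.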
All of these work. Thanks to both! I think I’ll mark pdeffebach’s answer because it’s first. Thanks again!