Unexpected behavior when using Impute.locf within @by

As the following code shows, I want to forward fill missing values using the Impute.locf function, but only within the same :id.

using DataFramesMeta, Impute

df = DataFrame(id = repeat(1:3, 2), value = [1,missing,3,4,missing,missing])

df = @chain df begin
    @by(:id, :value = Impute.locf(:value), $(:))
end

# The following code raises the same error, so this doesn't seem to be a DataFramesMeta problem
# combine(groupby(df, :id), :value => (x -> Impute.locf(x)))

Unexpectedly, it raises

ERROR: AssertionError: !(all(ismissing, data))

This is clearly because all values under :id = 2 are missing, but the following code

df = DataFrame(id = repeat(1:3, 2), value = [missing,missing,missing,missing,missing,missing])

df = @chain df begin
    @transform(:value = Impute.locf(:value))
end

completed with no error. It just leaves all values missing, which is the desired result:

 Row │ id     value
     │ Int64  Missing
─────┼────────────────
   1 │     1  missing
   2 │     2  missing
   3 │     3  missing
   4 │     1  missing
   5 │     2  missing
   6 │     3  missing

My questions are:

  1. Is this a bug or intended behavior (for reasons I'm not aware of)?
  2. How do I get the (grouped) results? Of course, the simpler the code, the better.

Thanks in advance!

Group 2 has only missing values so Impute.locf errors as there is no value that can be used for filling the data.


I think the subtlety that OP is stumbling over is this:

julia> locf([missing])
1-element Vector{Missing}:
 missing

julia> locf(Union{Float64, Missing}[missing])
ERROR: AssertionError: !(all(ismissing, data))

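So when a vector is all missing, whether or not the imputation works depends on the element type of the vector: a Vector{Missing} passes through unchanged, while a vector whose element type is Union{Float64, Missing} triggers the assertion.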
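To make the distinction concrete, here is a small sketch in plain Julia (no packages needed). The names `a` and `b` are only for illustration. It shows that the two vectors above hold the same values and differ only in their declared element type, which is the situation a group's sub-vector ends up in when it is taken from a column that also contains non-missing values.

```julia
# Two vectors holding the same single value but with different element types.
a = [missing]                          # literal: element type is inferred as Missing
b = Union{Float64, Missing}[missing]   # explicitly typed: Float64 values are allowed too

println(eltype(a) == Missing)                    # true
println(eltype(b) == Union{Float64, Missing})    # true

# By value, both vectors are "all missing"; only the element type differs.
println(all(ismissing, a) && all(ismissing, b))  # true
```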

You can work around this by narrowing the type of the sub-vectors (although this is probably not great performance-wise):

julia> combine(groupby(df2, :id), :value => (x -> locf(identity.(x))))
6×2 DataFrame
 Row │ id     value_function
     │ Int64  Int64?
─────┼───────────────────────
   1 │     1               1
   2 │     1               4
   3 │     2         missing
   4 │     2         missing
   5 │     3               3
   6 │     3               3
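If relying on type narrowing feels fragile, another option is to guard against the all-missing case explicitly. The sketch below is plain Julia and does not use Impute.jl at all: `locf_or_keep` is a hypothetical helper written for this post, not a function from any package. It forward fills a vector and simply leaves leading or all-missing values untouched. The grouping is done by hand on two plain vectors that mirror the data from the question.

```julia
# Hypothetical helper (not part of Impute.jl): forward fill that never errors
# on all-missing input. Leading missings are left as they are.
function locf_or_keep(v::AbstractVector)
    out = collect(v)  # work on a copy so the input is not mutated
    for i in 2:length(out)
        if ismissing(out[i]) && !ismissing(out[i - 1])
            out[i] = out[i - 1]  # carry the last observation forward
        end
    end
    return out
end

# Same data as in the question, as plain vectors.
ids  = [1, 2, 3, 1, 2, 3]
vals = Union{Int, Missing}[1, missing, 3, 4, missing, missing]

# Apply the fill separately within each id.
filled = copy(vals)
for g in unique(ids)
    idx = findall(==(g), ids)
    filled[idx] = locf_or_keep(vals[idx])
end

println(isequal(filled, [1, missing, 3, 4, missing, 3]))  # true
```

With DataFrames, the same helper could presumably be passed straight to combine, e.g. `combine(groupby(df, :id), :value => locf_or_keep => :value)`, though I have only sketched the plain-vector version here.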