Possible bug in dropmissing!

I am dropping rows containing missing values from a dataframe.

Of 122607 rows, 29 contain at least one missing value. I can confirm this two ways. First, I wrote a function that counts rows containing one or more missing values. Second, using the built-in DataFrames function, I can do:

dfnew = dropmissing(df)

dfnew has 29 fewer rows than df.
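
For anyone who wants to reproduce this check without a hand-written counter, here is a minimal sketch on a small made-up table (the column names and values are illustrative, not the original data). It assumes completecases from DataFrames, which returns one Bool per row, and compares that count with the row difference from dropmissing.

```julia
using DataFrames

# Toy data standing in for the real 122607-row table (values are made up).
df = DataFrame(a = [1, missing, 3, 4], b = ["x", "y", missing, "z"])

# completecases(df) is true for rows with no missing values,
# so negating it counts the rows that contain at least one missing.
n_incomplete = count(!, completecases(df))

# Non-mutating version: returns a new data frame without those rows.
dfnew = dropmissing(df)
n_dropped = nrow(df) - nrow(dfnew)

println((n_incomplete, n_dropped))
```

If the two numbers agree, the data itself is fine and any failure is specific to the in-place call.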

But, dropmissing! reports this error:

ERROR: MethodError: no method matching deleteat!(::PooledArrays.PooledArray{String,UInt32,1,Array{UInt32,1}}, ::Array{Int64,1})

A mystery to me. I guess I won’t use dropmissing!.

Just confirming that:

df = dropmissing(df)

does work.

I suspect this is intended. I’ve seen a similar error with filter! recently when I load DataFrames without the new copycols=true keyword argument. IIUC, a bunch of mutating functions now throw errors to prevent unintended manipulation of the underlying vectors. You used to see things like this, which thankfully doesn’t happen anymore. This comes at the expense of needing to explicitly copy columns, which I think is the right design.
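
To illustrate what "manipulation of the underlying vectors" means here, a small sketch of how copycols affects aliasing. The vector x and column name a are made up, and this only shows the general behaviour; it is not necessarily the cause of the error reported above.

```julia
using DataFrames

x = [1, 2, 3]

# Default (copycols=true): the column is an independent copy of x.
df_copy = DataFrame(a = x)

# copycols=false: the column is the very same vector object as x.
df_alias = DataFrame(a = x, copycols = false)

# Mutating the source vector is visible only through the aliased column.
x[1] = 100

println(df_copy.a[1])
println(df_alias.a[1])
```

With the aliased frame, any in-place operation on the data frame can also change x behind your back, which is the kind of surprise that copying columns avoids.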

One thing that might be useful is a more helpful error message in cases like this. It took me a while to figure out what was happening. Not sure how to catch situations like this though.

This is not intended and should work. There is a PR https://github.com/JuliaComputing/PooledArrays.jl/pull/23 that fixes this.

Thanks.

I switched to Julia 1.1.1 and no error. It gets easier/faster to upgrade with each release, but I still dread it.

@macroexpand still doesn’t put the ScikitLearn.jl modules into the right namespace, but that is an unrelated problem.

Update the PooledArrays.jl package to the latest version and dropmissing! should work.
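
A minimal sketch of doing that from the REPL with the standard Pkg API, assuming PooledArrays is already in your active environment (it is usually pulled in as a dependency, so it may need to be added explicitly first):

```julia
using Pkg

# Update only PooledArrays rather than every package in the environment.
Pkg.update("PooledArrays")

# List the installed packages and versions to confirm what you ended up with.
Pkg.status()
```

Restart Julia afterwards so the new version is actually loaded.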

Just downloaded/installed. Will test on Julia 1.1.1.

Thanks again.