Replacing missing values in a DataFrame: convert(::Type{Union{}}, ::Float64) is ambiguous

Hi, I am a new Julia user and I am having difficulty imputing missing values in my DataFrame. The replacement value depends on the position of the column and on the other values in that row. Coming from a Java background, I used loops, i.e.:

for row_counter in 1:size(df,1)
    for column_counter in 1:size(df,2)
        if ismissing(df[row_counter,column_counter])
            # get the next non-missing value for that row
            next_val = get_next_non_missing_val(df[row_counter,:],column_counter)
            if column_counter==1 # first column (Julia indices start at 1, not 0)
                if next_val==-1 # custom logic
                    next_val = 100
                end
                df[row_counter,column_counter]=next_val
            elseif column_counter==size(df,2) # last column: copy the previous value
                df[row_counter,column_counter]=df[row_counter,column_counter-1]
            else
                if next_val==-1
                    next_val=df[row_counter,column_counter-1]
                end
                # middle column: average of the previous value and the next non-missing one
                df[row_counter,column_counter] = (df[row_counter,column_counter-1]+next_val)/2
                println(df[row_counter,column_counter])
            end
        end
    end
end

However, I keep getting the following error:

ERROR: LoadError: MethodError: convert(::Type{Union{}}, ::Float64) is ambiguous. Candidates:
  convert(::Type{Union{}}, x) in Base at essentials.jl:169
  convert(::Type{T}, x::Number) where T<:Number in Base at number.jl:7
  convert(::Type{T}, arg) where T<:VecElement in Base at baseext.jl:8
  convert(::Type{T}, x::Number) where T<:AbstractChar in Base at char.jl:179
Possible fix, define
  convert(::Type{Union{}}, ::Number)

I believe this is due to the fact that, because my DataFrame has missing values, the column type is considered a Union, but I don’t know how to resolve this issue. Can someone please help me, and also advise whether the approach I adopted to impute the missing values is efficient?

You cut off the useful part of the stack trace, and we don’t know where in your code this error happened, so it’s a bit hard to come up with ideas. At first glance it doesn’t look like it comes from the code you displayed; maybe it’s in a function you’re calling?

Thanks for the reply. Sorry for the incomplete stacktrace.
The error is thrown in the line:
df[row_counter,column_counter] = (df[row_counter,column_counter-1]+next_val)/2

This is the rest of the stacktrace:

Stacktrace:
 [1] convert(::Type{Missing}, ::Float64) at ./missing.jl:69
 [2] setindex!(::Array{Missing,1}, ::Float64, ::Int64) at ./array.jl:847
 [3] insert_single_entry!(::DataFrame, ::Float64, ::Int64, ::Int64) at /home/bumblebee/.julia/packages/DataFrames/yqToF/src/dataframe/dataframe.jl:520
 [4] setindex!(::DataFrame, ::Float64, ::Int64, ::Int64) at /home/bumblebee/.julia/packages/DataFrames/yqToF/src/dataframe/dataframe.jl:560
 [5] handle_missing_values(::DataFrame) at parser.jl:103

Aha, so it appears that one of your columns contains only missing values. It is therefore typed Array{Missing,1}, and you can’t put a Float64 into it because a Float64 can’t be converted to Missing.

You can convert the column to eltype Union{Float64,Missing} first and then it will work.
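
To illustrate with plain vectors, here is a minimal sketch. The names col, threw and wide are made up for the example and no DataFrame is involved; the exact error you see on the failed assignment depends on your Julia version.

```julia
# A vector that holds only missing values is inferred as Vector{Missing}
# (printed as Array{Missing,1} on older Julia versions).
col = [missing, missing, missing]

# Writing a Float64 into it throws, because the only value a
# Vector{Missing} can store is `missing`.
threw = try
    col[1] = 2.5
    false
catch
    true
end

# Widen the element type first; after that the assignment is allowed.
wide = convert(Vector{Union{Float64, Missing}}, col)
wide[1] = 2.5
```

After this, wide still contains missing in positions 2 and 3, but it can also hold Float64 values.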

Thanks!
I have added the following before the for loop:

for name in names(df)
    if eltype(df[!,Symbol(name)])==Missing
        df[!,Symbol(name)]=convert(Vector{Union{Float64,Missing}}, df[!,Symbol(name)])
    end
end

It works, though I’m not sure if this is the best approach. Any suggestions?

  1. You don’t have to wrap the name in Symbol. DataFrames now accepts strings for column indexing.
  2. You can write Vector{Union{Float64, Missing}}(x) instead of calling convert.
  3. If you are reading the data from a CSV file, you can specify the eltypes of certain columns on import, which is probably the most elegant solution.
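
Putting points 1 and 2 together, the loop could look something like the sketch below. The example DataFrame and its column names a and b are made up, and it assumes a DataFrames version recent enough that names(df) returns strings.

```julia
using DataFrames

# Hypothetical example data: column b contains only missing values,
# so it is stored as a Vector{Missing}.
df = DataFrame(a = [1.0, missing, 3.0], b = [missing, missing, missing])

for name in names(df)
    # Index with the string directly, no Symbol(name) needed.
    if eltype(df[!, name]) == Missing
        # Constructor form instead of convert(...).
        df[!, name] = Vector{Union{Float64, Missing}}(df[!, name])
    end
end

# Column b can now hold a Float64.
df[1, "b"] = 2.5
```

Only the all-missing columns are touched; a column like a, which already has a Union{Float64, Missing} eltype, is left alone.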

Thanks all for the help!