Error in JuMP / dataframe: How to get rid of missing and replace with blanks in DataFrames?

I am running into an error, and the error message suggests that the source is JuMP. However, I think the real cause is the missing values in the dataframe. I would like to get rid of missing so that the dataframe shows those entries as blanks. All the useful functions I can find would replace whole columns, whereas I would like to iterate through the whole dataframe and replace every “missing”.

The code below works fine when I create the dataframes by setting out the data within the code. The data has one blank entry, which I define with "".

using JuMP, Clp, DataFrames
function get_data()
    Relationship = DataFrame(ID = ["A1","A2","A4"], LINK_1 = ["A5","A6","A10"],
    LINK_2 = ["A3","A7",""], LINK_3 = ["A9","A11","A12"])
    ID_t =  collect(1:15)
    ID = Vector{String}(undef,15)
    for i in 1:15
        ID[i]="A".*string(ID_t[i])
    end

    PRICE = rand(10.0:100.0,15)
    Master = DataFrame(ID = ID,PRICE = PRICE)
    return Relationship, Master
end

Relationship, Master = get_data()
price = Master.PRICE
model = Model(Clp.Optimizer)
@variable(model, 0 <= x[Master.ID] <= 1)
@variable(model, t[1:length(Relationship.ID)])

for (i, row) in enumerate(eachrow(Relationship))
    for key in values(row)
        if !isempty(key)
            @constraint(model, t[i] == x[key])
        end
    end
end

Now, if I write the dataframes to CSV files and read them back into Julia, the blank entry shows up as missing and I run into an error.

using CSV

Relationship, Master = get_data()
CSV.write("Relat_2.csv", Relationship)
CSV.write("Master_2.csv", Master)

# Recent CSV.jl versions require the sink type as the second argument.
Relationship = CSV.read("Relat_2.csv", DataFrame)
Master = CSV.read("Master_2.csv", DataFrame)
price = Master.PRICE
model = Model(Clp.Optimizer)
@variable(model, 0 <= x[Master.ID] <= 1)
@variable(model, t[1:length(Relationship.ID)])

for (i, row) in enumerate(eachrow(Relationship))
    for key in values(row)
        if !isempty(key)   # fails here when key is missing
            @constraint(model, t[i] == x[key])
        end
    end
end
MethodError: no method matching iterate(::Missing)
Closest candidates are:
  iterate(!Matched::ExponentialBackOff) at error.jl:252
  iterate(!Matched::ExponentialBackOff, !Matched::Any) at error.jl:252
  iterate(!Matched::MathOptInterface.Bridges.Objective.Map, !Matched::Any...) at C:\Users\.julia\packages\MathOptInterface\ZJFKw\src\Bridges\Objective\map.jl:30
  ...
in top-level scope at test_example.jl:30
in isempty at base\essentials.jl:737

Initially, I got a JuMP error message on the actual problem, but I may have made some other errors there.

How can I replace missing with blanks (assuming that missing may be in many columns and in any row)? I don’t want to delete whole columns or whole rows.

Seems like just checking if key is missing should be enough?
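A minimal sketch of that check, using only Base Julia. The `rows` vector of named tuples is a hypothetical stand-in for `eachrow(Relationship)` after the CSV round trip, and `used` is an illustrative name that collects the pairs that would each receive a constraint in the original loop:

```julia
# Hypothetical stand-in for the rows of Relationship after reading from CSV:
# the empty LINK_2 cell has come back as `missing`.
rows = [
    (ID = "A1", LINK_1 = "A5",  LINK_2 = "A3",    LINK_3 = "A9"),
    (ID = "A2", LINK_1 = "A6",  LINK_2 = "A7",    LINK_3 = "A11"),
    (ID = "A4", LINK_1 = "A10", LINK_2 = missing, LINK_3 = "A12"),
]

# Collect the (row index, key) pairs that would each get a constraint.
used = Tuple{Int,String}[]
for (i, row) in enumerate(rows)
    for key in values(row)
        # Test for `missing` first, because `isempty(missing)` is the call that throws.
        if !ismissing(key) && !isempty(key)
            push!(used, (i, key))
        end
    end
end
```

In the original loop, the same condition would guard the `@constraint` call in place of `push!`.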

In general, you can use both replace and coalesce to replace missing values with anything you want.
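As a small illustration of both functions on a single column, using only Base Julia. The vector `col` is a made-up stand-in for a column such as `Relationship.LINK_2` after reading from CSV:

```julia
# Stand-in column whose element type is Union{Missing, String}.
col = ["A3", "A7", missing]

# `coalesce` returns its first non-missing argument; broadcast it over the column.
with_coalesce = coalesce.(col, "")

# `replace` returns a copy in which every `missing` is swapped for "".
with_replace = replace(col, missing => "")
```

Both results hold "A3", "A7" and "". Assuming DataFrames.jl column indexing, the same call could be applied per column, for example `Relationship[!, c] = coalesce.(Relationship[!, c], "")` inside a loop over `names(Relationship)`.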

Thank you. This is what I came up with but your suggestion seems more efficient.

for col in names(Relationship)
    replace!(Relationship[!, col], missing => "")
end

You should use replace rather than replace! since replace! won’t change the column type to exclude missing values.
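The difference can be seen on a plain vector, without DataFrames. This is a sketch with made-up data; the names `a`, `b` and `c` are illustrative only:

```julia
# In-place version: the values change, but the container keeps its element type.
a = Union{Missing,String}["A3", "A7", missing]
replace!(a, missing => "")
eltype_after_inplace = eltype(a)    # still allows missing

# Copying version: the result is a new vector with a narrowed element type.
b = Union{Missing,String}["A3", "A7", missing]
c = replace(b, missing => "")
eltype_after_copy = eltype(c)       # String
```

To get the narrowed type into the dataframe, the result would have to be assigned back to the column, for example `Relationship[!, col] = replace(Relationship[!, col], missing => "")`.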

Thank you. It took me a while to figure out how to use replace. I think what @quinnj suggested may be more efficient.

Now I have run into another issue when I try to multiply a numerical matrix by x:
ERROR: MethodError: no method matching -(::CartesianIndex{1}, ::Int64)
This should reproduce the error:
ones(5,15)*x
Would you know how to multiply a matrix by a tuple/dictionary type of vector?

You would have to be clear about what output you expect to get. Note that x is not defined in the snippet above, so I can’t reproduce the error.

In the original post, the code defines x as a JuMP variable, but the issue has more to do with basic Julia usage, namely multiplying a matrix by a vector that has non-integer indices. For example, to do a sum comprehension, I would use sum(price*x[id] for (id, price) in zip(Master.ID, Master.PRICE)). I don’t know how multiplication with a matrix would work.
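One possible reading of such a product, sketched in Base Julia only, extends the sum comprehension to one sum per matrix row. Here `x` is a plain `Dict` standing in for the string-indexed container (an assumption, it is not a JuMP variable), and the pairing of column j of `A` with `ids[j]` is likewise assumed. Whether this is the best approach with JuMP variables is not settled in this thread.

```julia
ids = ["A" * string(i) for i in 1:15]    # same IDs as Master.ID
x = Dict(id => 1.0 for id in ids)        # stand-in for the string-indexed x
A = ones(5, 15)                          # column j of A is paired with ids[j]

# Row i of the product is the sum over j of A[i, j] * x[ids[j]].
Ax = [sum(A[i, j] * x[id] for (j, id) in enumerate(ids)) for i in 1:size(A, 1)]
```

With these values every entry of `Ax` is 15.0, since each row sums fifteen ones.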

The full code that reproduces the error is given below

using JuMP, Clp, DataFrames
function get_data()
    Relationship = DataFrame(ID = ["A1","A2","A4"], LINK_1 = ["A5","A6","A10"],
    LINK_2 = ["A3","A7",""], LINK_3 = ["A9","A11","A12"])
    ID_t =  collect(1:15)
    ID = Vector{String}(undef,15)
    for i in 1:15
        ID[i]="A".*string(ID_t[i])
    end

    PRICE = rand(10.0:100.0,15)
    Master = DataFrame(ID = ID,PRICE = PRICE)
    return Relationship, Master
end


Relationship, Master = get_data()
price = Master.PRICE
model = Model(Clp.Optimizer)
@variable(model, 0 <= x[Master.ID] <= 1)
@variable(model, t[1:length(Relationship.ID)])

for (i, row) in enumerate(eachrow(Relationship))
    for key in values(row)
        if !isempty(key)
            @constraint(model, t[i] == x[key])
        end
    end
end
ones(5,15)*x   # This gives the error
julia> x
1-dimensional DenseAxisArray{VariableRef,1,...} with index sets:
    Dimension 1, ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10", "A11", "A12", "A13", "A14", "A15"]
And data, a 15-element Array{VariableRef,1}:

I will post this as a separate question, as I suspect it arises from JuMP.

Yeah, sorry, I have no idea how JuMP works, so I can’t be of much help.