Coalesce function not working with data frames n x n with n>=2



y is a 6x6 DataFrame.

I may be wrong, but when I call coalesce on a 6x6 DataFrame I get the error: no method matching iterate.
coalesce.(y, 0) ==> Error
But coalesce.(y[i], 0) works for each column i.
Any help?
Thank you


It might help if you could read "PSA: make it easier to help you" and then post an MWE (minimal working example). It is currently unclear what you are trying to do.

From the REPL:

help?> coalesce
search: coalesce

  coalesce(x, y...)

  Return the first value in the arguments which is not equal to missing, if
  any. Otherwise return missing.


  julia> coalesce(missing, 1)
  1

  julia> coalesce(1, missing)
  1

  julia> coalesce(nothing, 1)  # returns `nothing`

  julia> coalesce(missing, missing)
  missing


Broadcasting on a dataframe doesn’t go through every cell, it goes through each row. None of the rows are missing, so coalesce didn’t do anything. Also, it’s not mutating. Not sure if there’s a better way, but I often do:

for n in names(df)
    # replace the missing values in column n with 0
    df[n] = coalesce.(df[n], 0)
end
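
For readers on a more recent DataFrames.jl, here is a minimal sketch of the same loop. It assumes a version (0.19 or later, as far as I know) where columns are accessed as df[!, n] rather than df[n]; the example data is made up for illustration.

```julia
using DataFrames

# Hypothetical example data: both columns allow missing values
df = DataFrame(a = [1, missing, 3], b = [missing, 5, missing])

# Replace each column with a new vector in which missing became 0.
# `df[!, n]` refers to the stored column itself, without copying.
for n in names(df)
    df[!, n] = coalesce.(df[!, n], 0)
end

println(df)
```

Because each column is replaced by the result of the broadcast, the new columns no longer allow missing in their element type.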


Could you clarify what operation you are trying to perform? Are you looking to replace each missing in a DataFrame with 0?


It’s been suggested several times to add an argument to disallowmissing! to do that. Should be easy to do.


Sorry if I wasn’t clear enough.
I have a DataFrame y of size 6x6 (see picture).

To replace the missing values in this DataFrame by 0, I tried coalesce.(y, 0), but I got an error message. The only way I found to do this is: [y[i] = coalesce.(y[i], 0) for i in 1:size(y,2)]
I would have thought that coalesce.(y, 0) would replace every missing value in the DataFrame, but it does not work: it only works on a single DataFrame column, not on an n x n DataFrame.
With [y[i] = coalesce.(y[i], 0) for i in 1:size(y,2)] I get 0 instead of the missings.


Did you try what I suggested?


Your way works, but I think it also allocates a vector. As I said, broadcasting over a dataframe applies the function to rows, not cells.


Even simpler and more efficient (note this is an in-place operation, but I understand this is what you want) is:

replace!.(eachcol(df, false), Ref(missing=>0))

And eachcol(df, false) will simply be replaced by eachcol(df) in the near future, when the deprecation period for the old behavior of eachcol ends.
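
As a rough sketch of what the in-place version could look like once eachcol(df) is the supported form, assuming a DataFrames.jl release where eachcol takes only the data frame and iterates over the column vectors (the example data is hypothetical):

```julia
using DataFrames

# Hypothetical example data: both columns allow missing values
df = DataFrame(a = [1, missing, 3], b = [missing, 5, missing])

# Mutate every stored column vector in place, turning missing into 0.
# foreach is used instead of broadcasting because the results are not needed.
foreach(col -> replace!(col, missing => 0), eachcol(df))

println(df)
```

Since the existing vectors are modified rather than replaced, their element type stays Union{Missing, Int}.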


Actually, AFAIK you currently cannot broadcast an AbstractDataFrame. I am not sure whether it will be supported in the future, but if it is, it will most probably be row-wise, as @kevbonham indicates.

For now you have the eachcol and eachrow methods, which support broadcasting column-wise and row-wise respectively.


Ah, this is a good solution. I keep forgetting about Ref() for stuff like this.


Note that doing an in-place operation for this isn’t necessarily a good idea since it will keep the Union{T,Missing} element type even if there are no missing values, which can hurt performance and doesn’t indicate that missing values are not supposed to be present.
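
The difference can be seen on a plain vector, without DataFrames at all. This is only a small illustration using Base functions; the variable names are made up.

```julia
v = [1, missing, 3]   # element type Union{Missing, Int}

# In place: the values change, but the element type still allows missing
inplace = replace!(copy(v), missing => 0)

# Broadcasting coalesce allocates a new vector with a narrower element type
fresh = coalesce.(v, 0)

println(eltype(inplace))
println(eltype(fresh))
```

Both vectors hold the same values, but only the one produced by coalesce. has an element type that rules out missing.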