Is there a coalesce function for other types?

I found this nice function called coalesce and now I am wondering if there is a more general one. Is there a function like coalesce that lets me specify which value or type counts as missing?

a = [NaN,missing,"",2,3.0]
coalesce.(a,0)
coalesce.(a,0,missing="") # fails
coalesce.(a,0,missing=NaN) # fails

Update: A good solution might be:

clean(a,junk,target) = identity.(replace(a, (junk .=> target)...))
clean(a,["",missing,NaN],0)

It’s a one-liner:

julia> [isequal(x, "") ? 0 : x for x in a]
5-element Array{Any,1}:
 NaN
    missing
   0
   2
   3.0

You could easily write a function to do this if you wished.
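For instance, here is a minimal sketch of such a function. The name clean_by_comprehension is made up for illustration, and it assumes that "junk" means "isequal to any entry of a list you pass in":

```julia
# Hypothetical helper (not part of Base): replace every element of `a`
# that is `isequal` to some entry of `junk` with `target`.
# `isequal` is used instead of `==` so that NaN and missing can be matched.
clean_by_comprehension(a, junk, target) =
    [any(j -> isequal(x, j), junk) ? target : x for x in a]

a = [NaN, missing, "", 2, 3.0]
cleaned = clean_by_comprehension(a, [NaN, missing, ""], 0)
@show cleaned
```

The comprehension picks its element type from the values it actually produces, so the result is usually narrower than Any.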

I think replace(a, NaN => 0) is what you are looking for. For multiple replacements: replace(a, NaN => 0, missing => 0, "" => 0).
Unfortunately, the result still has type Vector{Any} instead of Vector{Float64}.
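A small sketch of the above, in case it helps. It relies on replace comparing with isequal, which is why NaN and missing can be matched at all (with == neither would match):

```julia
a = [NaN, missing, "", 2, 3.0]      # mixed contents, so eltype(a) is Any

# One pair per value to replace; matching is done with `isequal`.
b = replace(a, NaN => 0, missing => 0, "" => 0)

@show b
@show eltype(b)                     # the container type is not narrowed
```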

You can use identity to try to narrow the type after replacing, although in this case you’ll only get Real, since the vector contains both Int and Float64 values:

julia> identity.(replace(a, ([NaN, missing, ""] .=> 0.0)...))
5-element Vector{Real}:
 0.0
 0.0
 0.0
 2
 3.0
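If a concrete Vector{Float64} is what you are after (that is my assumption about the goal here), one option is to convert explicitly instead of relying on identity. A rough sketch:

```julia
a = [NaN, missing, "", 2, 3.0]
junk = [NaN, missing, ""]

cleaned  = replace(a, (junk .=> 0.0)...)  # still Vector{Any}
narrowed = identity.(cleaned)             # narrows to the common supertype, Real
floats   = Float64.(cleaned)              # converts every element to Float64

@show eltype(narrowed)
@show eltype(floats)
```

The explicit conversion throws if any non-numeric value is left in the vector, which can be a useful check that the junk list was complete.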

I like the trick of using identity to narrow down the type. :+1:
Is this the common way of narrowing down an array type?

clean(a,junk,target) = identity.(replace(a, (junk .=> target)...))
clean(a,["",missing],0)

Just a note to say that for singleton types, replace is able by itself to narrow the eltype:

julia> a = [missing, 1.1, NaN, nothing]
4-element Vector{Union{Missing, Nothing, Float64}}:
    missing
   1.1
 NaN
    nothing

julia> replace(a, NaN => 0, missing => 0, nothing => 0)
4-element Vector{Float64}:
 0.0
 1.1
 0.0
 0.0

I think this only works if the original vector’s eltype is a Union and not Any.

Indeed, replace allocates the final vector before going through the elements. So the “type narrowing” is done using only the initial eltype of the vector and the replacement pairs. For example, with eltype(a) == Union{Missing, Float64} and the pair missing => 0.0, it’s possible to determine before any pass over the source vector that the destination vector will have eltype Float64.
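To make the difference concrete, here is a small comparison of the same data stored with a Union eltype and with Any (the variable names are just for illustration):

```julia
u = Union{Missing, Float64}[1.1, missing, NaN]
v = Any[1.1, missing, NaN]

ru = replace(u, missing => 0.0)   # Missing can be ruled out from the types alone
rv = replace(v, missing => 0.0)   # nothing can be ruled out of Any

@show eltype(ru)
@show eltype(rv)
```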

I often broadcast ifelse in situations like this:

julia> @. ifelse(ismissing(a) || a == "" || isnan(a), 0.0, a)
5-element Vector{Real}:
 0.0
 0.0
 0.0
 2
 3.0

OK, that only works on Julia 1.7 or later, but I’m seriously excited that it works on 1.7!
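On versions before 1.7, a sketch of the same idea is to move the short-circuiting into a small predicate and broadcast that instead. The name isjunk is a hypothetical helper, not something from Base:

```julia
# Hypothetical helper: true for the values we want to treat as "missing-like".
# The `isa` check guards `isnan`, which has no method for strings.
isjunk(x) = ismissing(x) || isequal(x, "") || (x isa AbstractFloat && isnan(x))

a = [NaN, missing, "", 2, 3.0]

# `ifelse` is an ordinary function, so it broadcasts on any Julia 1.x.
cleaned = ifelse.(isjunk.(a), 0.0, a)
@show cleaned
```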


This is excellent! Was not aware this was merged.

It sounds like the main “conditional evaluation” problem was solved. Does that mean we can expect to see broadcasted ? : in the future?

Unfortunately ? : is a bit different — it’s parsed directly to an if statement.

julia> Meta.parse("a ? b : c")
:(if a
      b
  else
      c
  end)

So that’s why we didn’t deprecate its use within @. expressions like we did || and && long ago (which is what made space for the above). Making arbitrary if clauses participate in broadcast is a much bigger thing.

I’m curious why ? : goes to if else rather than ifelse?

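ifelse(x, a(), b()) will evaluate both a() and b(), because ifelse is an ordinary function and its arguments are evaluated before the call, whereas x ? a() : b() evaluates only the selected branch. So ? : is parsed as an if to keep that lazy evaluation, which matters for performance and for side effects.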
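A quick way to see the difference is to record which branches actually run. The functions left and right below are made up for the demonstration:

```julia
calls = String[]
left()  = (push!(calls, "left");  1)   # records the call, then returns 1
right() = (push!(calls, "right"); 2)   # records the call, then returns 2

# Ordinary function call: both arguments are evaluated before ifelse runs.
r1 = ifelse(true, left(), right())
eager_calls = copy(calls)

empty!(calls)

# Ternary: parsed as if/else, so only the chosen branch is evaluated.
r2 = true ? left() : right()
lazy_calls = copy(calls)

@show eager_calls lazy_calls
```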
