How to implement promotion for functions like `collect` and `map`?

I’m implementing more advanced promotion rules in FlexUnits.jl, but I’m getting stuck on functions like `collect` and `map`. If I put quantities in a vector literal that includes missing values, the promotion rules I wrote give me a fully concrete element type.

julia> v = [1ud"m/s", 2ud"m/s", missing*ud"m/s", 4*ud"m/s"]
4-element Vector{Quantity{Union{Missing, Int64}, Units{Dimensions{FixRat32}, AffineTransform}}}:
 1 m/s
 2 m/s
 missing m/s
 4 m/s

This was achieved using

function Base.promote_rule(::Type{Quantity{T1,U}}, ::Type{Quantity{T2,U}}) where {T1, T2, U<:AbstractUnitLike}
    return Quantity{promote_type(T1, T2), U}
end

However, if I apply the same multiplication with `map`, I get an element type that is only partially specified: the numeric parameter `T` is left free.

julia> v = map(x->x*ud"m/s", [1, 2, missing, 4])
4-element Vector{Quantity{T, Units{Dimensions{FixRat32}, AffineTransform}} where T}:
 1 m/s
 2 m/s
 missing m/s
 4 m/s

Even more unusual, if the input to `map` mixes Float32 and Int64 values, the result has the promoted element type I expected.

julia> v = map(x->x*ud"m/s", [1, 2, Float32(3), 4])
4-element Vector{Quantity{Float32, Units{Dimensions{FixRat32}, AffineTransform}}}:
 1.0 m/s
 2.0 m/s
 3.0 m/s
 4.0 m/s
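One thing worth checking here is what the input literal looks like before `map` runs. The sketch below uses only Base (no FlexUnits.jl) and suggests that the array literal has already promoted its elements, so in the Float32 example the mapped function would only ever see a single input type:

```julia
# The literal promotes Int64 and Float32 to Float32 before map is called.
xs = [1, 2, Float32(3), 4]
println(typeof(xs))   # Vector{Float32}

# With missing, the literal promotes to a small Union instead,
# so the mapped function receives both Int64 and Missing inputs.
ys = [1, 2, missing, 4]
println(typeof(ys))   # Vector{Union{Missing, Int64}}
```

If that is what is happening, the Float32 case may not be evidence that `map` applies promote rules at all: every call returns the same Quantity type, whereas the `missing` case returns two different ones.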

What do I have to do with `missing` to make `map` produce Quantity{Union{Int64,Missing}, U}?

I just realized that this sort of “parametric element” behavior also occurs with `map` on StaticArrays:

julia> va = map(x->x.*SVector{2}(1.0,2.0), [1,2,missing,4])
4-element Vector{SVector{2}}:
 [1.0, 2.0]
 [2.0, 4.0]
 [missing, missing]
 [4.0, 8.0]

when a better result would be Vector{SVector{2,Union{Missing,Float64}}}:

julia> SVector{2,Union{Missing,Float64}}.(va)
4-element Vector{SVector{2, Union{Missing, Float64}}}:
 [1.0, 2.0]
 [2.0, 4.0]
 [missing, missing]
 [4.0, 8.0]

This might have something to do with `promote_typejoin_union(T)` in julia/base/array.jl (JuliaLang/julia on GitHub, commit 966d0af0fdffc727eb240e2e4c908fdd46697e57).
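To illustrate why that function could matter, here is a small sketch comparing `promote_type` with `Base.promote_typejoin`, the unexported Base function that, as far as I can tell, is what `collect` falls back on when it has to widen an element type. `Wrap` is a hypothetical stand-in for a parametric container such as Quantity or SVector:

```julia
# Hypothetical wrapper type standing in for Quantity or SVector.
struct Wrap{T}
    v::T
end

# promote_type follows promote rules, so numeric types are converted.
println(promote_type(Int64, Float64))                        # Float64

# promote_typejoin walks up the type hierarchy instead of converting...
println(Base.promote_typejoin(Int64, Float64))               # Real

# ...and special-cases Missing/Nothing only at the top level.
println(Base.promote_typejoin(Int64, Missing))               # Union{Missing, Int64}

# Nested inside a type parameter, Missing gets no special treatment.
println(Base.promote_typejoin(Wrap{Int64}, Wrap{Missing}))   # Wrap
```

Since `Base.promote_typejoin` is internal, its behavior may change between Julia versions; the point is only that a type join never consults user-defined `promote_rule` methods.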

It would be interesting to test whether this behaviour is still observed with JuliaLang/julia pull request #58418, “WIP: The great pairwise reduction refactor” by mbauman.

I think this may be intentional, to prevent type proliferation, that is, a proliferation of method instances and an explosion in compile time.

In any case, the behavior does not seem specific to Missing:

julia> struct S{P}
           v::P
       end

julia> map(x -> S(x), [1, 2, "", 4])
4-element Vector{S}:
 S{Int64}(1)
 S{Int64}(2)
 S{String}("")
 S{Int64}(4)

That’s what I would have expected from this example, because given these definitions

Base.promote_rule(::Type{S{T1}}, ::Type{S{T2}}) where {T1,T2} = S{promote_type(T1,T2)}
Base.convert(::Type{S{T}}, s::S) where T = S{T}(convert(T, s.v))

the promoted result would be

julia> promote_type(S{String}, S{Int64})
S{Any}

But it seems that `collect` doesn’t use promote rules to narrow the types below the top level, which often results in needlessly abstract element types.

julia> collect(map(x -> S(x), (1, 2, 3.0, 4)))
4-element Vector{S}:
 S{Int64}(1)
 S{Int64}(2)
 S{Float64}(3.0)
 S{Int64}(4)

julia> map(x -> S(x), Union{Int64,Nothing}[1, 2, nothing, 4])
4-element Vector{S}:
 S{Int64}(1)
 S{Int64}(2)
 S{Nothing}(nothing)
 S{Int64}(4)

However, promotion at the top level seems to work and produces a union

julia> S(x::Missing) = missing
julia> map(x -> S(x), Union{Int64,Missing}[1, 2, missing, 4])
4-element Vector{Union{Missing, S{Int64}}}:
 S{Int64}(1)
 S{Int64}(2)
 missing
 S{Int64}(4)

I find this behaviour inconsistent and confusing; now that I think of it, I believe I was burned by this issue once before. However, `vcat` does use promotion and produces consistent results

julia> vcat(map(x -> S(x), (1, 2, 3.0, 4))...)
4-element Vector{S{Float64}}:
 S{Float64}(1.0)
 S{Float64}(2.0)
 S{Float64}(3.0)
 S{Float64}(4.0)

julia> vcat(map(x -> S(x), (1, 2, nothing, 4))...)
4-element Vector{S{Union{Nothing, Int64}}}:
 S{Union{Nothing, Int64}}(1)
 S{Union{Nothing, Int64}}(2)
 S{Union{Nothing, Int64}}(nothing)
 S{Union{Nothing, Int64}}(4)

julia> vcat(map(x -> S(x), (1, 2, missing, 4))...)
4-element Vector{Union{Missing, S{Int64}}}:
 S{Int64}(1)
 S{Int64}(2)
 missing
 S{Int64}(4)

However, splatting into `vcat` is only feasible if we know the collection length. Is there a way to call `map` while specifying the output type, so that we can manually request a union parameter instead of an abstract element type? Something like

map(x->S(x), S{Union{Int64,Nothing}}, itr)

I just realized I could use a typed list comprehension

julia> S{Union{Nothing,Int64}}[S(x) for x in [1, 2, nothing, 4]]
4-element Vector{S{Union{Nothing, Int64}}}:
 S{Union{Nothing, Int64}}(1)
 S{Union{Nothing, Int64}}(2)
 S{Union{Nothing, Int64}}(nothing)
 S{Union{Nothing, Int64}}(4)

This is a bit messier but works. Still, I wonder why `promote_typejoin_union(T)` is used instead of something like a `promote_type_union(T)` that just uses promote rules, or why there isn’t a documented method of `map` that takes an output type argument, so that you can collect to a narrower type and preserve type stability.
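For illustration, a promote-based narrowing step can be written by hand after the fact. The following is only a sketch: `promote_map` is a hypothetical helper name, not something in Base, and the `S` definitions are repeated from earlier so the snippet stands alone.

```julia
struct S{P}
    v::P
end

Base.promote_rule(::Type{S{T1}}, ::Type{S{T2}}) where {T1,T2} = S{promote_type(T1,T2)}
Base.convert(::Type{S{T}}, s::S) where T = S{T}(convert(T, s.v))

# Hypothetical helper: map first, then narrow the element type by applying
# promote rules to the runtime types of the results.
function promote_map(f, itr)
    ys = map(f, itr)                          # may have an abstract eltype
    T = mapreduce(typeof, promote_type, ys)   # errors if ys is empty
    return convert(Vector{T}, ys)             # converts each element to T
end

out = promote_map(S, [1, 2, nothing, 4])
println(eltype(out))   # S{Union{Nothing, Int64}}
```

This makes an extra pass over the results and computes the element type at run time, so the return type is not inferable; it only shows that the promote rules alone are enough to recover the narrower type.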

The behavior is unchanged for this example.

`map(f, c)` is roughly equivalent to `collect(Iterators.map(f, c))`. `collect` also allows specifying the element type explicitly:

collect(T, Iterators.map(f, c))
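Applied to the `S` example from earlier in the thread, that might look like the sketch below. The struct and `convert` method are repeated so the snippet runs on its own, and the target element type is chosen by hand:

```julia
struct S{P}
    v::P
end

# Needed so each S{Int64} / S{Nothing} result can be stored as the wider type.
Base.convert(::Type{S{T}}, s::S) where T = S{T}(convert(T, s.v))

# Iterators.map is lazy; collect(T, itr) allocates a Vector{T} and converts
# each element on insertion.
out = collect(S{Union{Nothing,Int64}}, Iterators.map(S, [1, 2, nothing, 4]))
println(typeof(out))   # Vector{S{Union{Nothing, Int64}}}
```

Unlike the typed comprehension, this form also works when the function and iterator are passed around as values.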

There’s also my package, Collects.jl: