Help with type inference

I have an odd problem with type inference. I am writing an iterator which wraps a vector. The following is NOT an MWE as I cannot reproduce the error in a small example. However hopefully it can still lead to a solution.

function toyreduce(f, op, itr)
    A = itr.x
    a1 = A[1]
    a2 = A[2]
    v = op(f(a1), f(a2))
    for i = 3:lastindex(A)
        ai = A[i]
        if ai !== missing # && true
           v = op(v, f(ai))
        end
    end
    return Some(v)
end

x = map(1:100) do i
    rand() < .2 ? missing : rand()
end
sx = skipmissing(x)
toyreduce(identity, +, sx)

itr is my wrapping function and itr.v is a vector that may contain missing values. Assume for convenience the first two elements of itr.v are non-missing.

The function infers fine as is but when I uncomment the && true, inference fails. Is Julia having trouble with that last extra branch? Why would && true hurt the type inference?

Any insights welcome.

Thank you.

In another oddity, let’s say I use sum which calls toymapreduce. If I leave # && true commented out, sum infers fine. If I uncomment it, it does not infer fine.

But if I comment it out again, then toymapreduce infers fine but sum remains problematic. I am using Revise. Why isn’t sum picking up my changes to the function’s ability to infer?

I get Union{Some{Missing}, Some{Float64}} in both cases, using latest 1.4-rc and master. Not sure if you consider this fine or not, but they infer the same.

I am not quite sure why you need Some here though (unless some information was lost in condensing to an MWE).

I would suggest isolating the problem and just sending the method to the REPL without involving Revise (which I what I did to evaluate your code). I don’t see why inference should have anything to do with the latter, but if it does then that should be documented.

Thanks for the help. Indeed, the function is more complicated and difficult to boil down to an MWE. The Some is used in case everything is missing, or something like that. The actual function is Base.mapreduce_impl, and I don’t have a great understanding how it works, to be honest.

I tried to re-define the function at the REPL and the type inference is still gone. Still a bit of a mystery, I’m afraid.

The first step to solving it would be creating a reproducible example.

Until that happens, my first guess would be a corrupted environment (some code that you assume was evaluated wasn’t).

is !ismissing(a) the same as a !== missing?

Both are implemented by dispatch, and should produce equivalent results, but their definition is not the same.

i wonder if his condition a !== missing && true will have similar behavior if OP uses !ismissing instead.

You don’t need to wonder, as you can just try it.

1 Like

There is an issue on the performance difference: https://github.com/JuliaLang/julia/issues/27681

You can also add methods to ismissing, but you can’t for !==.

Note https://docs.julialang.org/en/v1/manual/performance-tips/#Checking-for-equality-with-a-singleton-1

1 Like

FWIW, on 1.4 the alternative versions with

  1. ai ≢ missing
  2. ai !== missing
  3. ai !== missing && true

appear to be equivalent in terms of speed and inference.

Okay, I have a reproducible example

function toyreduce_impl(f, op, itr, ifirst, ilast, blksize)
    A = itr.x
    if ifirst == ilast
        a1 = A[ifirst]
        if a1 === missing #|| false
            return nothing
        else
            return Some(Base.mapreduce_first(f, op, a1))
        end
    elseif ifirst + blksize > ilast
        # sequential portion
        local ai
        i = ifirst
        while i <= ilast
            ai = A[i]
            ai === missing || break
            i += 1
        end
        i > ilast && return nothing
        a1 = ai::eltype(itr)
        i += 1
        while i <= ilast
            ai = A[i]
            ai === missing || break
            i += 1
        end
        i > ilast && return Some(Base.mapreduce_first(f, op, a1))
        a2 = ai::eltype(itr)
        i += 1
        v = op(f(a1), f(a2))
        for i = i:ilast
            ai = A[i]
            if ai !== missing
               v = op(v, f(ai))
            end
        end
        return Some(v)
    else
        # pairwise portion
        imid = (ifirst + ilast) >> 1
        v1 = toyreduce_impl(f, op, itr, ifirst, imid, blksize)
        v2 = toyreduce_impl(f, op, itr, imid+1, ilast, blksize)
        if v1 === nothing && v2 === nothing
            return nothing
        elseif v1 === nothing
            return v2
        elseif v2 === nothing
            return v1
        else
            return Some(op(something(v1), something(v2)))
        end
    end
end

function toyreduce_impl_bad(f, op, itr, ifirst, ilast, blksize)
    A = itr.x
    if ifirst == ilast
        a1 = A[ifirst]
        if a1 === missing || false
            return nothing
        else
            return Some(Base.mapreduce_first(f, op, a1))
        end
    elseif ifirst + blksize > ilast
        # sequential portion
        local ai
        i = ifirst
        while i <= ilast
            ai = A[i]
            ai === missing || break
            i += 1
        end
        i > ilast && return nothing
        a1 = ai::eltype(itr)
        i += 1
        while i <= ilast
            ai = A[i]
            ai === missing || break
            i += 1
        end
        i > ilast && return Some(Base.mapreduce_first(f, op, a1))
        a2 = ai::eltype(itr)
        i += 1
        v = op(f(a1), f(a2))
        for i = i:ilast
            ai = A[i]
            if ai !== missing 
               v = op(v, f(ai))
            end
        end
        return Some(v)
    else
        # pairwise portion
        imid = (ifirst + ilast) >> 1
        v1 = toyreduce_impl_bad(f, op, itr, ifirst, imid, blksize)
        v2 = toyreduce_impl_bad(f, op, itr, imid+1, ilast, blksize)
        if v1 === nothing && v2 === nothing
            return nothing
        elseif v1 === nothing
            return v2
        elseif v2 === nothing
            return v1
        else
            return Some(op(something(v1), something(v2)))
        end
    end
end

julia> x = map(1:100) do _
           rand() < .2 ? missing : rand()
       end;

julia> sx = skipmissing(x);

julia> Missings.toyreduce_impl(identity, +, sx, 1, 100, 100)
Some(42.487664183551566)

julia> @code_warntype Missings.toyreduce_impl(identity, +, sx, 1, 100, 100)

If you run the above code on 1.4 you will get good type inference. The result will say Union{Nothing, Some{Float64}} somewhere.

However if you run

julia> @code_warntype Missings.toyreduce_impl_bad(identity, +, sx, 1, 100, 100)

You will get Any in some places and Some{_A} where _A (so Some(Any)) in other places.

However the only difference between toyreduce_impl and toyreduce_impl_bad is the addition of a || false on the 4th line of the function, which shouldn’t affect any behavior of the code.

cc @nalimilan on this. Let me know if I should file an issue with Julia base.

Edit: It seems like if you delete the last branch at the end. the else, things infer fine in toyreduce_impl_bad. I think that Julia is having trouble determining the types of objects that aren’t reached.

As a final update, my type inference issues are fixed by using a nested if, i.e.

    if ifirst == ilast
        a1 = A[ifirst]
        if a1 === missing 
            if true
                return nothing
            end
        else
            return Some(Base.mapreduce_first(f, op, a1))
        end
    elseif

I will file a bug with Julia if no one has any objections.

Thanks for isolating it. Yes, this looks like an issue and I can’t think of an obvious duplicate.

It isn’t that strange that type inference have a harder time with the || false because I don’t think it does very advanced control flow analysis with constants. So assuming it said || maybe_false() then you don’t know what type you would have in that block since you might enter it even if ai !== missing.

Is this kind of behavior the price we pay for short-circuited boolean operations? because it works fine with plain old if statements.

Issue filed here.