Spooky Missing action at a distance


#1

I understand why this happens, but it surprised me. Some complex code that was previously type-stable became unstable after upgrading unrelated dependencies. I’m not really sure what I think about this - I’m just posting it here for the sake of discussion.

julia> foo() = x != 1
foo (generic function with 1 method)

julia> @code_warntype foo()
Variables:
  #self# <optimized out>

Body:
  begin 
      SSAValue(0) = Main.x
      $(Expr(:inbounds, false))
      # meta: location operators.jl != 129
      SSAValue(1) = (Core.typeassert)((SSAValue(0) == 1)::Any, Base.Bool)::Bool
      # meta: pop location
      $(Expr(:inbounds, :pop))
      return (Base.not_int)(SSAValue(1))::Bool
  end::Bool

julia> using Missings

julia> @code_warntype foo()
Variables:
  #self# <optimized out>

Body:
  begin 
      return (Main.x != 1)::Union{Bool, Missings.Missing}
  end::Union{Bool, Missings.Missing}

#2

I think that in general this has to happen when new methods are added. The only alternative I can see would be globally enforcing return types on all methods of a particular function, which is definitely not going to happen. It should go without saying that if you tell the compiler the type of x, for example by declaring const x = 1, the return type goes back to being type-stable even in the presence of Missings.
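A minimal sketch of that point, assuming Julia 1.x (where Missing lives in Base, so nothing needs to be loaded to reproduce it). The names y, foo_global, foo_const and foo_arg are hypothetical and do not appear in the thread.

```julia
using Test  # stdlib; provides @inferred

x = 1          # non-const global: the compiler has to treat it as Any
const y = 1    # const global: the compiler knows it is an Int

foo_global() = x != 1   # result type depends on every == method that exists
foo_const()  = y != 1   # only Int == Int is possible here
foo_arg(z)   = z != 1   # specialized on the type of the argument

# @inferred throws if the inferred return type differs from the actual one.
@inferred foo_const()
@inferred foo_arg(2)

# Print what inference concludes for each definition.
println(Base.return_types(foo_global, ()))
println(Base.return_types(foo_const, ()))
println(Base.return_types(foo_arg, (Int,)))
```

Passing the value as an argument, as in foo_arg, is usually the more flexible of the two fixes, since a const global cannot later be rebound to a value of another type.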


#3

This has nothing to do with Missings per se. You can get something similar with, e.g.:

julia> foo(x::Int) = 1
foo (generic function with 1 method)

julia> bar() = foo(x)
bar (generic function with 1 method)

julia> @code_warntype bar()
Variables:
  #self# <optimized out>

Body:
  begin 
      return (Main.foo)(Main.x)::Int64
  end::Int64

julia> foo(x::Float64) = 1.0
foo (generic function with 2 methods)

julia> @code_warntype bar()
Variables:
  #self# <optimized out>

Body:
  begin 
      return (Main.foo)(Main.x)::Union{Float64, Int64}
  end::Union{Float64, Int64}

Don’t use globals.


#4

“Don’t use globals.”

Sure. Globals are just the simplest example I could come up with.

I actually ran into this in the output of a query compiler. For some reason,
queries were allocating more memory if they were run on data that had been
loaded from disk. It turns out this is because I have some Any==Any
comparisons when data is pulled from the execution graph, and those are
now inferred as Union{Bool, Missing}, which causes boxing and generic
calls downstream.

As Expanding Man said, this is probably an unavoidable consequence of
Julia’s design. I’m not demanding something be done. This was just “hey
this caught me out, be aware, sometimes == doesn’t return a boolean,
depending on what packages you have loaded”.


#5

It would be interesting to see an MWE of what you are doing. I expect that you would need a (potential) missing somewhere to infer a Union{Bool, Missing}, and it is not clear to me why you need that in a query compiler (or you could have found a bug in inference).

As a quick fix, you could use a different comparison (possibly defining your own) that always returns a Bool.
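Two ways such a comparison might look, as a sketch only. The names eq_bool and eq_strict are hypothetical, and the code assumes Julia 1.x, where missing is defined in Base.

```julia
# Option 1: treat anything other than a literal `true` as "not equal".
# `===` always returns a Bool, so the result is a Bool whatever `==` returns.
eq_bool(a, b) = (a == b) === true

# Option 2: assert the result. This throws a TypeError if `==` returns `missing`.
eq_strict(a, b) = (a == b)::Bool

eq_bool(1, 1)          # true
eq_bool(1, missing)    # false, because `1 == missing` is `missing`
eq_strict(1, 2)        # false

# Both are inferred as Bool even when the argument types are unknown.
println(Base.return_types(eq_bool, (Any, Any)))
println(Base.return_types(eq_strict, (Any, Any)))
```

The built-in isequal also always returns a Bool, but its semantics differ from ==: it treats NaN as equal to NaN and distinguishes -0.0 from 0.0, so it is not a drop-in replacement everywhere.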


#6

“I expect that you would need a (potential) missing somewhere to infer a Union{Bool, Missing}”

The potential Missing comes from Missing <: Any, just like in the first example I posted.

The query code is in theory type-stable, but sometimes inference bails out (e.g. from hitting MAX_TYPE_DEPTH if I nest too many closures), and I get return_value::ANY and then Union{Bool, Missing} down the line.
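One possible way to contain this when the expected type is known is to assert it at the point where the value comes back as Any. The sketch below is not the query compiler's code: row, same_untyped and same_typed are hypothetical names, a Vector{Any} stands in for values whose type inference lost, and Julia 1.x is assumed.

```julia
# Hypothetical stand-in for values whose type inference could not determine.
row = Any[1, 2]

# Compared as Any == Any: the inferred result depends on which == methods are loaded.
same_untyped(r) = r[1] == r[2]

# Asserting the element types first lets inference select the Int == Int method.
same_typed(r) = (r[1]::Int) == (r[2]::Int)

# Print what inference concludes for each version.
println(Base.return_types(same_untyped, (Vector{Any},)))
println(Base.return_types(same_typed, (Vector{Any},)))
```

The assertion costs a runtime type check and throws if the value is not an Int, so it only fits where the type really is fixed.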

Again, I’m not presenting this as a problem to be fixed. I know how to fix my own code and the changing inference is a necessary consequence of Julia’s design. I’m just making a note of something that confused me for a while, so that if other people run into it they can find the explanation and skip the debugging time.


#7

My point was that, in a context where you are not using globals and the compiler can infer that the type of the values you are comparing cannot be Missing, this whole issue should go away (apart from the compiler not being smart enough to prove something, but v0.7 is becoming amazing in that respect).

Are you relying on the result of type inference in any way?