Why type instability?

question

#1

We know that type instability is bad for performance. So why Julia allows type instability? That is, Julia could throw an error whenever a variable changes type.


#2

One reason: because Julia aims to be an easy to use language.


#3

How does type instability make it easier?


#4

In general there are situations when this kind of type unstable code is needed:
f() = rand() < 0.5 ? 1 : 1.0
The typical example is that you read data from an external source and do not know its type at compile time.


#5

Not all code is performance critical. Not having to worry about types and just having things resolve dynamically can be extremely convenient.


#6

If you have ever used @code_warntype you know that whether a function is type-stable isn’t always as obvious as it first appears. The larger reason to allow type instability is that Julia doesn’t really have pointers (well, it has them, but their intended use is only for interacting with C or some special array handling). The issue with this is as follows: suppose you have two types Pion <: Meson and Kaon <: Meson and you want a function that returns a Meson. In C++, this can be achieved by returning a Meson pointer. The equivalent behavior in Julia is to return a reference of type T <: Meson. If Julia were to fully enforce type stability, functions would be required to return a specific Meson. So, it goes a bit beyond whether Julia is “easy”.


#7

Im not saying you should enforce type stability at compile time. An error preventing change of type can be isued dynamically at run-time.


#8

In many cases it can be hard to exactly know the types that will come out. One major example is matrix factorizations. Each “matrix type” has its own Julia type, because it’s easy to see how specializing on matrix types can give you much faster algorithms (tridiagonal solvers, banded solvers, etc.). But the generic \ does not actually know which matrix type it needs to internally create. But that doesn’t matter: it can at runtime find out, factorize into the appropriate type, do a ~100ns dynamic dispatch to then solve \ using the specialized matrix type. 100ns is peanuts compared to the rest of the computation, making this a very good idea.

For this to be completely internally type stable, you would have to:

  1. Always have the user decide what factorization to use
  2. Only use one factorization
  3. Always factorize to the same type, and at runtime use checks for what matrix operations to perform in the resulting steps

(1) and (2) are restrictive design-wise (and are what you see in C APIs). (3) is slower and is what you see in something like MATLAB. An internal type-instability solves the problem beautifully, taking a single tiny internal dynamic dispatch (a “function barrier” if you will), to get the flexibility and performance.

And you’ll notice that in many cases, you want a design that is “type-instability + function barrier”. This allows you to make a high level algorithm choice which then drops down to an efficient compiled code, with a pretty small runtime check (the dynamic dispatch) in the middle. NLsolve + ForwardDiff and the dynamic choice of the chunksize is another example of this, where there’s a tiny runtime cost that then enters into an efficient code.

The design space is just much larger when you allow this. It’s actually a very unique feature in Julia that should be embraced and not avoided.


List of most desired features for Julia v1.x
#9

Although type instability is currently very bad for performance in Julia, I believe there is work going on already to improve that greatly for certain cases (such as when the type can be inferred to be a union of a small number of types, such as Union{MyType, Void}).
Currently, one can manually optimize code in those cases, i.e. checking to see if the value isa(x, Void)
As other people have stated, there are many cases where one can’t know the type in advance, for example, a value read from a JSON or XML file.


#10

FWIW, structures like DataFrames/DataTables could not be implemented at all without type instabilities (since the type of columns is on purpose not part of the type parameters).


#11

Is there official way to check if all the functions in a module are type stable?


#12

If you only care about type stability of the returned value (not of internal variables) there is @inferred, from Base.Test module:

julia> Test.@inferred 1 + 5
6

julia> Test.@inferred rand(["a", 1, 5.0])
ERROR: return type Float64 does not match inferred return type Any
Stacktrace:
 [1] error(::String) at ./error.jl:21

#13

Other people have given good reasons why this would in general be too restrictive but I’d love to see a macro that could be applied to specific functions in order to turn any dynamic dispatch into an error. Likewise a macro that turned any kind of allocation into an error.


#14

Just to put a name. It looks like it’s an issue with dependent type.

It isn’t simple in this context to ensure that type checking remains decidable.

Enabling both type and type reassignment may be a way to circumvent that.


#15

If Julia enforced type stability, it would be a statically typed language. This would not be a minor change to the language – it would literally be the single most fundamental change to the language that can possibly be made. Perhaps that’s what you want, but I suspect not. Decades of empirical evidence indicate that numerical and scientific users just love to work in dynamic languages – Matlab, R, Python, Mathematica, SciLab, etc. The appetite to work in static languages is considerably more limited – people who really desperately need performance will use C++ and Fortran for scientific and numerical work, but they then almost immediately wrap their static code in a dynamic language. The whole point of Julia is to not have to make that choice: to have the convenience of a dynamic language with the performance of a static one.

Moreover, Julia’s sophisticated dispatch and parametric types (union-all types, upper- and lower-triangular dispatch and where-types) are far beyond what is possible to statically type check with even the most state of the art static type system. Even something as basic as allowing integers as type parameters is bleeding edge for a static type system, requiring a full blown proof system, in which it is often necessary for the user to provide a proof that their type signatures are sound since it is impossible to do so automatically. Julia blows way past that level of expressiveness, largely by punting on type checking and separating the expression of complex types from the need to statically check them.


List of most desired features for Julia v1.x
#16

Variables changing type is not actually all that big a problem – we just haven’t optimized it at all until recently. On the flip side, it’s also not nearly sufficient to ensure type stability since one can easily write programs with unpredictable types without any variables that change type. What would be required to enforce type stability would be having a set of rules by which a computable type is assigned to every expression in a program, especially method definitions. Just making variables not change type is neither sufficient nor even particularly helpful. Having that as a blanket rule would mostly have the effect of forcing some high-level programs with particularly dynamic behavior to be written in much more annoying ways (e.g. with loosely typed Refs).


#17

Maybe this is just a curiosity, but I once wrote a function that is intentionally type unstable, in order to get better performance. The function in question is testzero from the module Zeros.jl. It returns different types depending on if the input is zero.

If I have a function f(a,b) which can be simplified in the special case of a equal to zero, and I know that there’s a high likelihood that a will in fact be zero, then I can call f(testzero(a),b) which will call a version of f that has been specifically compiled for a=0 if that is the case.

(It is possible that writing an explicit branch would be faster than relying on dynamic dispatch when there are only two methods to choose from - I haven’t tried. But if so, then I’m confident that Julia will make that optimization eventually.)


#18

I like that example. As you say, performance isn’t inherent – it’s a moving target depending on what optimizations a compiler happens to have. Another good example of where type instability can be useful is untagged unions. We use this for the match function: nothing indicates no match whereas a match object indicates a match. Part of the chaos in the data ecosystem stems from trying what amounts to a tagged union Nullable, which must be explicitly wrapped and unwrapped; this pattern turns out to be quite awkward and inconvenient to use, however. Work by Jacob Quinn and Jameson Nash towards making an utagged union approach (see Nulls.jl) with good performance, which will be much more convenient to use. This would be impossible with enforced type stability.


#19

Sure if static languages are C/C++/Fortran but modern languages like F# or Scala give you IMO the best of both worlds: flexibility , a repl for experimentation & (data) exploration, static type checking & AOT compilation, (reasonable) performance & high level code.


#20

Can you do it? I mean (for parts or all of your code), say with a (theoretical) macro (or CLI-switch)? Just curious. My feeling is that Julia is a statically typed language if you wish… i.e. dynamic a superset.

I also mean, could such a macro, enforce not just the one function, but all functions it will call (I would also like such a macro e.g. to enforce no-GC safe).