Does Julia syntax do enough to encourage type stability?


#1

Having used Julia on a fairly regular basis for almost a year now, I find that one of the most common causes of unnecessarily poor performance is type instability. It’s wonderful that Julia is dynamically typed for cases where that sort of thing is useful, but the fact that Julia is, after all, dynamically typed creates lots of opportunities for users to unknowingly write sub-optimal code. Having been brought up on mostly C++ and a little Fortran, I tend to be quite aware of the types of the objects in my code, but in the last few years I’ve been around an increasing number of people, in both the scientific and commercial communities, who are primarily familiar with Python. Some of these people tend not to think about types at all. (Indeed, I find Python’s approach of pretending types don’t exist at all to be a glaring absurdity in what is, for the most part, an otherwise fairly well-thought-out language.) I am also certainly guilty, even as a fairly experienced user, of occasionally writing type-unstable code where it doesn’t belong.

My question is this: do we think that the current Julia syntax does enough to encourage users to at least be aware of (if not enforce) the type stability (or instability) of their code?

To make this more concrete, let me make one not-very-serious suggestion that I haven’t put a huge amount of thought into. I love that we can now make type assertions on function output using, for example

function f(phi::AbstractFloat)::Complex{Float64}
    exp(im*phi)
end

but I wonder if it might be a good idea, to promote good habits, to require syntax like

function::Complex{Float64} f(phi::AbstractFloat)
    exp(im*phi)
end

so that most functions as they are currently implemented would require a function::Any or f(x)::Any. The only purpose this would serve is to force the user to think “I’m writing a function, but its return type isn’t controlled, so I had better not put it in any performance-critical code.” Of course my above example opens a can of worms with parametric types, so I’m certainly not claiming it is the best possible implementation of this type of idea. (On the other hand, I don’t think requiring type assertions on all input arguments is particularly useful, because of multiple dispatch.)

Some possibly negative side-effects of this would be that it might encourage less experienced users to write overly type-specific (and, possibly, as a result overly complicated) code. It would also likely be a big turn-off to people coming from Python, a crowd which, in my experience, mostly doesn’t want to be bothered with this sort of thing at all regardless of its impact on performance or stability. Another problem that I haven’t thought much about is what the consequences (if any) would be of most functions ending in an implicit convert call (as in my examples above).
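To make the implicit convert concrete, here is a minimal sketch of what a return-type declaration does today. The names h_declared and h_plain are made up for illustration and do not come from any library.

```julia
# Hypothetical names, for illustration only.
h_declared(x)::Float64 = x + 1   # the result is passed through convert(Float64, ...)
h_plain(x) = x + 1               # the result type follows the input type

h_declared(1)   # Int input, Float64 result
h_plain(1)      # Int input, Int result

# h_declared(1 + 2im) would throw an InexactError, because the complex
# result cannot be converted to Float64 at run time.
```

So the declaration is less a promise checked at compile time than a conversion applied on the way out, which is part of why its side effects are worth thinking about.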

These are just random musings, but I wonder if the core developers of Julia have had similar thoughts. Has there been any thought given to this in the past? I’m skeptical of this myself, because I’m always skeptical of changes that don’t add any real functionality, but it’s something that’s occurred to me a number of times in the course of my using Julia. It would be nice if someone could come up with a better specific suggestion than the one I made here.


#2

Most “users” don’t need to be aware of it. If you’re calling functions in a machine learning, differential equation, etc. library that take the vast majority of the time in your script, then type stability in the outer function isn’t necessary to get good performance, since the functions will always specialize on the types seen, and the cost of a single or a few dynamic dispatches is fairly minimal (in the tens of nanoseconds range).
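A rough sketch of that situation, with made-up names (kernel and script are not from any library): the outer function does not know the type of its data until run time, but the expensive loop sits behind a function call that is compiled separately for each concrete argument type.

```julia
# Hypothetical example: the hot loop lives in `kernel`, which is
# specialized for whatever concrete array type it receives.
function kernel(v)
    s = zero(eltype(v))
    for x in v
        s += x * x
    end
    return s
end

function script(use_floats::Bool)
    # `data` is a Vector{Float64} or a Vector{Int} depending on a run-time
    # value, so its type is not fixed inside `script`.
    data = use_floats ? rand(1000) : collect(1:1000)
    return kernel(data)   # the instability is paid once, at this call
end

script(true)
script(false)
```

The instability costs one decision at the call to kernel, not one per loop iteration.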

Most functions should stay broad because you don’t know what you’ll get. Arguably, your example should return Complex{T}, where you don’t know what T is until you know the input type, and it could be hard to compute what that output type would have to be in general because T could be some immutable defined by the user (though it would still be deducible and type-stable). So I would say that almost all uses of this kind of type assertion would be overly type-specific, and I would encourage libraries not to use them.
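As a sketch of the broad version, assuming we simply leave the return type to inference rather than declaring it:

```julia
# No return-type declaration: the output type follows the input type T.
f(phi::T) where {T<:AbstractFloat} = exp(im * phi)

f(1.0)     # a Complex{Float64}
f(1.0f0)   # a Complex{Float32}
```

Each call is still type-stable, since the result type is determined by the argument type alone.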

Finding type instabilities is fairly easy with @code_warntype. And sure, you do have to learn a few things to properly write a performant library, but that’s not a problem Julia can solve: you do actually have to learn the language if you want to do something complex. There’s no way around that.

And I don’t think going even as far as warnings for type-unstable functions is a good idea either, because type instabilities and non-strictly typed things are useful. Especially when prototyping, which is something Julia excels at.

Essentially, Julia runs the spectrum from Python to C/FORTRAN (with some offshoots of functional programming / Lisp). Julia doesn’t care what part of the spectrum you are programming at, and I think that’s the right way to go. Some people/projects live in the MATLAB/Python/R “use some scripts” regime, others live in the “library building” regime of C/FORTRAN. Julia does both just fine.

Proposals like this kind of force the user to think in more of a C/FORTRAN way, increasing the verbosity and “header information” to help make sure you get speed… but that’s exactly why I don’t use C/FORTRAN anymore.


#3

I think it’d be great to get better IDE support for detecting type instabilities. I imagine it wouldn’t be much harder than running @code_warntype on a function and tagging variables that are type-unstable.


#4

That’s a great idea!


#5

Can you give an example of how to use @code_warntype? I was thinking the same thing when I saw the initial post. Unfortunately, I haven’t gotten the hang of @code_warntype yet. I think this could be an addition to linter-julia or a separate package, but the basic functionality is a lot like a linter.


#6

Have you seen this section of the manual?

http://docs.julialang.org/en/stable/manual/performance-tips/#man-code-warntype
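For a quick taste, here is a small sketch in the spirit of that manual section. The function unstable_sum is a made-up example; in a script (as opposed to the REPL) the macro has to be loaded from the InteractiveUtils standard library.

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

# Hypothetical example: `s` starts as an Int and may become a Float64.
function unstable_sum(v)
    s = 0
    for x in v
        s += x
    end
    return s
end

# With Float64 elements, `s` is reported as Union{Float64, Int64}.
@code_warntype unstable_sum([1.0, 2.0])

# With Int elements, every variable has a concrete type.
@code_warntype unstable_sum([1, 2])
```

Starting from s = zero(eltype(v)) instead of s = 0 would remove the instability in the first call.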


#7

"I think it’d be great to get better IDE support for detecting type instabilities. I imagine it wouldn’t be much harder than running @code_warntype on a function and tagging variables that are type-unstable."

I love the idea, but would this really work for e.g. a function f(x::Any)? What types for x will you automatically probe the stability for?


#8

I would just go off whatever @code_warntype says; it has its own internal algorithm for flagging variables. That way, if there’s a false positive, we submit an issue and everyone benefits from a better code_warntype algorithm.


#9

I don’t understand… if you use @code_warntype with the natural signature, then even identity would be flagged as type unstable…
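A small sketch of the point, using Base.return_types (an unexported reflection helper in Base) to ask inference directly:

```julia
# With only the declared signature, nothing is known about the argument,
# so nothing is known about the result either.
Base.return_types(identity, (Any,))   # the inferred return type is Any

# With a concrete argument type, the result is concrete too.
Base.return_types(identity, (Int,))   # the inferred return type is Int
```

So stability is a property of a function together with concrete argument types, not of the method signature alone.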


#10

Possibly search the test cases until one is found that hits the function.
If that is stable then continue to look for another, and so forth.
Then in the GUI make a note of which test case revealed the type instability.

It is a bit complex, but not entirely insane.


#11

I don’t follow you here. Example?


#12

f(b, x) = b ? x + 1.0 : x

What arguments should the IDE pass to code_warntype? Does it try the entire product of combinations of all concrete subtypes of Any? It’s type stable for Tuple{Bool, Float64}, but unstable for Tuple{Bool, Int}. What about this one:

g(b, x) = b ? x .+ one(eltype(x)) : x

It’s type stable for g(true, rand(3)) but not for g(true, view(rand(3,3), :, 1)).
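The claim about f can be checked without calling it, as a sketch, by querying inference for each argument-type combination (Base.return_types is an unexported helper in Base):

```julia
f(b, x) = b ? x + 1.0 : x

# Both branches yield a Float64.
Base.return_types(f, (Bool, Float64))

# One branch yields a Float64, the other an Int, so the result is a Union.
Base.return_types(f, (Bool, Int))
```

The same function is stable or unstable depending purely on which types are probed, which is exactly the difficulty for an automatic tool.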


#13

So for

g(b, x) = b ? x .+ one(eltype(x)) : x

I am suggesting that if the tests file contains:


using Test, SparseArrays   # sprand lives in SparseArrays

@test length(g(false, rand(3,3))) == 9
@test length(g(true,  rand(3,3))) == 9
@test length(g(true, view(rand(3,3), :, 1))) == 3
@test length(g(false, view(rand(3,3), :, 1))) == 3
@test length(g(true, sprand(3,3,0.5))) == 9
@test length(g(false, sprand(3,3,0.5))) == 9
@test length(g(true, view(sprand(3,3,0.5), :, 1))) == 3
@test length(g(false, view(sprand(3,3,0.5), :, 1))) == 3

Then the types tested by code_warntype should be, in order:

  1. Bool and Array{Float64,2}
  2. Bool and SubArray{Float64,1,Array{Float64,2},Tuple{Colon,Int64},true}
  3. Bool and SparseMatrixCSC{Float64,Int64}
  4. Bool and SubArray{Float64,1,SparseMatrixCSC{Float64,Int64},Tuple{Colon,Int64},false}

It could also do a more general search of the types that show up in the package/tests, and do some kind of fuzzing-test-like combination of those that are allowed, randomly combined say 10 times without repetition, with the chance of each type given by its frequency.

Alternatively, one could use SnoopCompile (https://github.com/timholy/SnoopCompile.jl) to generate the list of types to check.
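A rough sketch of the test-driven idea, assuming the argument values have already been collected from the test calls somehow. The helper unstable_signatures is a made-up name, not an existing API, and it only looks at inferred return types rather than at every local variable.

```julia
# Hypothetical helper: returns the argument-type tuples for which the
# inferred return type of `fn` is not concrete.
function unstable_signatures(fn, example_args)
    flagged = Any[]
    for args in example_args
        argtypes = map(typeof, args)              # e.g. (Bool, Matrix{Float64})
        rts = Base.return_types(fn, argtypes)     # one entry per matching method
        all(isconcretetype, rts) || push!(flagged, argtypes)
    end
    return flagged
end

g(b, x) = b ? x .+ one(eltype(x)) : x

# Argument tuples as they might be harvested from the test file.
examples = [
    (true, rand(3, 3)),
    (true, view(rand(3, 3), :, 1)),
]

unstable_signatures(g, examples)   # only the view case is flagged
```

An editor could then attach each flagged signature to the test line it came from.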


#14

"I think it’d be great to get better IDE support for detecting type instabilities. "
I like this idea. Great!


#15

I really like your idea. Currently I don’t have enough experience with code_warntype. I can help establish the server communication between Atom and Julia. Here is the future version of linter-julia (which is just the back end for linter): https://github.com/TeroFrondelius/linter-julia/blob/JSON/lib/linter-julia.coffee


#16

Lint can certainly do some of this, though at the moment to a limited extent. Using @code_warntype is fairly risky as it may require executing an arbitrary amount of other code first.


#17

Aside, in case someone reading this thread is not aware: there is the Base.Test.@inferred macro to help test that functions are type stable.
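A minimal sketch of how that looks in a test file. Note that in current Julia releases the macro lives in the Test standard library rather than Base.Test; the function f is the example from earlier in this thread.

```julia
using Test

f(b, x) = b ? x + 1.0 : x

# Passes: the actual return type matches the inferred one (Float64).
@inferred f(true, 1.0)

# @inferred throws an error here, because the call returns a Float64
# while inference can only promise Union{Float64, Int64}.
@test_throws ErrorException @inferred f(true, 1)
```

Like any test, this only covers the argument types you actually write down.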


Why *wouldn't* one want to use @inferred in package tests?


#18

I agree this would be a great feature. The main tricky issue is how you map from lowered code back to source, but that’s a solvable problem and the pieces are already around in various places. The fact that we can be heuristic makes it easier, because we’re allowed to be wrong in unusual cases, e.g. when there are multiple variables of the same name shadowing each other.

Doing it automatically, like a lint warning that you don’t have to think about, is harder for the reasons others have pointed out. But a common pattern is to define a function first and then test it with some example inputs. Juno can pick up a call to foo(1, 2) when you evaluate it, then run code_typed in the background. Then we can show you variable types on hover in the editor, or alert you to any non-concrete types.
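As a very rough sketch of the background step, assuming the editor hands over the function and the evaluated argument values: report_call is a made-up name and not part of Juno, and this version only inspects the inferred return type, not per-variable types.

```julia
# Hypothetical editor hook: run inference for the observed call and
# collect a message for every non-concrete return type.
function report_call(fn, args...)
    argtypes = map(typeof, args)
    messages = String[]
    # code_typed returns one (typed code => return type) pair per method.
    for (_, rettype) in Base.code_typed(fn, argtypes)
        isconcretetype(rettype) ||
            push!(messages, "$(fn)$(argtypes) returns non-concrete $(rettype)")
    end
    return messages
end

foo(a, b) = a > b ? a : 1.0 * b

report_call(foo, 1, 2)       # one message: the result may be an Int or a Float64
report_call(foo, 1.0, 2.0)   # no messages: Float64 in, Float64 out
```

Because the types come from a call the user just evaluated, there is no need to guess which signatures to probe.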


#19

That sounds potentially fantastic. Especially if it can be made relatively unobtrusive, so as not to annoy/deter the large and important group of users who just want to create some code without worrying about performance and type stability.


#20

I really like this idea, and it’s much better than anything I originally had in mind.

If it’s implemented this’ll give me some incentive to start using Atom, which I’m still frustrated with because I’m still unable to make it sufficiently “terminal-like” for my preferences. (Lucky for me there is the julia-vim package.)