How smart is the 0.7 compiler?


#1

From recent postings in this forum, I conclude that the 0.7 compiler has many new optimizations, and I would be interested if someone knowledgeable could explain their reach or else point to a PR in GitHub that describes them. Three in particular come to mind:

  • Unions: AFAIK, in 0.6 an array of Union objects is inefficient because each entry of the array is boxed. In 0.7, some Union objects are handled with flags instead of boxes.

  • Tuples and immutable objects containing non-bits types. If (a,b) is a tuple and a is a non-bits type, then the tuple in general is heap-allocated. But even as long ago as 0.4, this wasn’t always true, e.g., if the tuple is part of a list of arguments to a function invocation f((a,b),c,d) then no heap object was allocated. What are cases in 0.7 where heap-allocation of an object containing non-bits types is avoided?

  • Propagation of constants. In 0.6, the mechanism to propagate constants for compilation into efficient code is the Val type. I’ve read in this discourse that constant propagation is much more powerful in 0.7. Is there a PR that describes its reach?
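For reference, the Val idiom mentioned in the last bullet looks roughly like this. It is a minimal sketch written in 0.7 syntax (`Val(2)`; in 0.6 one would write `Val{2}`), and `take_first` is a hypothetical helper name, not something from Base:

```julia
# Hypothetical helper: the count N travels in the type domain via Val,
# so the compiler knows the length of the result tuple at compile time.
take_first(t::Tuple, ::Val{N}) where {N} = ntuple(i -> t[i], Val(N))

# The caller has to wrap the constant explicitly.
head = take_first((1, 2.0, "three"), Val(2))
```

The cost of this style is that every caller must remember to wrap the constant, which is the burden that better constant propagation is meant to reduce.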


#2

Here’s one of the relevant PRs for the handling of small unions: https://github.com/JuliaLang/julia/pull/20593 and one for interprocedural constant propagation: https://github.com/JuliaLang/julia/pull/24362
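As a rough illustration of the kind of code constant propagation is aimed at (a sketch only; `pick` and `second` are hypothetical names, and whether a given call is actually propagated depends on compiler heuristics):

```julia
# Hypothetical helpers for illustration; not taken from Base or from the PR.
pick(t::Tuple, i::Int) = t[i]

# The literal 2 is a constant at this call site. If inference only sees
# `i::Int`, it cannot tell which element of a heterogeneous tuple comes back.
# When the constant is propagated into `pick`, the compiler may be able to
# infer the concrete element type without any Val wrapper.
second(t::Tuple) = pick(t, 2)

x = second((1, "a", 3.0))
```

Running `@code_warntype second((1, "a", 3.0))` in the REPL is one way to check what was actually inferred on a given version.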


#3

I would describe the 0.7 compiler as a “very stable genius”. It’s hard to provide specific PRs for these changes because most of the improvements are spread out over dozens of PRs, each of which tackles part of the problem. This is one of the reasons compiler improvements tend to happen “all at once”: the last PR just capitalizes on years worth of preceding work.

  • Unions: small isbits unions like Union{Nothing, Int} or Union{Float64, Int} can be handled well in code generation most of the time and can be stored efficiently inline in arrays with a separate bitvector of indicators storing which of the union types each value has.

  • Tuples and immutables: hard to say. I think there’s some cutoff around 16 for tuples, and other cases depend on compiler details. If you’ve got a pathological case, please do report it. Keep in mind that for big or complex immutable values, passing them around by value is often not actually better.

  • Propagation of constants: this one does have a single fairly clear PR – https://github.com/JuliaLang/julia/pull/24362.
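To make the small-union case from the first bullet concrete, here is a minimal sketch, assuming 0.7 or later. `sum_skipping_nothing` is a hypothetical name, and `Base.isbitsunion` is an unexported internal function that may change:

```julia
# An array whose element type is a small union of isbits types.
v = Union{Int, Nothing}[1, nothing, 3]

# Internal (unexported) predicate: reports whether the element type
# qualifies for the inline storage described above.
stored_inline = Base.isbitsunion(eltype(v))

# Hypothetical helper: the `x !== nothing` check lets the compiler
# split the union, so the addition works on a plain Int.
function sum_skipping_nothing(v::Vector{Union{Int, Nothing}})
    total = 0
    for x in v
        if x !== nothing
            total += x
        end
    end
    return total
end

total = sum_skipping_nothing(v)
```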


#4

Loosely related: is it possible to mmap such an array of small isbits unions? I could not figure it out.


#5

That sounds pretty insulting to the v0.7 compiler (given the stability and demonstrated IQ of the person who used that phrase to describe himself!) :nerd_face::rofl:

Yes, it is pretty unstable in the sense that it is changing very rapidly, and I recall somebody in the core team saying that the compiler wasn’t actually very smart (yet!) (things are so fast because of the underlying design of the language, not because of lots of compiler tricks).

I do think it does a pretty darn good job, esp. in v0.7, although yes, it can certainly do better in the future.


#6

Yes, I believe it should be possible. @jameson or @quinnj should be able to say more on that.


#7

I think that time is past and it now is truly pretty stable, especially as I’ve been keeping an eye on the milestones on github.

I have to confess that I am getting slightly worried about very long compile times. Having come from C++ I never care about this myself (except in the pathological case of the plotting packages which as far as I know is the result of unfortunate design decisions and is fixable in principle), but my colleagues who use Python seem pretty sensitive to this for some reason. Any idea if there will be any effort to improve compile times before 1.0?

On a distinct, but probably related note, any idea what causes the REPL to take so long to come up in 0.7? There is a minor, but definitely noticeable, delay of probably 500ms until the prompt comes up. Might it be fixable? Just anticipating the barrage of complaints we will get when it’s in beta.


#8

Hmmmm, haven’t thought much about this, surprisingly. I know they serialize and deserialize just fine. My guess is that it’s currently not possible because Mmap.mmap checks that your array element type is isbits, which won’t be true for Union{Int, Nothing}. Now, we could probably make an exception for isbits union element types, so then it would just be a question of whether unsafe_wrap happens to do the right thing underneath (or whether it needs a little help). It’d certainly be worth opening an issue about so we can keep thinking through the ramifications.
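The distinction can be seen directly. This is a sketch assuming 0.7 or later (where the type-level check is spelled `isbitstype`); `Base.isbitsunion` is an unexported internal function, so treat its use here as an assumption that may not hold in every version:

```julia
T = Union{Int, Nothing}

# A Union is not itself an isbits type, so a check of this kind rejects it...
plain_isbits = isbitstype(T)

# ...even though it is a union of isbits types and can be stored inline.
union_isbits = Base.isbitsunion(T)

# For contrast, a union that includes a heap-allocated type does not qualify.
mixed_isbits = Base.isbitsunion(Union{Int, String})
```

An exception in the mmap path would presumably accept the second case while still rejecting the third.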


#9

opened an issue about it: