Fantastic progress in master branch!

I’ve recently switched to the v0.7 master branch from v0.6 for the development of a computation-heavy package I’m making, and I am very impressed! The transition was really easy. I essentially had to adapt the way I was using CartesianRange, which has now a different syntax, but the rest was all really smooth. But more importantly, I have been blown away by the efficiency gains. Even though the master branch is heavily in flux still, there must have happened something cool in the internals, because critical (fully type-stable) parts of my package that were doing like 54 k allocations for certain test benchmarks in v0.6 have magically gone down to 117 allocations in v0.7! Of course the performance has gone up at the same time. There are some instabilities and segfaults too, of course, but that’s to be expected. Overall I’m very excited about v0.7.

It would be great to know exactly what kind of wizardry is going on in the master branch, and what further progress to expect before 1.0, but I would probably not fully understand. I hear words like linear IR, but I’m afraid I lack the background to really get what it is about.

In any case I would like to enthusiastically congratulate the core developers for their amazing work. I imagine it must be quite stressful at the moment with the upcoming 1.0 release, so receive some big cheers from Spain, your efforts are much appreciated!

30 Likes

IMO, the amazing wizardry which could just blow everything away hasn’t even merged yet:

https://github.com/JuliaLang/julia/pull/24362

though one of the big deals merged yesterday:

https://github.com/JuliaLang/julia/pull/22194

8 Likes

I have been watching the NamedTuples development for a long time, so I was very happy to see this merge. Hopefully it will soon enable fast keywords, that would be huge!

Could you point us to any thread explaining what the IPO thing is about?

The examples from here are the kind that would benefit from it:

The basic idea is that if you hardcode some constant, like

qr(A,true)

then that true could be propagated and compilation can take it into account. This would allow one to have “not type-stable functions” with switches by booleans, yet compilation would know how to make it type-stable by pushing the constants all the way through. This kind of thing can also make branches disappear and other optimizations just by using more information at compilation time.

Julia currently optimizes constants within a function. If you do:

a = false 
if !a
  # do stuff
end

it’ll compile away that branch if a isn’t used anywhere else. And in some cases this is already special cased, like for tuples a[1] can be type-stable even when the tuple is heterogeneous (when indexing with a literal instead of a variable).

Inter-procedural constant propogation means that in general functions can push constants into the other functions inside of them, and optimize the whole chain together. An example of this would be if you hardcode a call by symbol in a NamedTuple:

data[:this_column]

There’s a getindex function on symbol that needs to be called to convert that symbol into an index data[i]. Well if you have this constant propogation, the compiler can find out that data[:this_column] always turns into data[i], and thus replace the symbols with the index (i.e. replace the constant with the constant result of using that constant in a function) at compilation time to make it faster. That’s seems to be the main impetus that’s pushing it right now:

tl;dr:

This plus the faster keyword arguments / NamedTuple change means you can have a crap ton of options, and if the user hardcodes in the options:

sol = solve(prob,Tsit5(),save_everystep=false,save_first=false, ...)

then Julia will compile a specific version of the function for that set of options and optimize on those choices. DifferentialEquations.jl has a few options, so I am quite excited to see this.

8 Likes

Thanks a lot! That begins to make sense. I assume it only applies to hard coded constants, not to user inputs?

So if someone does

function foo(a::AbstractArray{Bool})
  a[1] ? 1 : 2
end

and then calls foo([true, false]), will constant propagation allow to elide the ? check altogether?

if it’s written like that, i.e. a literal, then it can be propagated, and my understanding of the PR is that it will propagate it.

1 Like

I suspect a lot of the performance improvement had come from a change that @keno made plus follow up work by @yuyichao to take advantage of that change to improve performance. The original change made it possible to defer figuring out where objects need to be “rooted” so that garbage collection knows that an object cannot be safely freed until much later in the optimization process (during LLVM optimization), at which point it’s much easier to recognize cases where rooting isn’t actually needed at all, thereby allowing it to be avoided entirely. That, by itself doesn’t speed things up that much, but it opens the door for a lot of big improvements which Yichao has been taking good advantage of in a series of PRs. I believe the change may also be key to enabling @jameson’s PR above. I’ve said it before but it bears repeating: we’ve barely scratched the surface of the kinds of optimizations that we can do in Julia.

18 Likes

This is so exciting!!! Amazing work people.

The microbenchmarks improve from 1.09 to 1.05 in geometric mean relative to C, going from 0.6.0 to 0.7.0-DEV. Depending on how you look at it, either it’s a mere 4% improvement, or it’s cutting the overhead over C by a factor of two!

Those numbers are constant to three digits on repeated runs on an unloaded machine.

5 Likes

Thats pretty amazing. 4% doesn’t sound like a lot until you think about how fast this is converging to C speeds. If the claims that we’ve barely even scratched the surface of Julia’s optimization potential then perhaps its not too pie in the sky to imagine a future where we blow past C.

By the way, how is Julia compared to Go these days?

I find it mindblowing that, being already so damn close to C, you have still “barely scratched the surface” of possible optimisations. Julia is already so smart! At this rate we will not even need to care about type-stability in the future, ha!

I really hope that many more people would start seeing the potential here, and begin investing in Julia… As a language I’m not sure its getting all the attention it deserves yet, at least among my colleagues in physics. Everybody is just doing Python and Matlab wherever I look!

4 Likes

@Mason, this thread is pretty cool

It has updated the benchmarks of Julia against other languages. The big surprise for me, and some other people, has been SciLua, which apparently has a really sophisticated JIT compiler… It appears to be about as fast as Julia at the moment… without Julia scratching the surface :smiley:

1 Like

@pablosanjose

I really hope that many more people would start seeing the potential here, and begin investing in Julia… As a language I’m not sure its getting all the attention it deserves yet, at least among my colleagues in physics. Everybody is just doing Python and Matlab wherever I look!

Hopefully once 1.0 is out, people will have less reasons to be skeptical and be more open to moving over to Julia. There’s huge potential here, especially for physicists! I’m doing an exact diagonalization of the Hubbard model in Julia right now and its a treat.

It has updated the benchmarks of Julia against other languages.

I’ve read that thread but theres still no Go benchmarks as far as I can tell.

The big surprise for me, and some other people, has been SciLua, which apparently has a really sophisticated JIT compiler

Yeah that really surprised me too! The past few years really seem to be a bit of a JIT renaissance. Or maybe I’m just only paying attention now :stuck_out_tongue:

@Mason Go is back and the results are up on https://julialang.org/ and Julia Micro-Benchmarks . Go comes in at about 1.5 on the geometric mean. The discussion about the benchmarks updates gradually shifted from Discourse to GitHub. For what it’s worth the geometric means for all languages are

cputime	lang
1.000	C
1.054   Julia 0.7.0 [EDIT: originally reported as 1.045]
1.093	Julia 0.6.0
1.109	LuaJIT
1.496	Fortran
1.511	Go
2.745	Java
3.448	JavaScript
12.790	Matlab
14.121	Mathematica
16.610	Python
70.955	R
575.269	Octave

That’s merging the 0.7.0 result into the list from 0.6.0 and perhaps misremembering the last digit. Three digits are significant, and caveat that these numbers and rankings depend on the choice of benchmarks in the suite.

[EDIT: I did indeed misremember the digits for julia-0.7.0. The correct geoemetric mean is 1.054]

12 Likes

Wouldn’t it make sense to add a row at the bottom of the benchmarks table with the geometric mean of each column? That would help getting a better sense of the overall data present in the table.

Also, color-coding the entries red or green (for >1.0 or <1.0 of C) would help in this regard. I can send a PR adding these changes if there’s agreement.

5 Likes

For what it’s worth, I think those are good suggestions.

Putting the geometric mean at the bottom of the table is a great idea. I haven’t done it myself because the html table is produced by test/perf/micro/bin/table.pl, and I am perl-less. BTW, Stefan Karpinsky suggested rewriting that code in Julia so such changes would be easier for the Julia community. I haven’t had time for that yet.

Another fix that needs doing is printing -- for missing data rather than 0.0. There is one missing datapoint: print_to_file for JavaScript.

My feeling is that red/green color coding might be a bit much, visually. But it’s worth a try.

@waldyrious a PR on these would be great!

{EDIT: but getting a bit off-topic. Let’s continue this elsewhere, like https://github.com/JuliaLang/julialang.github.com/issues]

Ah, I didn’t realize that. Alas, I too am perl-less…

You mean Perl-free? :wink:

1 Like

I suppose if no one else gets around to it, I’ll port it myself since it seems that no one else around here has much Perl knowledge.

1 Like