Any benchmark of Julia v1.0 vs older versions


#1

Hello.

Where can I find a comparison of the speed of Julia v1.0 vs other versions such as v0.7, v0.6, v0.5 or against other languages?


#2

Please don’t ask for that; we’re happy with Julia as it is at the moment, and reaching 1.0 was itself a big achievement. Besides, benchmarks are rarely done right, and the old ones are already flawed in some ways. Julia 1.0 was meant to achieve language API stability, and the focus now should be on getting the ecosystem to catch up. The 1.x releases were planned to focus more on compiler optimizations. I don’t see a benefit in such microbenchmarks now, especially since Julia has turned out to be more efficient for large multi-file projects.


#3

You’re right of course – the focus was on API stability, not optimizations, but…
Julia 1.0 and 0.7 (which, Juan, are essentially the same, except that what are deprecation warnings in 0.7 are errors in 1.0) are definitely faster than 0.6. It’s pretty common to see a free 10% improvement or more, but it varies.
There was also a thread a few months ago referencing an econ article that benchmarked a bunch of languages, including Julia 0.2.
Julia 0.6 did much better than Julia 0.2 relative to a few other languages. Here’s the thread: A Comparison of Programming Languages in Economics

Julia 1.0 also starts much faster than Julia 0.6. Fantastic work all around.


#4

I was trying to decide what version to use now.
I know it’s a great achievement, and it will push people to use Julia.

Anyway, I’m not only interested in the speed of v1.0 but also in seeing the evolution of all versions up to now.


#5

The microbenchmark results for 1.0 should be up on julialang.org within a few days. 0.7 was a little slower than 0.6, as measured by the geometric mean of the microbenchmarks. But keep in mind that’s a not-necessarily-representative set of benchmarks, and that lots of optimization will surely happen in 1.x now that the syntax and Base functionality are stable.


#6

On julialang.org I can see a plot with the results for several languages.
But there is just one result for Julia.
How can we find the results broken down by Julia’s version?


#7

I’ve yet to work out which optimization or improvement we benefit from, but Optim seems to be quite solidly 1.5 to 2x faster going from 0.6.4 to 1.0.


#8

Unless you have mission-critical software that already runs smoothly on v0.6, I would recommend transitioning to v1.0 and trusting that the occasional performance regression will be fixed, especially if you are willing to help with an MWE.

That said, I find I get a 10-50% improvement “for free”. Occasionally even more, but that comes from consciously using idioms in v0.6 which were suboptimal there but expected to work better from v0.7 onwards (small unions).
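For context, here is a minimal sketch of the kind of small-Union idiom meant above (the data and names are illustrative, not from this thread): containers with element type `Union{Float64, Missing}` became much faster to iterate from 0.7 onwards thanks to union splitting.

```julia
# Element type is a small Union; 0.7+ compiles efficient specialized
# branches for it (union splitting), where 0.6 fell back to slower
# dynamic dispatch and boxing.
data = Union{Float64, Missing}[1.0, missing, 2.5, missing, 4.0]

total = sum(skipmissing(data))  # skips the missing entries; 7.5
```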


#9

I can probably retrieve and post comparative data for 0.4 through 1.0 here, probably by the end of the week.


#10

That comparison is between 0.6 and a fairly late 0.7 version.

Also, see https://github.com/JuliaLang/julia/pull/27030.


#11

Hearing that they showed a slight regression in 0.7/1.0 makes me more inclined to think those benchmarks aren’t representative of code “in the wild” than to think there actually was a regression on average.
As a simple example, the benchmarks don’t use `@inbounds`; without it, bounds checks can prevent auto-vectorization when indexing into arrays. And Julia’s (LLVM’s?) vectorizer got better, which the benchmarks don’t show. Constant propagation through function boundaries, improved handling of small unions, better inlining heuristics, etc. I doubt I know half the improvements.
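To illustrate the `@inbounds` point with a toy example (mine, not one of the official microbenchmarks):

```julia
# Without @inbounds, every x[i] carries a bounds check, which can stop
# LLVM's loop vectorizer. With @inbounds (and @simd) the checks are
# elided and the loop can auto-vectorize.
function sum_inbounds(x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i]
    end
    return s
end

sum_inbounds(collect(1.0:10.0))  # 55.0
```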


#12

As another data point: benchmarks for RigidBodyDynamics improved 15-30 percent by switching from 0.6 to 0.7/1.0. That code was already optimized pretty well.


#13

FWIW, here are microbenchmark results for julia-0.6.4, 0.7.0, and 1.0.0. There’s some improvement in matrix_statistics and recursion_fibonacci and some degradation in parse_integers and print_to_file.

@kristoffer.carlsson has already found a factor-of-two improvement for the integer-parsing library code (https://github.com/JuliaLang/julia/pull/28661), which should get parse_integers back down to where it was or better. If there’s a similar fix for printing ints, then the 1.0.x microbenchmarks will show a slight improvement over 0.6 overall. Of course, the microbenchmarks are in no way a representative sample of real-world code.

                      0.6.4     0.7.0     1.0.0
iteration_pi_sum     27.37     27.67     27.66
matrix_multiply      70.24     70.22     70.32
matrix_statistics     8.513     7.286     7.323
parse_integers        0.132     0.221     0.218
print_to_file         6.860    10.833    10.870
recursion_fibonacci   0.0406    0.0302    0.0302
recursion_quicksort   0.248     0.261     0.259
userfunc_mandelbrot   0.0565    0.0527    0.0527

#14

Nice.
I can see it’s quite stable, except, strangely, for parse_integers and print_to_file, which now take about twice as long.


#15

Wow. That’s incredibly consistent especially considering that the optimizer was completely rewritten.

Of course it would be very interesting to know what happened to parsing and printing.


#16

https://github.com/JuliaLang/julia/pull/28670 should improve print_to_file as well.


#17

That’s exactly what I meant in the first comment in this thread: these microbenchmarks don’t reflect the actual improvements that have been made. In my real-world large codebases I see about a 20-50% improvement moving from 0.6.4 to 1.0. I’m still happy, though, because the test I consider most important, matrix_statistics, improved. That said, this specific benchmark tests looping performance rather than matrix statistics; I had never seen anyone do statistics on a tiny 5-by-5 matrix before, and choosing a medium-sized, more practical matrix would be fairer.


#18

Is there any prediction of how fast Julia can be compared to C in the future?
I mean theoretical limits due to the way it manages data, collects garbage, and accesses memory.

What areas can be improved?
What areas are already state-of-the-art?


#19

It’s as fast as C if you don’t trigger the garbage collector. It’s common to preallocate memory, or just use stack memory, so that the GC doesn’t get triggered in the most performance-sensitive parts of your code. A really cool example I saw recently, for low-dimensional optimization problems:


It doesn’t allocate any memory and runs incredibly fast. For low-dimensional problems, you’d be hard-pressed to find anything faster.
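As a minimal sketch of the “use stack memory / preallocate” advice (illustrative code of mine, not the example referenced above): the StaticArrays.jl package (already used in the benchmark session below) gives fixed-size, stack-allocated vectors, and mutating a preallocated buffer in place avoids per-call allocation entirely.

```julia
using StaticArrays

# Fixed-size SVectors are stack-allocated: arithmetic on them
# triggers no heap allocation and hence no garbage collection.
a = SVector(1.0, 2.0, 3.0)
b = SVector(4.0, 5.0, 6.0)
c = a + b                    # SVector(5.0, 7.0, 9.0), no allocation

# For larger data, preallocate once and mutate in place:
function axpy!(out, α, x, y)
    @inbounds for i in eachindex(out)
        out[i] = α * x[i] + y[i]
    end
    return out
end

buf = zeros(3)
axpy!(buf, 2.0, [1.0, 2.0, 3.0], [10.0, 10.0, 10.0])  # buf == [12.0, 14.0, 16.0]
```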

I think Julia in practice will often be faster than C, because it is easier to specialize code for a given problem.
Given two libraries, one written in C, and one in Julia, I wouldn’t bet on the C library being faster.
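A tiny sketch of what “specialize code for a given problem” means in practice (my illustrative example): Julia compiles a fresh specialization of a higher-order function for each function argument, so the passed function can be inlined into the loop, whereas C typically pays an indirect call through a function pointer.

```julia
# sum(f, xs) is compiled specifically for f = abs2, so abs2 can be
# inlined into the reduction loop -- no function-pointer indirection.
total = sum(abs2, [1.0, 2.0, 3.0])   # 1 + 4 + 9 = 14.0

# The same mechanism applies to user-defined functions:
halve(x) = x / 2
total2 = sum(halve, [2.0, 4.0])      # 3.0
```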
As an example, here are two libraries by the same author (someone known for writing high performance software https://en.wikipedia.org/wiki/Steven_G._Johnson):
https://github.com/stevengj/Cubature.jl # Written in C, has the advantage that it also offers p-cubature
https://github.com/stevengj/HCubature.jl # Written in pure Julia

# session started with -O3 --depwarn=no
julia> using HCubature, Cubature, StaticArrays, BenchmarkTools

julia> f(x) = exp(-x' * x/2)/2
f (generic function with 1 method)

julia> @btime HCubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), rtol=1e-8)
  2.517 ms (63938 allocations: 1.70 MiB)
(3.1415926534311005, 3.141588672705139e-8)

julia> @btime Cubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), reltol=1e-8)
  5.425 ms (193752 allocations: 8.28 MiB)
(3.1415926534311027, 3.141588673692509e-8)

julia> @btime HCubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), rtol=1e-12) # Julia
  62.269 ms (1448586 allocations: 36.86 MiB)
(3.1415926535897993, 3.1233157511412803e-12)

julia> @btime Cubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), reltol=1e-12) # C
  146.159 ms (4315742 allocations: 184.39 MiB)
(3.1415926535897976, 3.1411238232300217e-12)

Chris Rackauckas also explains the advantages Julia’s late compilation provides here:

I also gave an example of writing optimized kernels using SIMD intrinsics in pure Julia here: matmul post. At the end, I compared multiplying two 200x200 matrices with that Julia code against OpenBLAS, which has kernels written in assembly: Skylake-X OpenBLAS kernel. Julia took 147.202 μs, OpenBLAS took 335.363 μs. (To be fair, some of that difference was overhead that I skipped in Julia by handling it at compile time, but that again presents an advantage of Julia’s late compilation in practice.)

It’s possible to write slow code in any language, but it’s also definitely possible to write some of the fastest code in Julia. More than that: the fastest generic code, for libraries aimed at end users who are going to try to do who-knows-what with them.


#20

Do you think we will see OpenBLAS, MKL, and similar libraries written completely in Julia?