Hello.
Where can I find a comparison of the speed of Julia v1.0 vs other versions such as v0.7, v0.6, v0.5 or against other languages?
Please don't ask for it; we're happy with Julia as it is at the moment, and reaching 1.0 was itself a big achievement. Besides, benchmarks are rarely done right, and the old one is already flawed in some ways. Julia 1.0 was meant to achieve language API stability, and the focus now should be on getting the ecosystem to catch up. The 1.x releases were planned to focus more on compiler optimizations. I don't see a benefit in such a microbenchmark now, especially since Julia has turned out to be more efficient for large multi-file projects.
You're right of course, the focus was on API stability, not optimizations, but...
Julia 1.0 and 0.7 (which, Juan, are essentially the same, except that deprecation warnings in 0.7 are errors in 1.0) are definitely faster than 0.6. It's pretty common to see a free 10% improvement or more, but it varies.
There was also a thread a few months ago referencing an econ article that benchmarked a bunch of languages, including Julia 0.2.
Julia 0.6 did much better than Julia 0.2 relative to a few other languages. Here's the thread: A Comparison of Programming Languages in Economics - #4 by tkoolen
Julia 1.0 also starts much faster than Julia 0.6. Fantastic work all around.
I was trying to decide what version to use now.
I know it's a great achievement, and it will push people to use Julia.
Anyway, I'm not only interested in the speed of v1.0 but also in seeing the evolution of all versions up to now.
The microbenchmark results for 1.0 should be up on julialang.org within a few days. 0.7 was a little slower than 0.6, as measured by the geometric mean of the microbenchmarks. But keep in mind that's a not-necessarily-representative set of benchmarks, and that lots of optimization will surely occur for 1.x now that the syntax and Base functionality are solid.
On julialang.org I can see a plot with the results for several languages.
But there is just one result for Julia.
How can we find the results broken down by Julia version?
I've yet to find out which optimization or improvement we're benefiting from, but Optim seems to be quite solidly 1.5 to 2x faster between 0.6.4 and 1.0.
Unless you have mission-critical software that already runs smoothly on v0.6, I would recommend transitioning to v1.0 and trusting that the occasional performance regression will be fixed, especially if you are willing to help with an MWE.
That said, I find I get a 10-50% improvement "for free". Occasionally even more, but that comes from consciously using idioms in v0.6 which were suboptimal there but expected to work better from v0.7 onwards (small unions).
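To make the small-union point concrete, here is a minimal sketch (my own toy example, not code from this thread) of the idiom of returning `nothing` as a "not found" sentinel, which creates a small `Union` return type that 0.7+ optimizes well:

```julia
# Returning `nothing` from a "not found" branch gives this function a
# small Union{Float64, Nothing} return type. On 0.6 this was a
# performance trap; from 0.7 onwards the compiler handles small unions
# efficiently.
function first_positive(xs)
    i = findfirst(x -> x > 0, xs)   # i is Union{Int, Nothing}
    return i === nothing ? nothing : xs[i]
end

first_positive([-1.0, 2.0, 3.0])  # 2.0
first_positive([-1.0, -2.0])      # nothing
```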
I can probably retrieve and post comparative data for 0.4 through 1.0 here, probably by the end of the week. What I have at the moment is between 0.6 and a fairly late 0.7 version.
Hearing that they showed a slight regression in 0.7/1.0 makes me more inclined to think those benchmarks aren't representative of code "in the wild" than to think there actually was a regression on average.
As a simple example, the benchmarks don't use `@inbounds`, so bounds checks prevent auto-vectorization when indexing into arrays. And Julia's (LLVM's?) block vectorizer got better. This isn't seen by the benchmarks. Constant propagation through function boundaries, improved handling of small unions, better inlining heuristics, etc. I doubt I know half the improvements.
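For concreteness, a toy sketch (mine, not the actual benchmark code) of what `@inbounds` buys you: eliding the bounds checks makes the loop eligible for SIMD vectorization, and `@simd` additionally permits reassociating the floating-point sum:

```julia
# With bounds checks on every access, LLVM often cannot vectorize.
function sum_checked(v)
    s = 0.0
    for i in 1:length(v)
        s += v[i]              # bounds-checked access
    end
    return s
end

# @inbounds elides the checks; @simd allows reassociating the sum,
# so the loop can compile down to SIMD instructions.
function sum_fast(v)
    s = 0.0
    @inbounds @simd for i in 1:length(v)
        s += v[i]
    end
    return s
end
```

Both return the same result here; the difference shows up in the generated code (`@code_llvm`) and in timings on large arrays.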
As another data point: benchmarks for RigidBodyDynamics improved 15-30 percent by switching from 0.6 to 0.7/1.0. That code was already optimized pretty well.
FWIW, here are microbenchmark results for julia-0.6.4, 0.7.0, and 1.0.0. There's some improvement in matrix_statistics and recursion_fibonacci and some degradation in parse_integers and print_to_file.
@kristoffer.carlsson has already found a factor of two improvement for the integer parsing library code (https://github.com/JuliaLang/julia/pull/28661) which should get parse_integers back down to where it was or better. If there's a similar fix for printing ints then the 1.0.x microbenchmarks will show slight improvement over 0.6 overall. Of course the microbenchmarks are in no way a representative sample of real-world code.
| benchmark | 0.6.4 | 0.7.0 | 1.0.0 |
|---|---|---|---|
| iteration_pi_sum | 27.37 | 27.67 | 27.66 |
| matrix_multiply | 70.24 | 70.22 | 70.32 |
| matrix_statistics | 8.513 | 7.286 | 7.323 |
| parse_integers | 0.132 | 0.221 | 0.218 |
| print_to_file | 6.860 | 10.833 | 10.870 |
| recursion_fibonacci | 0.0406 | 0.0302 | 0.0302 |
| recursion_quicksort | 0.248 | 0.261 | 0.259 |
| userfunc_mandelbrot | 0.0565 | 0.0527 | 0.0527 |
Nice.
I can see it's quite stable, except that, strangely, parse_integers and print_to_file now take roughly twice as long.
Wow. That's incredibly consistent, especially considering that the optimizer was completely rewritten.
Of course it would be very interesting to know what happened to parsing and printing.
https://github.com/JuliaLang/julia/pull/28670 should improve print_to_file as well.
That's exactly what I meant in the first comment in this thread: these microbenchmarks don't reflect the actual improvements that have been made. In my real-world large codes I see about a 20-50% improvement moving from 0.6.4 to 1.0. I'm still happy, though, because the most important test in my opinion, matrix_statistics, got improved. That said, this specific benchmark tests looping performance rather than matrix statistics; I haven't seen anyone doing statistics on a tiny 5-by-5 matrix before, and choosing a medium-sized, more practical matrix would be fairer.
Is there any prediction on how fast Julia can be compared to C in the future?
I mean theoretical limits due to the way it manages data, collects garbage, and accesses memory.
What areas can be improved?
What areas are already state-of-the-art?
It's as fast as C if you don't trigger the garbage collector. It's common to pre-allocate memory, or just use stack memory, so that it doesn't get triggered in the most sensitive parts of your code. A really cool example I saw recently, for low-dimensional optimization problems:
It doesn't allocate any memory, and runs incredibly fast. For low-dimensional problems, you'd be hard pressed to find anything faster.
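As a minimal sketch of the pre-allocation idea (the function name and numbers are mine, not from the package above): write a mutating function that fills a caller-provided buffer, so the hot loop performs zero allocations and the garbage collector stays idle:

```julia
# In-place scaling: the trailing `!` is the Julia convention for a
# function that mutates its first argument.
function scaleto!(out, v, a)
    @inbounds for i in eachindex(out, v)
        out[i] = a * v[i]
    end
    return out
end

v = rand(1000)
buf = similar(v)         # allocate the output buffer once, up front
for a in 1.0:100.0
    scaleto!(buf, v, a)  # no allocations inside the hot loop
end
```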
I think Julia in practice will often be faster than C, because it is easier to specialize code for a given problem.
Given two libraries, one written in C and one in Julia, I wouldn't bet on the C library being faster.
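A toy illustration (my example) of that specialization point: Julia compiles a separate, type-specialized native instance of one generic definition for each concrete argument type, where a C library typically ships a single routine per hand-chosen type:

```julia
# One generic definition...
function total(xs)
    s = zero(eltype(xs))
    for x in xs
        s += x
    end
    return s
end

# ...compiled into distinct native specializations per argument type:
total([1, 2, 3])        # machine code specialized for Vector{Int}
total([1.0, 2.0, 3.0])  # a separate specialization for Vector{Float64}
```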
As an example, here are two libraries by the same author (Steven G. Johnson, someone known for writing high-performance software):
JuliaMath/Cubature.jl # written in C; has the advantage that it also offers p-cubature
JuliaMath/HCubature.jl # pure-Julia multidimensional h-adaptive integration
# session started with -O3 --depwarn=no
julia> using HCubature, Cubature, StaticArrays, BenchmarkTools
julia> f(x) = exp(-x' * x/2)/2
f (generic function with 1 method)
julia> @btime HCubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), rtol=1e-8)
2.517 ms (63938 allocations: 1.70 MiB)
(3.1415926534311005, 3.141588672705139e-8)
julia> @btime Cubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), reltol=1e-8)
5.425 ms (193752 allocations: 8.28 MiB)
(3.1415926534311027, 3.141588673692509e-8)
julia> @btime HCubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), rtol=1e-12) # Julia
62.269 ms (1448586 allocations: 36.86 MiB)
(3.1415926535897993, 3.1233157511412803e-12)
julia> @btime Cubature.hcubature(f, SVector(-20.,-20.), SVector(20.,20.), reltol=1e-12) # C
146.159 ms (4315742 allocations: 184.39 MiB)
(3.1415926535897976, 3.1411238232300217e-12)
Chris Rackauckas also explains the advantages Julia's late compilation provides here:
I also gave an example of writing optimized kernels using SIMD intrinsics in pure Julia here: matmul post. At the end, I compared multiplying two 200x200 matrices with that Julia code against OpenBLAS, which has kernels written in assembly (the Skylake-X OpenBLAS kernel). Julia took 147.202 μs, OpenBLAS took 335.363 μs. (To be fair, some of that difference was overhead that I skipped in Julia by taking care of it all at compile time, but that again presents an advantage of Julia's late compilation in practice.)
It's possible to write slow code in any language, but it's also definitely possible to write some of the fastest code in Julia. More than that, you can write the fastest generic code for libraries aimed at end users who are going to try to do who-knows-what.
Do you think we will see OpenBLAS, MKL, and similar libraries written completely in Julia?