Julia needs work at "Benchmarks Game": numbers seem off (up to 100x slower), maybe "Julia AOT" entries needed?


#1

There’s another open thread on this (post most questions at "Julia programs now shown on benchmarks game website"). I’m posting separately to give an overview and to discuss AOT compilation, as in https://github.com/JuliaLang/PackageCompiler.jl (?). Also, are the load numbers off, and is startup cost being counted?

A.

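Ruby used to be slow, but PHP at least is supposed to be fast these days, and Julia being behind those and Python 3 in some cases seems wrong. In the rows quoted below, the columns are: ratio to the fastest program, program, elapsed secs, mem, gz (compressed source size), cpu secs, and per-core cpu load: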

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/knucleotide.html
1.0 C++ g++ #2 3.83 156,104 1624 12.00 72% 73% 98% 72%
[…]
4.5 Lisp SBCL #6 17.37 541,992 2479 62.93 89% 88% 87% 100%
[…]
21 Python 3 #3 79.79 250,948 1967 5 min 98% 96% 96% 99%
[…]
81 Matz’s Ruby 5 min 126,652 637 16 min 95% 68% 84% 81%
[…]
100 Julia 6 min 1,584,956 870 6 min 2% 32% 56% 15%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/mandelbrot.html
3.3 Go #4 5.47 31,040 905 21.73 99% 99% 100% 99%
[…]
27 Julia 44.10 179,004 483 44.41 99% 0% 0% 1%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html
1.0 C++ g++ #9 3.67 118,620 809 11.91 75% 78% 99% 76%
1.0 C gcc #3 3.72 117,408 836 11.80 76% 95% 75% 72%
[…]
4.1 Chapel #2 15.10 361,908 470 47.92 100% 56% 73% 92%
4.2 Fortran Intel #2 15.35 101,984 1148 15.34 0% 0% 1% 100%
[…]
7.6 Dart 27.97 896,244 457 31.46 8% 20% 17% 71%
[…]
8.6 VW Smalltalk #3 31.71 375,520 959 79.03 55% 57% 72% 67%
[…]
17 JRuby #5 60.98 2,336,824 1083 224.55 92% 92% 95% 91%
[…]
18 PHP #6 66.56 736,600 868 244.42 94% 92% 91% 96%
21 VW Smalltalk 76.53 375,276 744 76.44 42% 0% 0% 58%
[…]
25 Python 3 92.72 448,844 589 5 min 87% 90% 96% 87%
26 Julia 96.71 710,224 436 96.91 0% 0% 0% 100%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/spectralnorm-julia-1.html
1.0 Rust #6 1.97 2,600 1126 7.86 100% 100% 100% 99%
1.0 C gcc #4 1.98 1,160 1139 7.86 99% 99% 99% 99%
1.0 Fortran Intel #3 1.98 1,684 638 7.89
[…]
2.0 Lisp SBCL #5 4.00 19,368 899 15.73 99% 99% 99% 99%
2.1 Haskell GHC #4 4.17 3,812 987 15.74 99% 97% 99% 98%
[…]
8.2 Dart #5 16.16 120,132 489 16.64 2% 2% 99% 2%
8.2 Julia 16.20 146,932 440 16.49 1% 1% 100% 1%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/fannkuchredux.html
1.0 C gcc #5 8.72 916 910 34.23 100% 100% 95% 99%
[…]
2.0 Go 17.83 1,480 900 71.04 100% 100% 100% 100%
[…]
2.1 Java 17.91 31,560 1282 70.25 99% 98% 99% 97%
[…]
3.0 Java AOT 25.93 8,488 1282 103.26 99% 100% 100% 100%
[…]
6.2 Julia 54.40 148,160 565 54.59 100% 0% 0% 1%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/regexredux.html
1.8 PHP 2.63 270,048 816 2.46 38% 39% 85% 38%
[…]
4.0 Julia 5.88 349,980 541 6.16 2% 3% 100% 2%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/how-programs-are-measured.html

How source code size is measured

[…] remove duplicate whitespace characters

A little unfair, as Python relies on whitespace for correctness (not just for readable indentation).
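To make that concern concrete, here is a minimal sketch, not the site's actual measurement script. It assumes "remove duplicate whitespace characters" means collapsing each run of the same whitespace character to one, and it assumes the "gz" column is a gzip-compressed size; the names `collapse_whitespace` and `gz_size` are hypothetical.

```python
import gzip
import re


def collapse_whitespace(source: str) -> str:
    """Collapse each run of the same whitespace character to a single one.

    Assumed reading of "remove duplicate whitespace characters".
    """
    return re.sub(r"(\s)\1+", r"\1", source)


def gz_size(source: str) -> int:
    """Bytes of the gzip-compressed, whitespace-collapsed source (assumed metric)."""
    return len(gzip.compress(collapse_whitespace(source).encode("utf-8")))


python_snippet = "for i in range(3):\n    if i:\n        print(i)\n"
collapsed = collapse_whitespace(python_snippet)

print(repr(collapsed))
print(gz_size(python_snippet))

# The collapsed text no longer nests the print under the if,
# so it is not the same (or even a valid) Python program.
try:
    compile(collapsed, "<collapsed>", "exec")
    still_valid = True
except SyntaxError:
    still_valid = False
print(still_valid)
```

Under these assumptions the measured text is smaller than anything that would still run as Python, while a brace-delimited language loses nothing it needs.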

[I would like to see Julia in the list given, to see how it stacks up, but I guess it then needs an entry for all programs.]

How CPU load is measured

The GTop cpu idle and GTop cpu total are taken before forking the child-process and after the child-process exits. The percentages represent the proportion of cpu not-idle to cpu total for each core.

On win32: GetSystemTimes UserTime IdleTime are taken before forking the child-process and after the child-process exits. The percentage represents the proportion of TotalUserTime to UserTime + IdleTime (because that’s like the percentage you’ll see in Task Manager).
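As I read the description above, the per-core figure is just (total - idle) / total over the interval the child process ran. A minimal sketch of that arithmetic, with made-up counter values (the function name `cpu_load_percent` and the sample numbers are hypothetical, and nothing here calls GTop itself):

```python
def cpu_load_percent(idle_before, total_before, idle_after, total_after):
    """Percent of non-idle time on one core between two counter samples."""
    total = total_after - total_before
    idle = idle_after - idle_before
    if total <= 0:
        return 0.0  # no time elapsed on this core's counters
    return 100.0 * (total - idle) / total


# Hypothetical (idle, total) counters per core, sampled before forking
# the child process and again after it exits.
before = [(1000, 5000), (2000, 5000), (1500, 5000), (1800, 5000)]
after = [(1010, 6000), (2990, 6000), (2500, 6000), (2790, 6000)]

loads = [
    round(cpu_load_percent(ib, tb, ia, ta))
    for (ib, tb), (ia, ta) in zip(before, after)
]
print(loads)  # [99, 1, 0, 1]
```

A pattern like this (one core near 100%, the rest near 0%) is what a single-threaded run would look like. Since the counters are per core rather than per process, anything else running on the machine would also show up in these percentages.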

B.
Some at: https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/julia.html

have “Bad Output” (and why is Julia only compared to Python 3 there?)

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/revcomp-julia-1.html

UNEXPECTED OUTPUT

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/fasta.html


#2

As has already been said, the current benchmarks at that site have not been optimized for speed at all. Some faster versions (and also fixes for the ones that produce bad output) are available at https://github.com/KristofferC/BenchmarksGame.jl.


#3

When will we see your improved benchmarks on the BenchmarksGame website?


#4

Take it easy, they only just got up there at all. Anyone who’s interested in making them faster should spend some time with https://github.com/KristofferC/BenchmarksGame.jl, get it in good shape and then the benchmarks on the site can be updated.


#5

Sorry, didn’t mean to sound impatient, could have phrased that better. I’m more just curious about how easy it will be for us to update the benchmarks in the future, i.e. ‘if we improve a benchmark, does it take hours or months to get it put up?’.

I had hoped that there’d be a repo we could just push to, with the changed benchmarks reflected on the website once it re-runs them, but I have no idea if that’s accurate.

I’m guessing the plan is to collect all the improved benchmarks and make one big change instead of many small changes?


#6

I personally have no idea how the submission process works for the BenchmarksGame website.


#7

I reordered by the cpu column (the last time column) to show more clearly how these same languages/best implementations would rank if none of the entries were multi-threaded (or where we could be with just that change, assuming linear speed-up). E.g., we would then beat Ruby in the first benchmark:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/knucleotide.html
1.0 C++ g++ #2 3.83 156,104 1624 12.00 72% 73% 98% 72%
4.5 Lisp SBCL #6 17.37 541,992 2479 62.93 89% 88% 87% 100%
21 Python 3 #3 79.79 250,948 1967 5 min 98% 96% 96% 99%

100 Julia 6 min 1,584,956 870 6 min 2% 32% 56% 15%
81 Matz’s Ruby 5 min 126,652 637 16 min 95% 68% 84% 81%

Go’s advantage is then much smaller, about 100% faster:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/mandelbrot.html
3.3 Go #4 5.47 31,040 905 21.73 99% 99% 100% 99%

27 Julia 44.10 179,004 483 44.41 99% 0% 0% 1%
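The arithmetic behind that, as a quick sketch using only the two mandelbrot rows quoted above (elapsed secs and cpu secs):

```python
# (elapsed secs, cpu secs) from the mandelbrot rows quoted above
go_elapsed, go_cpu = 5.47, 21.73
julia_elapsed, julia_cpu = 44.10, 44.41

elapsed_ratio = julia_elapsed / go_elapsed  # how much slower on the wall clock
cpu_ratio = julia_cpu / go_cpu              # how much slower in total cpu time

print(f"elapsed: {elapsed_ratio:.2f}x, cpu: {cpu_ratio:.2f}x")
# elapsed: 8.06x, cpu: 2.04x
```

So the gap is roughly 8x on elapsed time but only about 2x on cpu time, and most of the difference is Go using all four cores.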

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html
1.0 C gcc #3 3.72 117,408 836 11.80 76% 95% 75% 72%
1.0 C++ g++ #9 3.67 118,620 809 11.91 75% 78% 99% 76%

4.2 Fortran Intel #2 15.35 101,984 1148 15.34 0% 0% 1% 100%
7.6 Dart 27.97 896,244 457 31.46 8% 20% 17% 71%
4.1 Chapel #2 15.10 361,908 470 47.92 100% 56% 73% 92%
21 VW Smalltalk 76.53 375,276 744 76.44 42% 0% 0% 58%
8.6 VW Smalltalk #3 31.71 375,520 959 79.03 55% 57% 72% 67%

26 Julia 96.71 710,224 436 96.91 0% 0% 0% 100%
17 JRuby #5 60.98 2,336,824 1083 224.55 92% 92% 95% 91%
18 PHP #6 66.56 736,600 868 244.42 94% 92% 91% 96%
25 Python 3 92.72 448,844 589 5 min 87% 90% 96% 87%

The low-hanging fruit seems to involve multi-threading.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/spectralnorm-julia-1.html
1.0 Rust #6 1.97 2,600 1126 7.86 100% 100% 100% 99%
1.0 C gcc #4 1.98 1,160 1139 7.86 99% 99% 99% 99%
1.0 Fortran Intel #3 1.98 1,684 638 7.89
2.0 Lisp SBCL #5 4.00 19,368 899 15.73 99% 99% 99% 99%
2.1 Haskell GHC #4 4.17 3,812 987 15.74 99% 97% 99% 98%

8.2 Julia 16.20 146,932 440 16.49 1% 1% 100% 1%
8.2 Dart #5 16.16 120,132 489 16.64 2% 2% 99% 2%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/fannkuchredux.html
1.0 C gcc #5 8.72 916 910 34.23 100% 100% 95% 99%

6.2 Julia 54.40 148,160 565 54.59 100% 0% 0% 1%
2.1 Java 17.91 31,560 1282 70.25 99% 98% 99% 97%
2.0 Go 17.83 1,480 900 71.04 100% 100% 100% 100%
3.0 Java AOT 25.93 8,488 1282 103.26 99% 100% 100% 100%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/regexredux.html
1.8 PHP 2.63 270,048 816 2.46 38% 39% 85% 38%
[…]
4.0 Julia 5.88 349,980 541 6.16 2% 3% 100% 2%
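For anyone who wants to redo the reordering, a small sketch using the fannkuch-redux rows quoted above. The tuple layout is my own; cpu / elapsed is only a rough proxy for how many cores a program kept busy.

```python
# (program, elapsed secs, cpu secs) from the fannkuch-redux rows quoted above
rows = [
    ("C gcc #5", 8.72, 34.23),
    ("Go", 17.83, 71.04),
    ("Java", 17.91, 70.25),
    ("Java AOT", 25.93, 103.26),
    ("Julia", 54.40, 54.59),
]

by_elapsed = [name for name, _, _ in sorted(rows, key=lambda r: r[1])]
by_cpu = [name for name, _, _ in sorted(rows, key=lambda r: r[2])]

# cpu / elapsed: about 1 for a single-threaded run, about 4 when four cores are busy
cores_used = {name: round(cpu / elapsed, 2) for name, elapsed, cpu in rows}

print(by_elapsed)  # ['C gcc #5', 'Go', 'Java', 'Java AOT', 'Julia']
print(by_cpu)      # ['C gcc #5', 'Julia', 'Java', 'Go', 'Java AOT']
print(cores_used["Julia"], cores_used["Go"])
```

Julia moves from last to second once the ranking ignores parallelism, which is the same picture as the load columns (one core at 100%, the others idle).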


#8

I see no point in looking at the current benchmark numbers since, again, they are not optimized. Also, many of the new entries in https://github.com/KristofferC/BenchmarksGame.jl do use multithreading.


#9

Well, they are up on the website, so I guess the “regulars” around here may have to tolerate people coming here from the benchmark game’s website with a :confused: look on their face :slight_smile:


#10

Ok, rephrasing, there is no point in analyzing the current benchmark numbers more than saying “they are slow right now”.


#11

fyi https://salsa.debian.org/benchmarksgame-team/benchmarksgame/blob/master/CONTRIBUTING.md