Julia needs work at "Benchmark Game": numbers seem off up to 100x slower, maybe "Julia AOT" entries needed?

Palli · November 23, 2018, 4:39pm

There’s another open thread on this (post most questions at “Julia programs now shown on benchmarks game website”). I’m posting separately to give an overview and to discuss AOT compilation, as in GitHub - JuliaLang/PackageCompiler.jl: Compile your Julia Package. The load numbers also look off; do they show that startup cost is being counted?

A.

Ruby used to be slow, but PHP at least is supposed to be fast these days, and Julia being behind those and Python 3 in some cases seems wrong (columns in the rows below: ×, source, secs, mem, gz, cpu, cpu load per core):

1.0 C++ g++ #2 3.83 156,104 1624 12.00 72% 73% 98% 72%
[…]
4.5 Lisp SBCL #6 17.37 541,992 2479 62.93 89% 88% 87% 100%
[…]
21 Python 3 #3 79.79 250,948 1967 5 min 98% 96% 96% 99%
[…]
81 Matz’s Ruby 5 min 126,652 637 16 min 95% 68% 84% 81%
[…]
100 Julia 6 min 1,584,956 870 6 min 2% 32% 56% 15%

3.3 Go #4 5.47 31,040 905 21.73 99% 99% 100% 99%
[…]
27 Julia 44.10 179,004 483 44.41 99% 0% 0% 1%

1.0 C++ g++ #9 3.67 118,620 809 11.91 75% 78% 99% 76%
1.0 C gcc #3 3.72 117,408 836 11.80 76% 95% 75% 72%
[…]
4.1 Chapel #2 15.10 361,908 470 47.92 100% 56% 73% 92%
4.2 Fortran Intel #2 15.35 101,984 1148 15.34 0% 0% 1% 100%
[…]
7.6 Dart 27.97 896,244 457 31.46 8% 20% 17% 71%
[…]
8.6 VW Smalltalk #3 31.71 375,520 959 79.03 55% 57% 72% 67%
[…]
17 JRuby #5 60.98 2,336,824 1083 224.55 92% 92% 95% 91%
[…]
18 PHP #6 66.56 736,600 868 244.42 94% 92% 91% 96%
21 VW Smalltalk 76.53 375,276 744 76.44 42% 0% 0% 58%
[…]
25 Python 3 92.72 448,844 589 5 min 87% 90% 96% 87%
26 Julia 96.71 710,224 436 96.91 0% 0% 0% 100%

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/spectralnorm-julia-1.html
1.0 Rust #6 1.97 2,600 1126 7.86 100% 100% 100% 99%
1.0 C gcc #4 1.98 1,160 1139 7.86 99% 99% 99% 99%
1.0 Fortran Intel #3 1.98 1,684 638 7.89
[…]
2.0 Lisp SBCL #5 4.00 19,368 899 15.73 99% 99% 99% 99%
2.1 Haskell GHC #4 4.17 3,812 987 15.74 99% 97% 99% 98%
[…]
8.2 Dart #5 16.16 120,132 489 16.64 2% 2% 99% 2%
8.2 Julia 16.20 146,932 440 16.49 1% 1% 100% 1%

1.0 C gcc #5 8.72 916 910 34.23 100% 100% 95% 99%
[…]
2.0 Go 17.83 1,480 900 71.04 100% 100% 100% 100%
[…]
2.1 Java 17.91 31,560 1282 70.25 99% 98% 99% 97%
[…]
3.0 Java AOT 25.93 8,488 1282 103.26 99% 100% 100% 100%
[…]
6.2 Julia 54.40 148,160 565 54.59 100% 0% 0% 1%

1.8 PHP 2.63 270,048 816 2.46 38% 39% 85% 38%
[…]
4.0 Julia 5.88 349,980 541 6.16 2% 3% 100% 2%

How source code size is measured

[…] remove duplicate whitespace characters

A little unfair, as Python relies on whitespace for correctness (not just readable indentation).
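
To make the whitespace point concrete, here is a minimal Python sketch. It assumes that "remove duplicate whitespace characters" means collapsing each run of the same whitespace character to a single one; that reading, and the helper name `remove_duplicate_whitespace`, are assumptions for illustration, not the site's actual script.

```python
import re


def remove_duplicate_whitespace(source: str) -> str:
    # Assumed interpretation: collapse each run of the SAME whitespace
    # character (e.g. four spaces) down to one character.
    return re.sub(r"(\s)\1+", r"\1", source)


# A tiny Python program with two levels of indentation.
src = "for i in range(2):\n    if i:\n        print(i)\n"
collapsed = remove_duplicate_whitespace(src)

compile(src, "<original>", "exec")  # the original compiles fine

try:
    compile(collapsed, "<collapsed>", "exec")
    still_valid = True
except SyntaxError:  # IndentationError is a subclass of SyntaxError
    still_valid = False

print(len(src), len(collapsed), still_valid)
```

Under this reading both indentation levels collapse to one space, so the nesting is lost and the collapsed text is no longer valid Python. The collapsed text is only measured, never executed.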

[I would like to see Julia in the list given, to see how it stacks up, but I guess it then needs an entry for all programs.]

How CPU load is measured

The GTop cpu idle and GTop cpu total are taken before forking the child-process and after the child-process exits. The percentages represent the proportion of cpu not-idle to cpu total for each core.

On win32: GetSystemTimes UserTime IdleTime are taken before forking the child-process and after the child-process exits. The percentage represents the proportion of TotalUserTime to UserTime + IdleTime (because that’s like the percentage you’ll see in Task Manager).
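
The per-core calculation quoted above can be sketched in Python as follows. This is only an illustration: it assumes each sample is a list of (idle, total) tick counters, one pair per core, and the function name and counter values are hypothetical rather than taken from the benchmarks game's measurement scripts.

```python
def cpu_load_percent(before, after):
    """Per-core load: share of not-idle ticks in total ticks between two samples.

    `before` and `after` are lists of (idle, total) counters, one pair per core.
    """
    loads = []
    for (idle0, total0), (idle1, total1) in zip(before, after):
        d_total = total1 - total0
        d_idle = idle1 - idle0
        if d_total <= 0:
            # No ticks elapsed on this core; report zero instead of dividing by zero.
            loads.append(0.0)
            continue
        loads.append(100.0 * (d_total - d_idle) / d_total)
    return loads


# Hypothetical counters for 4 cores: sampled before forking the child
# process and again after it exits.
before = [(1000, 2000), (1000, 2000), (1000, 2000), (1000, 2000)]
after = [(1980, 3000), (1680, 3000), (1440, 3000), (1850, 3000)]

print([round(x) for x in cpu_load_percent(before, after)])
```

Because the counters are machine-wide, the percentages include anything else that ran on a core during the measurement, not just the benchmarked program.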

B.
Some at: https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/julia.html

have “Bad Output” (and why is Julia only compared to Python 3 there?)

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/revcomp-julia-1.html

UNEXPECTED OUTPUT

kristoffer.carlsson · November 23, 2018, 4:50pm

As has already been said, the current benchmarks at that site have not been optimized for speed at all. Some faster versions (and also fixes for the ones that produce bad output) are available at https://github.com/KristofferC/BenchmarksGame.jl.

Mason · November 23, 2018, 5:31pm

When will we see your improved benchmarks on the BenchmarksGame website?

StefanKarpinski · November 23, 2018, 5:33pm

Take it easy, they only just got up there at all. Anyone who’s interested in making them faster should spend some time with https://github.com/KristofferC/BenchmarksGame.jl, get it in good shape and then the benchmarks on the site can be updated.

Mason · November 23, 2018, 5:37pm

Sorry, didn’t mean to sound impatient, could have phrased that better. I’m mostly just curious about how easy it will be for us to update the benchmarks in the future, i.e. ‘if we improve a benchmark, does it take hours or months to get it put up?’.

I had hoped that there’d be a repo we could just push to and have those changed benchmarks be reflected on the website once it re-runs the benchmarks but I have no idea if that’s accurate.

I’m guessing the plan is to collect all the improved benchmarks and make one big change instead of many small changes?

kristoffer.carlsson · November 23, 2018, 5:39pm

I personally have no idea how the submission process works for the BenchmarksGame website.

Palli · November 23, 2018, 5:40pm

I reordered by the cpu column (the last time column) to show more clearly how these same languages/best implementations would rank if none of the entries were multi-threaded (or where we could be with just that change, assuming linear speed-up), i.e. then we beat Ruby in the first benchmark…:

1.0 C++ g++ #2 3.83 156,104 1624 12.00 72% 73% 98% 72%
4.5 Lisp SBCL #6 17.37 541,992 2479 62.93 89% 88% 87% 100%
21 Python 3 #3 79.79 250,948 1967 5 min 98% 96% 96% 99%

100 Julia 6 min 1,584,956 870 6 min 2% 32% 56% 15%
81 Matz’s Ruby 5 min 126,652 637 16 min 95% 68% 84% 81%

Go’s advantage is much less, about 100% faster:

3.3 Go #4 5.47 31,040 905 21.73 99% 99% 100% 99%

27 Julia 44.10 179,004 483 44.41 99% 0% 0% 1%

1.0 C gcc #3 3.72 117,408 836 11.80 76% 95% 75% 72%
1.0 C++ g++ #9 3.67 118,620 809 11.91 75% 78% 99% 76%

4.2 Fortran Intel #2 15.35 101,984 1148 15.34 0% 0% 1% 100%
7.6 Dart 27.97 896,244 457 31.46 8% 20% 17% 71%
4.1 Chapel #2 15.10 361,908 470 47.92 100% 56% 73% 92%
21 VW Smalltalk 76.53 375,276 744 76.44 42% 0% 0% 58%
8.6 VW Smalltalk #3 31.71 375,520 959 79.03 55% 57% 72% 67%

26 Julia 96.71 710,224 436 96.91 0% 0% 0% 100%
17 JRuby #5 60.98 2,336,824 1083 224.55 92% 92% 95% 91%
18 PHP #6 66.56 736,600 868 244.42 94% 92% 91% 96%
25 Python 3 92.72 448,844 589 5 min 87% 90% 96% 87%

The low-hanging fruit seems to involve multi-threading.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/spectralnorm-julia-1.html
1.0 Rust #6 1.97 2,600 1126 7.86 100% 100% 100% 99%
1.0 C gcc #4 1.98 1,160 1139 7.86 99% 99% 99% 99%
1.0 Fortran Intel #3 1.98 1,684 638 7.89
2.0 Lisp SBCL #5 4.00 19,368 899 15.73 99% 99% 99% 99%
2.1 Haskell GHC #4 4.17 3,812 987 15.74 99% 97% 99% 98%

8.2 Julia 16.20 146,932 440 16.49 1% 1% 100% 1%
8.2 Dart #5 16.16 120,132 489 16.64 2% 2% 99% 2%

1.0 C gcc #5 8.72 916 910 34.23 100% 100% 95% 99%

6.2 Julia 54.40 148,160 565 54.59 100% 0% 0% 1%
2.1 Java 17.91 31,560 1282 70.25 99% 98% 99% 97%
2.0 Go 17.83 1,480 900 71.04 100% 100% 100% 100%
3.0 Java AOT 25.93 8,488 1282 103.26 99% 100% 100% 100%

1.8 PHP 2.63 270,048 816 2.46 38% 39% 85% 38%
[…]
4.0 Julia 5.88 349,980 541 6.16 2% 3% 100% 2%
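
The reordering can be reproduced with a short Python sketch. The rows are copied from one of the groups quoted above (source, elapsed secs, cpu secs); the core count of 4 is an assumption read off the four cpu-load columns, and `ideal_elapsed` is a hypothetical helper for the "assuming linear speed-up" estimate, which is a best case rather than a prediction.

```python
# (source, elapsed secs, cpu secs), copied from the rows quoted in this post
rows = [
    ("C gcc #5", 8.72, 34.23),
    ("Go", 17.83, 71.04),
    ("Java", 17.91, 70.25),
    ("Java AOT", 25.93, 103.26),
    ("Julia", 54.40, 54.59),
]

CORES = 4  # assumption: one cpu-load column per core, four columns per row


def ideal_elapsed(cpu_secs, cores=CORES):
    """Elapsed time if the same cpu work were spread perfectly over `cores`."""
    return cpu_secs / cores


# Rank by total cpu time instead of elapsed time.
by_cpu = sorted(rows, key=lambda r: r[2])
for source, secs, cpu in by_cpu:
    print(f"{source:10s} secs={secs:6.2f} cpu={cpu:7.2f}")

julia_cpu = {source: cpu for source, _, cpu in rows}["Julia"]
print(f"Julia, linear speed-up on {CORES} cores: {ideal_elapsed(julia_cpu):.1f} s")
```

Sorted by elapsed time Julia is last in this group; sorted by cpu time it is second, behind only the C entry. The idealized figure is an upper bound on what multi-threading alone could buy, since real programs rarely scale perfectly.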

kristoffer.carlsson · November 23, 2018, 5:41pm

I see no point in looking at the current benchmark numbers since, again, they are not optimized. Also, many of the new entries in https://github.com/KristofferC/BenchmarksGame.jl do use multithreading.

pkofod · November 23, 2018, 6:28pm

Well, they are up on the website, so I guess the “regulars” around here may have to tolerate people coming here from the benchmark game’s website with a look on their face

kristoffer.carlsson · November 23, 2018, 6:29pm

Ok, rephrasing, there is no point in analyzing the current benchmark numbers more than saying “they are slow right now”.

igouy · November 23, 2018, 6:30pm

fyi CONTRIBUTING.md · master · The Computer Language Benchmarks Game / benchmarksgame · GitLab
