Comparison of languages for parallel computing tasks

The paper compares Chapel, Python, and Julia. For reasons that are not yet clear to me, Julia has not performed well.

They don’t provide any of the code that they used for the comparison, which makes it hard to evaluate the article independently. Here is the code on Zenodo, and here is their GitHub repo.


There is a Zenodo link to the complete code.

Looks like there’s no warm-up.


Q3AP-ILS unmodified:

% julia -t 4 ./ils_q3ap_par.jl nug12 100
Time (Init instance):   1.764211577
Time (First LS/Compilation):    0.230459977
======== ITERATED LOCAL SEARCH ======== 100
        Best solution

Solution([5, 4, 7, 6, 10, 11, 12, 9, 2, 1, 3, 8], [11, 7, 6, 2, 12, 4, 1, 9, 8, 5, 10, 3], 658)

        TotalTime:      3.390096732
        ILSTime:        1.395425178
        Nhood evals:    533
        NhoodPerSec:    381.96243582470316

With a bunch of const declarations added to avoid non-constant global variables:

% julia -t 4 ./ils_q3ap_par.jl nug12 100
Time (Init instance):   1.796165105
Time (First LS/Compilation):    0.202069824
WARNING: redefinition of constant sol. This may fail, cause incorrect answers, or produce other errors.
======== ITERATED LOCAL SEARCH ======== 100
WARNING: redefinition of constant sol. This may fail, cause incorrect answers, or produce other errors.
        Best solution

Solution([6, 10, 9, 3, 7, 2, 12, 1, 8, 11, 4, 5], [6, 7, 4, 5, 2, 9, 11, 10, 8, 3, 12, 1], 778)

        TotalTime:      3.181146955
        ILSTime:        1.182912026
        Nhood evals:    528
        NhoodPerSec:    446.3561012101842
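The performance trap being fixed here is Julia’s untyped non-constant globals: since their type can change at any time, every access goes through dynamic dispatch. A minimal sketch of the difference (the names a, b, slow_sum, and fast_sum are made up for illustration):

```julia
# Non-constant global: the compiler cannot assume a type for `a`,
# so `slow_sum` boxes it and dispatches dynamically on every iteration.
a = 2.0
slow_sum(n) = sum(a * i for i in 1:n)

# `const` fixes the binding's type, so `fast_sum` specializes fully.
# (Reassigning a `const` binding later is what triggers the
# "redefinition of constant" warning seen in the output above.)
const b = 2.0
fast_sum(n) = sum(b * i for i in 1:n)

slow_sum(1000) == fast_sum(1000)  # same result, very different speed
```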

Running all the @elapsed ... blocks twice so that compilation isn’t measured:

% julia -t 4 ./ils_q3ap_par.jl nug12 100
Time (Init instance):   0.065759911
Time (First LS/Compilation):    0.001595556
WARNING: redefinition of constant sol. This may fail, cause incorrect answers, or produce other errors.
======== ITERATED LOCAL SEARCH ======== 100
WARNING: redefinition of constant sol. This may fail, cause incorrect answers, or produce other errors.
======== ITERATED LOCAL SEARCH ======== 100
WARNING: redefinition of constant sol. This may fail, cause incorrect answers, or produce other errors.
        Best solution

Solution([4, 9, 2, 8, 5, 6, 7, 3, 1, 10, 11, 12], [8, 4, 1, 12, 5, 11, 10, 3, 6, 7, 2, 9], 752)

        TotalTime:      1.141571685
        ILSTime:        1.074216218
        Nhood evals:    544
        NhoodPerSec:    506.41573910774827

NhoodPerSec increases by more than 32% with just these two trivial changes.
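The “run it twice” trick works because Julia compiles a method on its first invocation; only the second call measures steady-state performance. A minimal sketch with a made-up kernel function:

```julia
# A toy workload; any non-trivial function shows the same effect.
function kernel(n)
    s = 0.0
    for i in 1:n
        s += sin(i)
    end
    return s
end

t_cold = @elapsed kernel(10^6)  # includes JIT compilation of `kernel`
t_warm = @elapsed kernel(10^6)  # same work, compilation already cached
println("cold: $t_cold s, warm: $t_warm s")
```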


Are any of the authors active here? In particular, it would be great to get some clarification on these statements:

Consequently, much of the information one can find in online forums and documentation is no longer valid

There is a large amount of documentation available, but it sometimes feels opaque—for instance we were unable to find information on the thread layer used for the multi-threading package.

Also, using globals is mentioned as a performance trap so often that failing to account for them and then claiming poor performance seems suspect. If nothing else, it makes this section sound like a cop-out:

In our case, both programmers have strong prior experience with C and parallel computing and little to intermediate prior knowledge of Python, Julia and Chapel. As detailed in Section 4, we have followed a protocol that aims at making the comparison fair. However, we cannot completely exclude that some parts of the code could be written more efficiently or concisely.

To put this into context: the paper was submitted on 15 October 2019, almost a year ago.


I can’t help but feel that if they knew to use NumPy and even Numba for Python (at which point it’s no longer benchmarking Python anymore), they should at least have known to time things properly (with warm-up etc.) and to read the Performance Tips page of the manual once.

For example, I wouldn’t expect them to know to use @SVector, by comparison. But come on, if the paper is about comparing performance, at least get that little bit right.
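For what it’s worth, the usual way to sidestep the warm-up problem entirely is the BenchmarkTools.jl package, whose @btime/@benchmark macros run the expression many times and exclude compilation from the reported numbers. A sketch (note the $ interpolation, which benchmarks the global as if it were a local):

```julia
using BenchmarkTools

x = rand(1000)
@btime sum($x)  # $x avoids the non-constant-global penalty in the timing
```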


Concerning warm-up: if the Julia code is cold-started, then it would only be fair to also include the compile/link times for Chapel and C/OpenMP in the time comparison, wouldn’t it?

As stated in the paper:

  • Julia 1.2
  • page 36: "Numba’s and Julia’s (experimental) multi-threading support is not mature
    enough to compete with OpenMP or Chapel in terms of scalability."

True for Julia at the time. It might be interesting to re-run with 1.6.
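Re-running on a newer Julia would also let the threaded code use the composable Threads.@spawn model introduced in 1.3. A minimal race-free parallel reduction sketch (threaded_sum is a made-up name; start Julia with e.g. julia -t 4):

```julia
using Base.Threads

# Split the index range into one chunk per thread and sum each chunk
# on its own task; `fetch` collects the partial sums, which are then
# combined serially. No shared mutable state, hence no data race.
function threaded_sum(v)
    ranges = Iterators.partition(eachindex(v), cld(length(v), nthreads()))
    tasks = [Threads.@spawn sum(@view v[r]) for r in ranges]
    return sum(fetch.(tasks))
end

threaded_sum(collect(1.0:1000.0))  # == 500500.0
```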


I am going to show some prejudice here. Chapel has been around for a long time and is, of course, associated with Cray. I do not know if it will ever break out beyond that.
I could be very wrong; I am not plugged into that community.