Why Julia is so much slower than Python? Invitation to Euler Project

Btw., out of curiosity I tried to replicate your findings and cannot reproduce them. You most likely messed up the benchmarking (I’d recommend BenchmarkTools).

Your original claim that with your implementation Python is 52x faster is simply wrong. Julia is slightly faster.

You also wrote that the “short function definition” is faster. That cannot be: how a function is defined is not relevant to performance. (There were some problems back in Julia 0.3 and earlier regarding anonymous functions, but that was fixed a long time ago.)
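As a minimal sketch of this point (function names are mine, chosen for illustration): the short “assignment” form and the long `function` form define exactly the same kind of method, and for a given argument type both compile to the same specialized code.

```julia
# Short ("assignment") form.
square_short(x) = x * x

# Long form: semantically identical, just different surface syntax.
function square_long(x)
    return x * x
end

# Both behave identically.
@assert square_short(7) == square_long(7) == 49
```

Comparing `@code_native square_short(7)` with `@code_native square_long(7)` shows the same machine code, which is why the definition style cannot affect speed.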

My advice for everyone is “Don’t feed the trolls”. I am out and unsubscribing from this thread.

3 Likes

I like Julia syntax and the fact that it actually uses “type hints” (using Python terminology) to run the code faster (which Python does not and will not be doing in the near future, except for the numba package that you have also mentioned). I also like that in Julia the programmer is not constrained by the necessity to use vectorization, etc.

However, so far I have not seen an example where carefully written Python code performs substantially worse than the first run of the Julia code (compilation time included).

That is why, when I have time, I try to solve the problem both in Python and in Julia and compare the execution time.

Nice to hear.

To follow this up a bit. The design of the language is such that one can easily add specialized methods that are tailored to different types. The basic algorithm for getting digits from a machine integer is very different from the one for getting digits from a BigInt.
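A sketch of that design (the function name `mydigits` is hypothetical, not the Base implementation): multiple dispatch lets you keep one generic algorithm and add a per-type specialization next to it.

```julia
# Generic method: repeated divrem works well for machine integers.
function mydigits(n::Integer)
    n == 0 && return [0]
    ds = Int[]
    n = abs(n)
    while n > 0
        n, r = divrem(n, 10)
        push!(ds, r)   # least-significant digit first, like Base.digits
    end
    return ds
end

# Specialized method: for BigInt, one decimal-string conversion (done by
# GMP internally) can be cheaper than many BigInt divisions.
function mydigits(b::BigInt)
    s = string(abs(b))
    return reverse!([Int(c - '0') for c in s])
end

@assert mydigits(1230) == [0, 3, 2, 1]
@assert mydigits(big(1230)) == [0, 3, 2, 1]
```

Callers just write `mydigits(x)`; dispatch picks the right algorithm from the argument type.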

What part of the language design are you referring to? It seems that multiple dispatch is a language paradigm that is particularly suitable for solving issues like missing specializations.

Please understand that you are making claims about Julia, a language designed for fast execution times when used idiomatically, without providing a readily available code example.

I don’t think you are trolling this forum, but it is unclear what your goals are here. If you want optimization advice, just please post an MWE here, without asking people to chase your code down.

If Julia indeed turns out to be slower than it needs to be due to some inherent design flaw, then of course you are welcome to discuss that here. But without actual code this discussion is a bit unfocused, so I am not surprised that it borders on trolling for some readers.

9 Likes

I also think that it should not be the case. I spotted the difference when I changed from a longer function definition to the short one. But just now I have tried and cannot reproduce it. So, I accept that it is not true and will delete the corresponding post.

I will change the code enough to not violate the Project Euler rules if I ever post a question related to its problems in the future.

As to this thread, everything is already clear:

  1. The digits function should work better with BigInts.
  2. Those who want to troll are not welcome.

That’s all. :slight_smile:

I’d just like to note that this isn’t true. Functions that use “type hints” are not faster. Type stable functions are always equally fast regardless of whether they are strictly typed or not. Typing function arguments only affects dispatch (it is essentially the enabling principle for multiple dispatch). I.e. it allows you to specialize the same function name on Int vs BigInt.
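A sketch illustrating this (function names are mine): an untyped and a typed definition run the same specialized code; the annotation only narrows which argument types dispatch to the method.

```julia
# Untyped: the compiler still specializes on the concrete argument types,
# so this is just as fast as the annotated version.
add_untyped(a, b) = a + b

# Typed: the annotation restricts dispatch to Int arguments;
# it does not make the code run faster.
add_typed(a::Int, b::Int) = a + b

@assert add_untyped(2, 3) == add_typed(2, 3) == 5

# Dispatch is the only difference: the untyped method also accepts BigInt,
# while add_typed(big(2), big(3)) would throw a MethodError.
@assert add_untyped(big(2), big(3)) == 5
```

Timing both with BenchmarkTools’ `@btime` reports essentially identical results for `Int` arguments.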

4 Likes

The type annotations actually don’t affect performance, you get automatic specialization without them.

2 Likes

I don’t think you can make such a claim about the digits function. There is maybe room for improvement, but currently I see no evidence that it’s “slow”, since speed is always relative and there is no comparison mentioned in this thread. The only speed comparison of the digits function is implicitly given by the two implementations in Python and Julia, one using string iteration, the other using digits, and Julia (with digits) is faster.

Also I’d like to add

  1. The original claim that Julia is 52x slower is wrong. The best explanation is that you forgot to run the code with n=10 first and thus measured the compilation time with n=100.
1 Like

I just read your edit where you “estimate” the “toxicity” of the Julia community at 75%. The first post which changed your tone was from @mschauer Why Julia is so much slower than Python? Invitation to Euler Project - #8 by mschauer who simply pointed out (let me summarise) that a bold statement like “Julia is 52x slower than Python” is not a nice starter for a discussion without a single line of code.

Moreover, we are talking about a single line of code and it would have been much more helpful to simply post that single line of this hello-world level problem.

To me your post sounded like “that guy told me to do that and that to be convinced that Julia is faster than Python, I did that (register if you want to see one line of code) and the outcome is that Python is superior”.

Is this not a direct invitation to conflicts and “toxicity”?

Instead, you could have started a discussion without referring to Project Euler at all and asked why this specific line of code is faster in Python, since the Project Euler rules are there to prevent open discussions with titles like “How to solve Problem 20?”. Now you have every ingredient here including solutions, so Google will make it easy for others if they happen to search for a Julian solution to this problem.

Kind of the opposite of what the rules wanted to achieve, that’s why it’s important to know why they’re there.

6 Likes

I kind of like this line of argument. We can just throw it everywhere and get fun results.

Python has loops built into the language which… therefore it’s not a sound programming language design.

R has NA built into the language which slows down most calculations, therefore it’s not a sound programming language design.

Java has a reliance on heap-allocated objects built into the language which manifests in more GC usage, and therefore it’s not a sound programming language design.

I mean there are all arguments people have made, and will keep making, but to write off every single language because they all have faults is a little far. I think it’s better to just say, the digits function should get a specialization (that probably only takes like 2 lines of Julia code, so let’s add it to Julia v1.6 and be done with this), Python usage is as glue code between languages with fast loops, NA has trade-offs which can be good for statistical analysis, and OOP can be convenient with a GC, and everything can be made better.

Anyways, someone should just open a PR with whatever digit overload fixes this.
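As a sketch of what such an overload might look like (this is my guess, not the actual PR, and it ignores the `base` and `pad` keywords that `Base.digits` supports): one GMP string conversion instead of repeated BigInt divisions.

```julia
# Hypothetical base-10-only specialization for BigInt digits.
function digits_via_string(b::BigInt)
    s = string(abs(b))                      # single GMP-backed conversion
    return reverse!([Int(c - '0') for c in s])  # least-significant first
end

# Agrees with the generic Base.digits on BigInt input.
@assert digits_via_string(big(648)) == digits(big(648))  # [8, 4, 6]
```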

10 Likes

Sometimes generic functions (i.e., designed for a wide variety of types) end up being slow for some specific types, where faster alternatives exist.

As far as I understand, in the C gmp library (which Julia uses for BigInt) the way to get the digits is via the mpz_get_str function, which returns a list of characters (that can be interpreted as a string, or as a list of digits).

As a consequence, [parse(Int, c) for c in string(b)] is very close to being the most efficient way to get a vector of digits. Note that while “parsing” may seem like a slow thing to do in general, parsing a char into an int is very fast, as it is a simple subtraction of the char value of '0'.
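For example, both the `parse` comprehension and the direct character offset give the same digits (most-significant first here, since they follow string order):

```julia
b = factorial(big(10))  # 3628800

# Parsing each char into an int: fast, as it boils down to a subtraction.
via_parse  = [parse(Int, c) for c in string(b)]

# The same subtraction written out explicitly.
via_offset = [Int(c - '0') for c in string(b)]

@assert via_parse == via_offset == [3, 6, 2, 8, 8, 0, 0]
@assert sum(via_parse) == 27  # the digit sum of 10!
```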

Note that Julia also lets you “dig” into the internals of C libraries with the really cool @ccall macro, so if you are really interested in speed (note: don’t really do this, the following is a bit on the unsafe side of things), you can call the C functions directly.

For example:

julia> using Base.GMP: MPZ

julia> b = big"123132131212312312";

julia> s = Vector{UInt8}(undef, ndigits(b));

julia> ptr = pointer(s);

julia> @ccall "libgmp".__gmpz_get_str(ptr::Ptr{Cchar}, 10::Cint, b::MPZ.mpz_t)::Ptr{Cchar}
Ptr{Int8} @0x00007fadc9a29630

julia> Char.(s) .- '0' # take offset from `0` to convert chars to ints
18-element Array{Int64,1}:
 1
 2
 3
 1
 3
 2
 1
 3
 1
 2
 1
 2
 3
 1
 2
 3
 1
 2
4 Likes

No, and now I can reproduce it. Namely, running the following code

factorial_digit_sum(n::Int64) = n |> big |> factorial |> digits |> sum

n = 10
@time result = factorial_digit_sum(n)
@show result

n = 100
@time result = factorial_digit_sum(n)
@show result

for the first time in a terminal after computer reboot with the command “julia program_file.jl” produces the following output:

> 0.538902 seconds (184.37 k allocations: 9.549 MiB)
> result = 27
> 0.009945 seconds (1.42 k allocations: 30.969 KiB)
> result = 648

So, the main function was first run with n=10 and only then with n=100. However, the time it took for n=100 is exactly 9945 microseconds, which is about 52 times more than the corresponding code in Python (without the digits function, of course), which on the first run reports the following:

The sum of digits in 100! equals 648.
It took 0.0002079010009765625 seconds to get it.

The version of my Julia is 1.0.3.

You can’t benchmark effectively with @time.

This is using Julia 1.0.1

julia> factorial_digit_sum(n::Int64) = n |> big |> factorial |> digits |> sum;

julia> using BenchmarkTools

julia> @btime factorial_digit_sum(100)
  80.314 μs (1415 allocations: 30.80 KiB)
648

On an unrelated note, you should update your julia version :wink:

4 Likes

Could you please provide a fully reproducible example of this, with timing using BenchmarkTools.jl?
If this effect still persists, it is worth creating an issue.

It looks like the main reason for slowdown here is a call right after compilation, and Julia chose to do a GC sweep during that particular call.

Benchmarking shows the following:

julia> factorial_digit_sum(n::Int64) = n |> big |> factorial |> digits |> sum
factorial_digit_sum (generic function with 1 method)

julia> n = 100
100

julia> @benchmark factorial_digit_sum(n)
BenchmarkTools.Trial: 
  memory estimate:  30.80 KiB
  allocs estimate:  1415
  --------------
  minimum time:     47.167 μs (0.00% GC)
  median time:      54.298 μs (0.00% GC)
  mean time:        105.666 μs (30.69% GC)
  maximum time:     122.780 ms (77.39% GC)
  --------------
  samples:          10000
  evals/sample:     1

So, a single run could be dozens of times slower than the one you show, but on average it’s much faster.

2 Likes

No, I have misinterpreted it. The issue arises as described here, but does not depend on the form of the function definition.

The @btime from BenchmarkTools states that factorial_digit_sum(100) executes in 135.422 μs on my old computer.

I do have Julia 1.4 on my other Gentoo box, but I try to not use it during the summer. :slight_smile:

As to updating Julia on Gentoo, I will probably start another thread later, as I and Gentoo devs do have something to say about it. :slight_smile: