General questions from Python user

on purpose

2 Likes

:man_shrugging: assuming people purposefully use global scope to perform an unrealistic task just to show that Julia is slow, I guess this demo is applicable.

I just meant to show that if one writes type-unstable code (in this case, on purpose), performance is bad, and that is probably what “naive” mostly means in Julia.

1 Like

By replacing zeros with similar (which just allocates the output array without setting it to 0) and adding @inbounds, the function gives the same performance as broadcasting:

function f2(a,b)
    y = similar(a)
    @inbounds for i in 1:length(a)
        y[i] = a[i]/b[i]
    end
    y
end

But I agree that this is no longer naive :wink:
Coming from Python / Numpy, broadcasting seemed to be the most natural way to do “vectorized” operations in Julia.
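For readers coming from that side, here is a minimal NumPy sketch of the same element-wise division, vectorized versus an explicit Python loop (the array names a and b are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(10**5)
b = rng.random(10**5) + 1.0  # shift away from zero to avoid division warnings

# Vectorized: a single C-level loop under the hood
y_vec = a / b

# Explicit Python-level loop: same result, far slower for large arrays
y_loop = np.empty_like(a)
for i in range(len(a)):
    y_loop[i] = a[i] / b[i]

assert np.allclose(y_vec, y_loop)
```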

5 Likes

but even in this case julia is still faster:

In [7]: %%timeit
   ...: for i in range(0,len(y)):
   ...:     y[i] = a[i]/b[i]
   ...:
   ...:
229 µs ± 2.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

julia> @btime for i in 1:length(a)
         y[i] = a[i] / b[i]
       end
  97.144 μs (4980 allocations: 93.45 KiB)

you can’t assume a user knows to vectorize in NumPy but doesn’t know about broadcasting in Julia.

1 Like

Exactly. It is easy to write fast Julia code if one is aware of the basic stuff. But there is an initial set of things to learn and get used to (as expected for anything that does not do magic).

I think we do not disagree on anything.

1 Like

A naive implementation usually also includes slicing without views. Since NumPy slices are equivalent to Julia views, a direct translation of NumPy code can result in huge allocations and, as a result, very bad performance.
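A quick way to see this from the NumPy side (a sketch with throwaway names): basic slicing returns a view into the parent array, while the same a[2:5] in Julia makes a copy unless you use @view.

```python
import numpy as np

a = np.arange(10)

s = a[2:5]                     # basic slicing: a view, no copy
print(np.shares_memory(a, s))  # True

s[0] = 99                      # writing through the view mutates the parent
print(a[2])                    # 99

c = a[[2, 3, 4]]               # fancy indexing: a fresh copy
print(np.shares_memory(a, c))  # False
```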

7 Likes

That’s of course correct, but it seems like this case is the most common demonstration of “Julia is slower than Python.” People tend to:

  • benchmark in global scope,
  • without running once to compile,
  • without interpolating their globals with $, and
  • often with a type instability.

It’s understandable why they do this. They see a neat blog with a quick Jupyter notebook, they download Julia, and want to try a couple things out. It’s probably necessary to read several different parts of the manual to get a proper, non-naive benchmark.

4 Likes

Naive is not a synonym for plain, nor for evil. Oxford Languages defines it as “showing a lack of experience, wisdom, or judgement”, or “natural and unaffected; innocent”. I would thus define it as having the best intentions, but little expertise.

Some Julia-specific ways of unintentionally shooting oneself in the foot were listed in the comments above. It is easy to get a slow-down by two orders of magnitude. An example of a slow-down by a factor of 70 due to unnecessary allocations was actually cited in my previous post.

3 Likes

With respect to more optimal Julia code, sure. But what about when compared to Python / NumPy? Your last statement was:

but that post was about C++. Now, I think everyone can agree there are more quirks in C++ than in Python or Julia.

Now I could probably write a sufficiently fast solution of the cited problem in Python. I wouldn’t. You won, Julia is the best, and inherently better than Python.

For everyone else - see above. Thank you.

A straw man fallacy is not fun. That was not my argument at all.

There have been heaps of threads like “Why is my code translated from Python so much slower in Julia?” I don’t understand the point of denying that naive Julia code can be slower than Python.

2 Likes

I have yet to see a computationally heavy task (i.e., not one about Julia’s slow startup time), written either with for loops in both languages or in vectorized/broadcasting style in both, that shows Julia to be much slower than Python/NumPy.

The point is just that if one doesn’t bother to learn Julian idioms and techniques, and instead just ‘writes Python’ in Julia with surface level syntax changes, it’s absolutely quite easy to end up with massive performance problems.

Perhaps the best examples are tight loops involving global variables, or creating empty, untyped arrays and then pushing to them. Sure, it’s pretty easy to learn how to avoid these problems, but that’s not the point.
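The untyped-container pitfall has a rough NumPy analogue, if that helps Python-side intuition: an object-dtype array (loosely comparable to Julia’s Vector{Any}) boxes every element, whereas a float64 array stores them inline. A small sketch:

```python
import numpy as np

x = np.random.rand(10**5)  # homogeneous float64 storage
x_obj = x.astype(object)   # every element becomes a boxed Python float

# Same reduction; the object array must dispatch per element and is far slower:
s_fast = x.sum()
s_slow = x_obj.sum()

print(np.isclose(float(s_slow), s_fast))  # True: same answer, very different cost
```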

7 Likes

for the record, I just want to amend to:

that, with these practices, Python will be slow too (for similar reasons, being a dynamic language itself), and very often slower than Julia:

julia> a = rand(10^5);

julia> b = 0;

julia> @btime for x in a
           if x > 0.5
               global b+=x
           end
       end
  6.738 ms (349237 allocations: 6.85 MiB)

In [16]: a = np.random.rand(10**5)

In [17]: b=0

In [18]: %%timeit
    ...: for x in a:
    ...:     global b
    ...:     if x > 0.5:
    ...:         b+=x
    ...:
13 ms ± 325 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

But I think everyone in the thread has something to take away, and I guess I should stop being annoying.

2 Likes

It can sometimes carry a higher cost in Julia, however.

If you’d like one, here’s an example from this thread:

In [6]: def euclidian_algorithm_division_count(a, b):
   ...:     division_count = 1
   ...:     if b > a:
   ...:         a, b = b, a
   ...:     while (c := a % b) != 0:
   ...:         a, b = b, c
   ...:         division_count += 1
   ...:     return division_count
   ...: 
   ...: from random import randint

In [7]: %%timeit
   ...: N = 10**100
   ...: M = 10**4
   ...: division_count_array = []
   ...: while M > 0:
   ...:     a = randint(1, N)
   ...:     b = randint(1, N)
   ...:     division_count_array.append(euclidian_algorithm_division_count(a, b))
   ...:     M -= 1
292 ms ± 7.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

julia> function euclidean_algorithm_division_count(a, b)
           division_count = 1
           if b > a
               a, b = b, a
           end
           while (c = a % b) != 0
               a, b = b, c
               division_count += 1
           end
           return division_count
       end
euclidean_algorithm_division_count (generic function with 1 method)

julia> function main()
           N = big(10)^100
           M = 10^4
           division_count_array = []
           while M > 0
               a, b = rand(1:N, 2)
               push!(division_count_array, euclidean_algorithm_division_count(a, b))
               M -= 1
           end
       end
main (generic function with 1 method)

julia> @btime main()
  378.040 ms (5618922 allocations: 110.55 MiB)

It’s of course not very hard to make the julia version beat the Python version, but this straightforward transcription (that even uses a function) of naive Python code can still be slower in julia.

2 Likes

Thanks for bearing with me! This is a pretty neat pedagogical example!

Ah, looks like a BigInt issue, less interesting than I initially thought.
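For context on why Python holds up here: CPython’s int is arbitrary-precision natively, so there is no separate BigInt type to box, whereas Julia’s BigInt wraps GMP and allocates on every operation. A tiny illustration:

```python
# CPython integers are arbitrary precision out of the box; 100-digit
# arithmetic needs no special type or wrapper:
n = 10**100
print(n % 7)           # modulo on a 101-digit integer -> 4
print(n.bit_length())  # 333 bits
```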

5 Likes

Be careful with your timings, though: %timeit reports the mean, whereas @btime reports the minimum run time.
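One way to put the two on a more equal footing, sketched with the stdlib timeit module (the timed statement is just a placeholder): take the minimum over repeats yourself, which is closer to what @btime reports.

```python
import timeit
from statistics import mean

# %timeit reports mean +/- std over runs; BenchmarkTools' @btime reports
# the minimum. Taking min of timeit.repeat is the closer comparison:
times = timeit.repeat("sum(range(1000))", number=1000, repeat=7)
per_loop = [t / 1000 for t in times]  # seconds per single execution
print(f"mean: {mean(per_loop) * 1e6:.2f} us  min: {min(per_loop) * 1e6:.2f} us")
```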