General questions from Python user

If you usually write Python code, consider this example: a ~10-line nanmean in Julia that works on any <:AbstractFloat number (even user-defined ones), versus the ~100 lines of C++ used from Python (much faster than numpy’s nanmean) that is still slower than the Julia version and only works on Float32/Float64.
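For readers who have not seen such a function, here is a minimal sketch of what a generic nanmean might look like (my own illustration, not the exact code the post refers to):

```julia
# Works for any T <: AbstractFloat, including user-defined float types,
# because the code is written against the abstract interface only.
function nanmean(x::AbstractArray{T}) where {T<:AbstractFloat}
    s, n = zero(T), 0
    for v in x
        if !isnan(v)   # skip NaN entries
            s += v
            n += 1
        end
    end
    return s / n
end

nanmean([1.0, NaN, 3.0])  # 2.0
```

Because Julia compiles a specialized method for each element type, the same source stays fast for Float32, Float64, or a custom float type.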

7 Likes

Hi friend! Welcome to the Julia Discourse! I asked a similar question a while back that might be of use to you: Why Are Languages Like Python and MATLAB So Much Slower than Julia?

Basically, I was asking about what makes Julia faster - in general - compared to languages like Python or MATLAB. All the answers here were great!

4 Likes

Even compared to simple vectorized Numpy operations Julia 1.6 is often faster (numbers in the plot > 1):

[benchmark plot: timings_1.6]

Source code:

2 Likes

Welcome. It is good to see a new person from the Python sphere; it is still the language that I know best.

First of all, Julia is quite similar to Python at the surface syntax level, but there are noticeable differences, and the two are very different when you go deep. My colleague, who is a physicist, started using Julia thinking that it is “Python that is fast” and, as he himself noticed, that was the wrong attitude. The last time I talked to him about it, he still didn’t get the Julia mindset for programming.

Second, against what Python code do you measure your speed? Do you base your program on pure Python or on calls to libraries? In Python, counting chars in a 1 MB string with string_var.count(“A”) (or something like that) will most probably outperform by far (10, 20, 30 times or more?) anything you can write by hand using for loops. If you write good Julia code by hand for the same task, it will take less than twice the time of string_var.count(“A”). I would suspect a very similar time, given that Python libraries are carefully written in C (or a similar language), but performing at such a peak is not easy and you can find unexpected performance regressions.
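To make the comparison concrete, a sketch of the same task in Julia (the helper name count_char is mine, purely illustrative):

```julia
s = "ABRACADABRA" ^ 100_000          # roughly a 1 MB string

# Idiomatic one-liner, analogous to Python's s.count("A"):
n1 = count(==('A'), s)

# Hand-written loop; unlike a pure-Python for loop, this compiles
# to fast native code in Julia:
function count_char(s::AbstractString, c::AbstractChar)
    n = 0
    for ch in s
        ch == c && (n += 1)
    end
    return n
end

n2 = count_char(s, 'A')   # n1 == n2
```

The point of the post stands: in Python the loop version is the slow path, while in Julia both versions are compiled and land in the same performance ballpark.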

Writing good code for the most part means that you follow the already-often-mentioned performance tips. The Julia documentation is quite dense, but to my taste following these tips can be compared to using Python’s PEPs. I know that “make indents four spaces” is much easier to understand than “avoid containers with abstract type parameters”, but I believe in practice you can be quite well off just memorizing a few basic cases. Like “avoid containers with abstract type parameters” → avoid Real, Complex, AbstractFloat, etc. as element types; use concrete types like Int32, Int64, Float32 and Float64.
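A small illustration of that tip (my own sketch, not taken from any benchmark in this thread):

```julia
# Abstract element type: each element is boxed, and every operation
# in a loop goes through dynamic dispatch.
xs_abstract = Real[1, 2.0, 3.0f0]     # Vector{Real}

# Concrete element type: elements are stored inline, and loops
# compile to tight machine code.
xs_concrete = Float64[1.0, 2.0, 3.0]  # Vector{Float64}

sum(xs_concrete)   # fast path
sum(xs_abstract)   # works, but much slower in hot loops
```

Both vectors behave the same at the REPL; the difference only shows up in performance-critical code.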

That is one of the very big differences between Python and Julia. When using Python you get some rules of the game: you follow them or you need to code in C. In Julia you can dig deeper, look behind the curtain and under the hood. If you are as good as Chris Elrod, you can improve your code’s performance in some amazing ways.

I don’t know if this answers the right question, but I hope it helps. If you want to hear more about it, I will try to help as much as I can, and where I can’t, people smarter than me most definitely will.

While I think most of the comments in this thread are correct, I think they can be a bit misleading.

This Discourse community is full of threads from new users who thought they could basically just write Python or Matlab code in Julia, make some syntax tweaks and have it be orders of magnitude faster.

This is not a very realistic expectation. Python and Matlab in particular are two languages that strongly encourage programming styles that are antithetical to high-performance compilers.

Julia can be incredibly fast while being easy to write and having syntax similar to Python’s, but Julia is a different language, and you still need to learn to write Julia code to take advantage of its speed.

17 Likes

You can write Julia code much like you write Python code, and it may be much faster (Julia loops against native Python loops) or much slower (e.g. naive Julia against numpy). It is not really hard to get Julia to be fast, but it may take time to learn the proper Julian ways. As an example, see this recent thread, where the community helps a novice improve the speed of his code by a factor of 70.

4 Likes

Plain Julia is not in general slower than Numpy (when your code is type-stable), see my benchmark above.

2 Likes

naive is not the same as plain

1 Like

I tested the simplest way in Julia to do element-wise operations, using dotted operators, e.g.

a = rand(1000)
b = rand(1000)
y = a ./ b
z = exp.(a)

This is nearly the same syntax as in Numpy (just adding dots for broadcasting and not writing np. in front) and is also, imho, the most naive way to do it in Julia.
The only operations that were slower in Julia than in Numpy for me were two-array additions and multiplications at (Float64) array sizes of 10,000 to 100,000 (smaller and larger arrays were faster in Julia), and divisions at array size 100,000; all other tested operations were faster in Julia than in Numpy.

7 Likes

I have yet to see a naive Julia vs. numpy comparison (when the task is not memory/BLAS bound) where Julia is MUCH slower. Unless by naive you mean intentionally type-unstable. Example, please?

1 Like

One could argue that the most naive way of doing this in Julia is:

julia> a = rand(1000); b = rand(1000);

julia> y = zeros(1000);

julia> @btime for i in 1:length(a)
         y[i] = a[i] / b[i]
       end
  92.402 μs (4980 allocations: 93.45 KiB)

which is of course pure-Python-painfully slow, although that is probably not the kind of error an experienced python-numpy user would make.

Just for completeness, the not-deliberately type unstable function is not that bad:

julia> function f(a,b)
          y = zeros(length(a))
          for i in 1:length(a)
             y[i] = a[i]/b[i]
          end
          y
       end
f (generic function with 1 method)

julia> @btime f($a,$b);
  1.412 μs (1 allocation: 7.94 KiB)

julia> @btime (y = $a ./ $b);
  665.148 ns (1 allocation: 7.94 KiB)

1 Like

you’re benchmarking in global scope:

julia> function f!(y, a, b)
           for i in 1:length(a)
               y[i] = a[i] / b[i]
           end
           y
       end
f! (generic function with 1 method)

julia> @btime f!(y, $a, $b)
  1.133 μs (0 allocations: 0 bytes)

on purpose

2 Likes

:man_shrugging: assuming people purposely use global scope to perform an unrealistic task just to show Julia is slow, I guess this demo is applicable.

I just meant to show that if one writes type-unstable code (in this case, on purpose), performance is bad, and probably that is what “naive” may mostly mean in Julia.

1 Like

By replacing zeros with similar (which just allocates the output array without setting it to 0) and adding @inbounds the function gives the same performance as broadcasting:

function f2(a,b)
    y = similar(a)
    @inbounds for i in 1:length(a)
        y[i] = a[i]/b[i]
    end
    y
end

But I agree that this is no longer naive :wink:
Coming from Python / Numpy, broadcasting seemed to be the most natural way to do “vectorized” operations in Julia.

5 Likes

but even in this case julia is still faster:

In [7]: %%timeit
   ...: for i in range(0,len(y)):
   ...:     y[i] = a[i]/b[i]
   ...:
   ...:
229 µs ± 2.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

julia> @btime for i in 1:length(a)
         y[i] = a[i] / b[i]
       end
  97.144 μs (4980 allocations: 93.45 KiB)

You can’t assume a user knows to vectorize in numpy but doesn’t know broadcasting in Julia.

1 Like

Exactly, this is correct. It is easy to write fast Julia code if one is aware of the basic stuff. But there is an initial set of things to learn and get used to (as expected for anything that does not work by magic).

I think we do not disagree in anything.

1 Like

A naive implementation usually also includes slicing without views. Since numpy slices are equivalent to Julia views (plain Julia slices copy), a direct translation of numpy code can result in huge allocations and, as a result, very bad performance.
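A quick sketch of the difference (illustrative only):

```julia
a = rand(10^6)

# In numpy, a[:1000] is a view. In Julia, a[1:1000] COPIES,
# allocating a fresh array on every call:
s1 = sum(a[1:1000])

# @view (or the @views macro over a whole expression) gives
# numpy-like, non-copying semantics:
s2 = sum(@view a[1:1000])   # s1 == s2, but no copy is made
```

In a hot loop the copying version allocates on every iteration, which is exactly where the "huge allocations" come from.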

7 Likes