Julia vs R vs Python

performance

#21

That’s called a major release.


#22

Read my post again and see it is not.

If it were, give me the credit that I would have said so.


#23

But by definition… it is… You exactly described what a major release in semver is: a breaking change. The problem with Python was that Python 3 took a very long time after Python 2 and it was almost impossible to support the two. Julia has a lot of mechanisms in place (depwarns, Compat.jl, etc.) to make major releases easier to handle. But yes, I was just mentioning that you’re just saying “I think we should have a quick 2.0, and other major versions should have a steady and quick release cycle” in a more verbose way. This is something that has been mentioned in previous roadmap talks and I think that the Julia community tends to be less conservative than something like Python with breaking changes. Though it will be nice to have a 1.0 so this can start happening in “bursts” every few years instead of every few months…
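To make the semver point concrete, here is a minimal sketch in Python. The helper names `parse` and `may_break` are made up for illustration, and treating a minor bump in 0.y.z as potentially breaking is an assumed convention (semver itself only says anything may change before 1.0).

```python
def parse(version):
    """Split a 'MAJOR.MINOR.PATCH' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def may_break(old, new):
    """Return True if upgrading old -> new is allowed to break code under semver."""
    old_major, old_minor, _ = parse(old)
    new_major, new_minor, _ = parse(new)
    if old_major != new_major:
        # A major bump is the one place semver permits breaking changes.
        return True
    if new_major == 0 and old_minor != new_minor:
        # Assumed convention: before 1.0, a minor bump may break things.
        return True
    return False


print(may_break("1.4.2", "2.0.0"))
print(may_break("1.4.2", "1.5.0"))
```

Under this reading, "a quick 2.0" and "a breaking release" are the same statement.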

Honestly though, there’s so much that can be done for scientific computing that doesn’t require a breaking change that I kind of wonder what’s left. What kinds of breaking changes are you looking for?


#24

You insist and I’m saying this is not what I meant.

Look at the Armin Ronacher video that @StefanKarpinski posted.
He lists things that have been the way they are for years and that nobody will change, in the name of compatibility, even though they hurt Python’s speed significantly.
In that sense even the major release didn’t go far enough, precisely because it kept compatibility.

What I meant is not small fixes around the corners which might break some old code.
My point was a policy for the language: stay faithful to its objective, not to compatibility.
Whatever needs to be broken in order to build it right the second time (third, fourth, etc…) will be broken.


#25

Python won’t break compatibility to make things faster because that’s not what its objective is. The kinds of changes in the video are perfectly reasonable things to add for a major release, but they won’t be added to Python because its main target user groups are looking more for the features that would be lost due to speed constraints than the speed itself. Speed is pretty clearly part of Julia’s objective, so bringing up the fact that Python doesn’t make breaking changes in the name of speed as a reason to suspect Julia won’t doesn’t make sense.

Changing all . operators to lower into an anonymous function and a broadcast call in the name of performance is a pretty huge breaking change (which had a lot of casualties, like Knet.jl). Changing all anonymous functions to be individual callable types which autospecialize in function calls is a huge breaking change, but it resulted in fast anonymous functions. So of course Julia has a commitment to doing what it’s already doing.

But adding NamedTuples to allow keyword argument dispatching, giving a fast path for small unions, etc. are all upcoming and (mostly) non-breaking changes which add a lot of ways to optimize speed. Lots of other things, like macros for declaring local non-aliasing scopes, can be added to give speed in a clear and concise way without breaking code. Most of the changes that are lined up which are speed improvements (and that people ask for) actually seem to be non-breaking. So again, if you want to push the breaking agenda, what exactly are you proposing to break and why?


#26

Here’s the thing: we’re just not in the same situation as other high-level dynamic languages with regard to legacy making things slow. R, Python, Matlab, etc. have decades of assumptions and features based on the internal details of a slow implementation. Julia was designed to run fast with a JIT in the first place. We don’t have to painfully remove features that make it hard or impossible to generate fast code – we’ve never added those features in the first place. If there are language changes that we can make in the future that can make Julia even faster, they will be at the top of the list of changes to consider in Julia 2.0. But the truth is that there are so many pure optimizations (by definition, non-breaking) that we haven’t even started to experiment with that it will be quite some time before we need to break things for performance.


#27

Well, you know, years will come and go and people will learn new things.
In retrospect, in a few years, something which is right today might be suboptimal.

But the spirit you’re conveying with your words is exactly the policy Julia should stick to.
Julia is here to be at the front of what’s known in order to solve the languages problem.
It will do, redo, break and rebuild what’s needed for that.

I think in the years to come this policy will make sure Julia won’t be another one of many, it will shine as different (Hopefully different and successful).


#28

@StefanKarpinski, Any chance for more videos in the spirit you posted above?

I’m not a specialist of those things but I really find it interesting.
Anything on other languages’ architecture? Anything more on Python, Julia?

Thank You.


#29

Not off the top of my head, but I’m sure there’s more stuff out there.


#30

For me the thing about Python is not so much that it’s badly designed (it isn’t) but rather that it has been radically misappropriated for use in things that it has no business doing (i.e. scientific programming). It is a testament to Python’s good design that this program has been successful at all. Ultimately this has left us with a bad situation however, so it’s past time that it stop.

Oh man, that’s a hell of an insult.


#31

Not even on Julia?

@ExpandingMan,
I don’t think the talk above is given in the context of scientific computing.
It is about Python in general.
It doesn’t prove Python was designed wrongly, but I hope there are really good reasons for those choices.


#32

You left in a typo when introducing abs2().
It should be if abs2(z) > 4

FYI: I tried adding two numba @autojit decorators to the Python version and it ran within a factor of 1.5 of the Julia code when using abs in both Julia and Python. Julia is about 2.5x as fast as numba when using abs2, for which I did not find a numba/Python equivalent.
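For reference, a hand-written stand-in for abs2 is easy to sketch in plain Python. The `abs2` helper below is not a numba or numpy function, just an illustration, and whether numba would compile it as efficiently as Julia's built-in is untested here. The idea is that `abs2(z) > 4` tests the same escape condition as `abs(z) > 2` without taking a square root.

```python
def abs2(z):
    """Squared magnitude of a complex number, with no square root involved."""
    return z.real * z.real + z.imag * z.imag


def fractal(z, c, lod):
    """Escape-time iteration using the squared-magnitude test."""
    for i in range(lod):
        if abs2(z) > 4.0:  # same condition as abs(z) > 2
            return i
        z = z * z + c
    return lod


print(abs2(3 + 4j))
print(fractal(1.5 + 0j, 0j, 10))
```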


#33

Wow,

Could you share the code?
So Python can get really fast.


#34

All you need to do is import autojit from numba and
then add @autojit “on top” of the function definitions you want
to be JIT compiled.

import time
import numpy
# Note: recent Numba releases no longer ship autojit; `from numba import jit` is the replacement there.
from numba import autojit

detail = 100; h = 6000; w = 6000
bitmap = numpy.zeros((h, w))
c = -0.3819660112501051 + 0.6180339887498949j

@autojit
def fractal(z, c, lod):
    for i in range(lod):
        if abs(z) > 2:
            return i
        # z*z is not as good looking as z**2 but more performant
        z = z * z + c
    return lod

@autojit
def generate(bitmap, w, h, c, lod):
    for x, y in numpy.ndindex(bitmap.shape):
        # too bad we can't just append 'j' to make it complex
        z = complex(3*(x+1-w/2)/w, 3*(y+1-h/2)/h)
        bitmap[y, x] = fractal(z, c, lod) / lod
    return bitmap

# time one full pass over the bitmap (includes JIT compilation on the first call)
t0 = time.time()
image = generate(bitmap, w, h, c, detail)
t1 = time.time()
print(t1 - t0)


#35

Note that @autojit works well with the numpy matrices but not with things like list comprehensions and other things. So it’s not a “make python fast” button. These tradeoffs are all difficult.

For instance, the dependence on the refcounter in Python is one thing preventing PyPy from being more widely used, but it is very useful for handling memory on the GPU, which is invisible to Julia’s garbage collector. It’s not always a zero-sum game.
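A small sketch of why the refcounter helps here, assuming CPython (a tracing collector such as PyPy’s would not run the finalizer at this point). `FakeDeviceBuffer` is a hypothetical stand-in for a GPU allocation; no real GPU library is involved.

```python
class FakeDeviceBuffer:
    """Hypothetical stand-in for memory the host-side collector cannot see."""

    live_bytes = 0  # pretend device memory currently allocated

    def __init__(self, nbytes):
        self.nbytes = nbytes
        FakeDeviceBuffer.live_bytes += nbytes

    def __del__(self):
        # On CPython this runs as soon as the last reference disappears.
        FakeDeviceBuffer.live_bytes -= self.nbytes


def demo():
    buf = FakeDeviceBuffer(1 << 20)
    during = FakeDeviceBuffer.live_bytes
    del buf  # refcount drops to zero, so the "device" memory is released here
    after = FakeDeviceBuffer.live_bytes
    return during, after


print(demo())
```

With a tracing collector, the release would wait until the collector decides to run, and the collector has no idea how much device memory is at stake.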


#36

Sure. But with basically no effort you can get reasonable performance out of python in quite a few cases.
(I think they’d call it Julia if it would work in all cases… :wink: )


#37

Most of the “Speed is Crucial” work will be with numerical arrays, which means Numpy in Python (at least in most Scientific Computing workloads).

So if Python, in those workloads, is fast using Numba, it is great to hear!


#38

This is also the easiest kind of code to just write a piece of C/Fortran for. The problem is when your code isn’t that easy, which usually comes up in package development more so than on the user side.


#39

Numba is a great and very useful project, but it’s pretty limiting to be restricted to Fortran-77 style for all performance-critical code: no user-defined types, only the built-in array type of built-in numeric types, relying on structures of arrays for everything else.
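To illustrate the style restriction, here is a sketch in plain Python (no Numba involved; `Particle`, `step_objects` and `step_soa` are hypothetical names). The first version uses a user-defined type; the second is the structure-of-arrays form, with one flat numeric array per field, which is the shape such code has to take when only built-in arrays of built-in numeric types are allowed.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    x: float  # position
    v: float  # velocity


def step_objects(particles, dt):
    """Natural style: a list of user-defined objects."""
    for p in particles:
        p.x += p.v * dt


def step_soa(xs, vs, dt):
    """Structure-of-arrays style: parallel flat arrays of doubles."""
    for i in range(len(xs)):
        xs[i] += vs[i] * dt


particles = [Particle(0.0, 1.0), Particle(2.0, -1.0)]
xs = array("d", [0.0, 2.0])
vs = array("d", [1.0, -1.0])

step_objects(particles, 0.5)
step_soa(xs, vs, 0.5)
print([p.x for p in particles], list(xs))
```

Both compute the same thing, but every new field or nested structure in the second form means another parallel array to keep in sync by hand.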

And it’s certainly not true that all (or even most?) speed-crucial code these days falls into that category. There’s a reason why all major Python finite-element (FEM) packages rely on some form of custom C++ code generation, for example.


#40

How does the song go?

First We Take Manhattan…

First, give me fast processing of arrays composed of Double / Single numbers.