Julia motivation: why weren't Numpy, Scipy, Numba, good enough?

I recently gave a Why Julia? talk to a roomful of computational fluid dynamicists and astrophysicists. (John Gibson, UNH & KITP, Julia - a high-performance dynamic programming language for technical computing, Kavli Institute, UCSB 2017-02-02).

I got two main skeptical responses. One Python expert in the audience pointed out that much of what I was touting in Julia is available in Python: Numba for fast array math, %jit macros for just-in-time computation. I read up on these things afterwards and now have some questions about why the Julia developers thought starting afresh was the right move. I lean that way myself, based on experience(in the long run, clean, total solutions beat patches on imperfect ones). And from what I’ve read there are compatibility issues between the multiple Python extensions (Cython, PyPy, Numpy, and Numba), and maturity issues for Numba. But I imagine the Julia development team has very specific and clear ideas of the shortcomings of Python’ that warranted starting afresh with Julia. If anyone has time to spell these out or point me to a previous description, I’d be grateful.

The other skeptical response was about parallelism. I’ll post about this elsewhere. Suffice it to say that people in my community want to see a working example of a parallel PDE solver, and for this end I am thinking of implementing a Julia code for isotropic turbulence in a triply-periodic box. Should be easy if I can get Distributed Arrays and parallel FFTW to work together.

best regards,

John Gibson

26 Likes

I’m not a julia developer, but to answer the question about ‘not good enough’ i think you need put the questions and answers on a timeline. The ‘why we created julia’ blog entry dates back to feb2012 and at that time julia was already existing ~2 years. Numba was not seen.

2 Likes

This is pretty good reading: Personal thoughts about Pyston’s outcome – kmod’s blog

"The thing I wish people understood about Python performance is that the difficulties come from Python’s extremely rich object model, not from anything about its dynamic scopes or dynamic types. The problem is that every operation in Python will typically have multiple points at which the user can override the behavior, and these features are used, often very extensively. Some examples are inspecting the locals of a frame after the frame has exited, mutating functions in-place, or even something as banal as overriding isinstance. "

Sure you can make Python fast in some cases where you use a small subset of the language. However, Python as a whole is extremely difficult to make fast because of the incredible freedom you have. The “compiler” can assume very little, so it is difficult to generate efficient code. Just consider the incredible amount of time and money that has been spent on trying to make it faster and the relatively mediocre success it has had (Pyston, Numba, PyPy etc).

This is also a good watch: Jeff Bezanson - Why is Julia fast? - YouTube

19 Likes

Not a developer, but I have these kind of discussions with colleagues (economists).

My take on this is that if a language/environment is “good enough” for someone, no amount of persuasion will convince them to switch. In my experience, people who gravitate to Julia do so because it really solves the problems they had with other languages (in terms of speed, elegance, abstraction), or expect that it will, given time, so it is worth investing in.

The ideal Julia user at this stage should be someone who, when reading the manual for the first time, will just scream in delight every few minutes, because they find a solution (or an architecture for possible solutions) to language problems they have encountered. Otherwise, the language will just look like an unnecessarily complex construct with syntax/semantics that changes every few months, and lots of work in progress code in the package ecosystem.

23 Likes

@kristoffer.carlsson agreed; that article about pyston was the first thing that came to my mind.

Another piece of the “why Julia” puzzle is the fact that Julia does not treat a specific set of built-in types differently. That is, I can make my own type or immutable that can do anything that Julia’s built-in types can do, and my type can be just as fast as the one provided with Julia (if I were as skilled at numerical computing as the devs, which I am not). This isn’t just a nice feature; it’s essential to making things like automatic differentiation and other numeric types fast and easy to use. It’s also why we have really nice abstractions like the idea that an image is not a matrix of doubles or bytes but instead a matrix of colors. That works because Julia lets us define a whole family of color-related datatypes which provide useful behavior and abstraction at little or no run-time cost. That’s been the biggest benefit of Julia, in my experience.

Oh, and also JuMP. JuMP is great.

16 Likes

I think Julia’s advantage is some synergy between its type system, fast code on custom types and new abstractions, syntax, multiple dispatch, macros etc

Thus, its not just fast on specific types of code, but allows for unparalleled flexibility, code reuse, expression of ideas and abstractions, higher order programming etc in a way that makes numba and python seem stone age.

Here is a concrete elucidating example of something that wouldn’t be possible in python: 6 Months of DifferentialEquations.jl: Where We Are and Where We Are Going - Stochastic Lifestyle

1 Like

This is how I understand the “why Julia?” question. This was my round 2+ after I didn’t have an adequate answer written down the first time I did this workshop.

http://ucidatascienceinitiative.github.io/IntroToJulia/Html/WhyJulia

Julia essentially answers the Pyston question: how should a language be restricted and thought about so that way it still looks and acts dynamic, but in the end is very easy to extend and optimize? When you understand how Julia’s type system allows for JIT compilation to not just work, but work very well by specializing on exactly the functions its seeing, then you see what makes Julia special and how to apply this design to your own codes to get similar results.

I was recently giving a talk about DifferentialEquations.jl, at a part where I was highlighting the use of odd-number types (SymPy expressions, ArbFloats, ApproxFun Funs) in the differential equation solvers. I got an immediate question from the audience along the lines of

I know from Python that if you objects inside of numerical methods / loops, it will immediately be very slow. And it’s difficult/impossible to get user defined objects in something like Numba. So while all of this stuff is really great for expanding what we can solve… won’t it be very slow?"

Of course I had to answer (along the lines of)

That’s a limitation of Python. Julia’s type system is specifically designed so that way user-defined types are just as fast as those which are “built-in”. In fact, Float64 is just a type which is defined inside Julia. You can, the same way Base was built, build up your own crazy numbers and arrays, have them be fast, and they will “just work” via dispatch. ApproxFun is probably coolest example: it’s type Fun is essentially a “Function in a function space”, but from Functional Analysis we know that those functions are just points (numbers) in this new space. So you can define +, -, /, *, etc. between them… and then just stick that “number” in functions or libraries written for Numbers and get out the solution to a PDE.

The key is understanding why the abstraction works. Because Julia recompiles what each of those operators mean, when you put in some complex type it essentially compiles the code you would have wanted to write. When you start to get the hang of this, your Julia code can become bizarrely abstract, and you can compose interesting type + generic function to get fast implementations with no work. A good example of this is SymPy.jl or SymEngine.jl. These implement symbolic mathematical expressions in Julia. When you first start using them, it seems odd that the documentation is so little. But then you understand that “oh, +, -, *, and / is defined on a symbolic expression, and therefore to take the inverse of a matrix of SymEngine expressions, I can call Julia’s Base inv function. Nobody ever needed to implement such a function because the way inv is written only requires those dispatches.” And when you benchmark it, it’s super fast (it can sometimes be faster than built in SymPy expressions…). All of the “linear algebra fallbacks” will make matrices of SymPy/SymEngine expressions “just work”. You can string together fast algorithms using methods nobody ever wrote but the compiler makes for you from generic code. That’s something you don’t get with Python.

All of this “zero-cost abstraction” allows large software ecosystems like JuliaOpt, JuliaDiffEq, JuliaStats, JuliaML, etc. to be developed very fast, but have amazing flexibility and performance. At this point, I think we recommend Julia for its packages, since there are quite a few things which Julia packages can readily do that I wouldn’t know simple ways of doing in other languages.

20 Likes

In addition to performance reasons indicated by someone, I’d point also to maintainability, mostly thanks to a well devised type system. I’m not a Python user, so please do correct me if I’m wrong anywhere, but I’ve got the feeling that it’s often tricky to make two classes from different packages play nicely together. At least, some glue code is sometimes required.

In this regard, I find the example of my package Measurements.jl instructive. There is a similar Python package: uncertainties. They’re both uncertainty propagation library, but they have different support for some features:

  • complex numbers. uncertainties: not supported. Measurements.jl: work out-of-the-box.
  • arbitrary precision. uncertainties: not supported. Measurements.jl: work out-of-the-box.
  • arrays of numbers with uncertainty and operation with them. uncertainties: support to NumPy thanks to a submodule written for the purpose. Measurements.jl: work out-of-the-box.
  • linear algebra operations with matrices: uncertainties: limited support: a submodule written for the purpose provide methods to compute matrix inverse and pseudo-inverse. Measurements.jl: those operations defined for any AbstractFloat work out-of-the-box, things will improve if/when issue #5429 will be fixed. Note this is not a limitation of the language, it’s simply that we need someone who writes the methods for generic types.
  • operations with numbers with uncertainty and physical units. uncertainties: you have to write glue code, Pint package does this. Measurements.jl: works out-of-the-box with any type providing unit feature, see Examples — Measurements.jl 0.4.1-dev documentation.
  • numerical integration. uncertainties: not supported. Measurements.jl: quadgk from QuadGK.jl already works out-of-the-box with functions returning arbitrary types, including Measurement. In Measurements.jl I only taught quadgk the derivatives of the integral with respect to the interval bounds, so that one can use also Measurement objects as endpoints of the integral.

When I say that Measurements.jl works out-of-the-box I mean that I didn’t add any single line of code to get that feature. They all came for free during development. This is a huge leap for the developer, and users as well (see second last point: Measurements.jl, nor the physical units packages, doesn’t impose a choice on the user, they can use whatever package they like).

15 Likes

I think this is a really key point. I would even go as far to say that most people likely “don’t need” Julia. Most users in the general realm of scientific programming likely are stringing together results from various packages. And for those people, sure if there’s already a package in Python, why switch?

The problem comes when there isn’t a package. The next group of users are those who wrote some simple libraries in Python/MATLAB/R. This is the group that gets buffed when trying to make Cython/MEX work, finds out that PyPy doesn’t work with some libraries they want to use, and tries to use Numba but runs into difficulties the moment they have “non-trivial codes” that aren’t just a benchmark loop and involve some objects. These are the people who find Julia and go “yes, this is what I needed”.

Overtime, these converts make(/made) some really interesting packages in Julia, and that gives the incentive for the “package user” group to switch. But I think that it’s really important to understand that the vast majority of users of the language will probably not know what “broadcast fusion” is or that it occurs, that in-place operations / mutating functions are the “fast way” to write array functions, etc.

And that’s okay! It just means that Julia, because it solves the two language problem, also has a two audience problem, and one should evangelize Julia by appropriately discussing Julia to the different audiences. Some people want to hear about compiler details and type systems for writing libraries. Some people want to see cool examples of solving difficult optimization problems in JuMP, complex simulations using DifferentialEquations, and easily train large neural nets using MXNet. Both of these groups are Julia’s (potential and current) audience.

(I have now written too much so I will stop :slight_smile:)

15 Likes

I guess I can give some historical motivation here. When we started working on Julia in 2009, the Python ecosystem was nowhere near where it is today:

  • PyPy was pretty new: 1.0 was released in 2007 – it was the first version to be able to run any substantial Python apps and there were no plans at the time for supporting any of NumPy;
  • Cython was even newer: initial release was 2007;
  • Numba didn’t exist: initial release was in 2012.

Even NumPy itself was a bit rough back then. It’s entirely possible that if the SciPy ecosystem had been as well developed in 2009 as it is today, we never would have started Julia. But then again, there are a lot of aspects of Python that can still be improved upon for scientific computing – and for general computing as well. Note, however, that Julia wasn’t a reaction to Python – @viralbshah has used Python and NumPy extensively, but I had not, and as far as I know, neither had @jeff.bezanson. Julia was much more influenced by Matlab, C, Ruby and Scheme, although we certainly have looked to Python for inspiration on some designs (I straight up copied the path and file APIs, for example). Python is often a good reference for basic sanity in language and API design choices. Here are some of the areas where we felt Julia could do better than existing languages, Python included.

Real metaprogramming. There’s no good reason for a dynamic language not to do this, but aside from Lisps, they don’t. I’m not sure why. The greatness of JuMP is due to Julia’s metaprogramming facilities – and, of course, the brilliant API design work of the JuMP folks to make effective use of metaprogramming to allow targeting different optimization backends efficiently.

Make numerical programming a priority. Technical computing deserves a really first-class modern language that’s focused on it. After a couple of decades, Python finally threw NumPy a bone and added the @ operator for infix matrix multiplication. Numerical computing is just not a priority for the Python core devs – and reasonably so, since that’s not the language’s primary focus. Julia, on the other hand, has been about numerical and scientific work from the start, so we care about usability and convenience in those areas. This is not a diss on Python in any way, it’s just a fact that numerical programming will never be a top priority for Python the way it is the top priority for Julia.

Have an expressive built-in language for talking about types. Every numerical computing library ends up needing to talk about types. At a minimum, you can’t do anything practical without typed arrays and being able to pass them to C and Fortran libraries like FFTW and BLAS – both of which require talking about types. As a result, there seems to be a Greenspun’s Tenth Rule analogue for numerical libraries, which I’ll posit as follows:

Any sufficiently complicated numerical library in a dynamic language, contains an ad-hoc, informally-specified, bug-ridden, slow implementation of a type system.

I just made this up, but it’s true – the SciPy ecosystem is littered with dozens of incompatible DSLs for talking about types with varying degrees of complexity and expressiveness. Python 3.5 has standardized on syntax for type hints, but from what I can tell, it still isn’t sufficient for what most numerical libraries need and probably never will be (priorities again) – it’s more designed for type checking in the presence of duck typing. If it quacks like a Float64 and walks like a Float64, is it a Float64? Not necessarily, if you really need your code to be efficient. Julia circumvents Greenspun’s Eleventh Rule (the one I just made up) by fully committing to having a type system and making the most of it for dispatch, types, error checking, code generation, and documentation.

Don’t make numeric types or operators special. What does it take to make “primitives” like Int and Float64 normal types and operators like + normal functions? This requirement leads you seemingly inexorably to dynamic multiple dispatch – which happily turns out to also be incredibly useful for many other things. Not least, neatly addressing the expression problem, which allows Julia libraries to share types and generic algorithms remarkably effectively. Multiple dispatch was a strange experiment at first, but it has worked out surprisingly well. I’m still regularly shocked that more languages haven’t used this as a paradigm before Julia. Dylan is the only language that’s taken multiple dispatch nearly as seriously as Julia. (And apparently Perl 6, from what I hear.)

Use modern technologies and design for performance from the start. These seem separate, but they’re not really. Language performance dies the death of a thousand cuts. No single design choice destroys performance – it’s an accumulation of little choices that kills it, even though each would individually be survivable. These choices are made in the context of technologies. Designs that are easy to implement with one technology may be hard to implement with another. So we should design for modern, high-performance technologies from the beginning.

Most dynamic languages were designed to be interpreted, and interpreters are relatively slow. So no one thought that much about speed when these languages were designed. Later, when you try to apply modern technologies like JIT and LLVM, choices that were made on the assumption that a language would be interpreted are often a big problem. For example, representing objects and scopes as dictionaries is a reasonable choice in an interpreter. From there, it’s perfectly natural to let people manipulate objects and scopes just like dictionaries – after all, that’s what they are. But making code fast requires making local variables and objects completely vanish – the fastest computation is the one you don’t do. But once you’ve committed to exposing dictionary semantics and people have written tons of code using those features, it’s too late – you won’t ever be able to effectively optimize that code.

Julia, on the other hand, has never had a full interpreter. When we started the project, we looked at compiler frameworks, picked LLVM, and started using it right away. If a feature wasn’t easy to generate code for, it was never added to the language in the first place. Often there are slight variations on features that make them vastly easier to generate code for but don’t really eliminate any expressiveness. eval that operates in local scopes? Impossible. eval that operates only at global scope but allows you to define new functions (which have local scopes)? No problem. A lot of Julia’s design fell directly out of making it easy to generate LLVM IR – in some ways, Julia is just a really great DSL for LLVM’s JIT. Of course, code generation frameworks aren’t that different, so Julia is really just a nice DSL for any code generator that can be invoked at runtime.

103 Likes

Hi John,

I was actually at that conference in Santa Barbara, and I know well the community you were addressing to (not at your talk, though). As a long-time Julia user, and member of that community, I guess that the best way to remove all skepticism would be simply to rewrite channelflow in Julia, and compare speed, lines of code, elegance, hack-ability, etc.

This seems so much a crazy idea that makes me want to start doing it myself!

Best,

Davide Lasagna

6 Likes

@StefanKarpinski, very enlightening writing.

I really like the fact that Scientific Computing is a first priority.
Going forward, do you think it is achievable, in the future, to write code in its Vectorized Form + Decorations in Julia and have performance of highly optimized (SIMD + Threading) C Code?

I liked the guide line that what sacrifices performance won’t get in.

Python, in my opinion, went too fast with its flexibility and types of variables.
Sometimes its code looks like Black Magic.

I prefer language with less built in types which sometimes means you code more but also the code is simple to grasp.
I hope Julia will remain simple in its Base.

@John_Gibson,
I think Julia’s charm is how deep it lets you go.
For instance, @inbounds macro, Choice whether to have SIMD or not, use Multi Threading or not, etc…
It means that on the surface you have the usefulness of MATLAB like language yet when you want to speed things up and get your hands dirty you can do it within the same language, Julia (As opposed going C in MATLAB).

Yet still, the challenge of Julia will be to close the gap of performance when writing Vectorized “MATLAB Like” code to what can a user get when going down.
Hopefully it is doable and if it is doable, Julia seems to be the best candidate to achieve it.

Thank You.

2 Likes

I think Julia is about giving you the tools to build these things. If you want “the tool for auto-optimizing scientific computing code”, that would ParallelAccelerator.jl

In some sense, that is the Numba equivalent in Julia. It’s interesting that people then compare Numba to Julia itself, because Julia is just a programming language and not an auto-accelerating engine (though some people try to treat it like that). ParallelAccelerator.jl is a lot like Numba, letting you decorate functions (with macros) as “this is a simple mathematical code with a lot of vectorized operations, please multithread/parallelize it”. Over time, many features from ParallelAccelerator have made it into Base (and that seems to be continuing), though not all of them have so there’s still a lot there that you won’t get from Base.

But I think that offering the full suite of auto-optimizations requires at some point you start adding “magic”, which I think is not what Julia what’s to go. Julia wants to always be concise and precise, with no overhead added. But it doesn’t want to be magical. My interpretation of the “current idea of what Base Julia should be” (as collected from reading PRs/issues. Note I am not a core dev but am trying to give my summary of what seems to be the emerging consensus) is that Julia should offer the performance of low order languages without having to leave Julia proper, and it should try to give you good syntax for doing so, and automatically apply compiler optimizations, but the results of code should be easy to reason about, with any other magic left to macros and packages.

5 Likes

@ChrisRackauckas,

Julia is a programming language yet was conceived with Scientific Computing in mind.
This means the audience is expecting to be treated different than “Julia allows you to do this and that…”.

As I said, It is amazing Julia allows you do that but on the other side of things Julia should close the gap and have things work as fast as it was optimized.
This is the real deal.

Hence, my question remain regarding Julia’s future:

Going forward, do you think it is achievable, in the future, to write code in its Vectorized Form + Decorations in Julia and have performance of highly optimized (SIMD + Threading) C Code?

Royi, here’s how I understand it :slight_smile: “yes, it [will be] achievable” in Julia but there won’t be default magic solutions. So: if you are willing to think about and add (possibly technical) decorations, you will be able to achieve high optimisation; if you want automatic magic, you will need extra packages. (Right now, threading seems to be technical/magical, but, as the design matures, some magics may become dependable and reliable in the future.)

My summary of this thread would be: in Python you can work hard to make some vanilla parts fast; with Julia you can make anything fast.

But Julia will make it easy, not do it for you.

4 Likes

Yes, I already answered that. Write code in a vectorized form in Julia, put using ParallelAccelerator at the top of your code, add the @acc decoration, and you will get highly optimized SIMD + Threading C code. You don’t need to wait for the future: you can do that today!

People coming from MATLAB, Mathematica, and somewhat Python and R (due to their expansive Base/SciPy libraries) seem to have a very expanded sense of “what is the core of a scientific computing language”. Julia early on did as well, but there has been a large recognition that there is no reason for everything to be in Julia’s Base. Instead, scientific computing revolves around the many different algorithms which specialize to the problems people are trying to solve, and this massive amount of algorithms is better handled by a robust package system. Since Julia doesn’t privilege Base (defining types in packages has the same no-low overhead as they do in Base), these things might as well be packages, separated from the Core of the language. Since using a package only take 1 LoC (Pkg.add("ThisPackage"); using ThisPackage), it’s not like it really adds user burden. Then question that should be asked is no longer “what could be added to Base” as much as it is “what should be or needs to be parts of Base?”.

There doesn’t seem to be a solid line drawn for “what should Base be”, but for example in v0.6 numerical quadrature is no longer part of Base, but it the well-functioning well-maintained essentially no different from when it was in Base package, QuadGK.jl. Again, some people coming to Julia might go “but I expected Julia to just have a fast, parallel, …, method for numerical quadrature, eww why is it a package?”. I think the answer is along the lines of: why not make it a package? There was nothing about Base that required it to be there, and most codes don’t use it. And going further down the line, even things like linear algebra may be moving out to packages, except as “default packages”. So any idea that “the answer is in a package, so it’s not really answered by Julia” is very wrong headed in some sense, since these packages are written in Julia themselves and are no harder to use than Base code.

But further discussion on this topic should be a different thread. There has also been a lot already written on this topic, so you may want to search around Google for “default packages” and “standard library” in Julia.

1 Like

@ChrisRackauckas, I really appreciate you effort but you’re totally missing my point.

I’m talking about the core language and core language in my point of view is arrays and the basic math operations (In Multi Dimensional Space, Like Linear Algebra).

Namely, C isn’t base for me just because operation on arrays aren’t built in.

Now, this is the context of my question and it was a theoretical one.
Namely, how much data from the programmer (Explicit and Implicit) does the Julia needs to create a perfectly optimized Machine Code.

Is that amount of data is something reasonable to have in few lines of code?

This is the question.
It has nothing to do with packages and please don’t refer me to those.
I’m asking about the concept of Julia.

The way I see it, the concept of the developers was how much we can simplify (Make it User Friendly) the data a compiler needs to create the most efficient code.
If at the end Julia will teach the world this amount of data is only few decorations and Language assumptions it will be huge.

If not, then probably the standard way of doing specific things fast will stay with us until the next time someone tries it.

2 Likes

Isn’t it getting a bit offtopic now? I appreciated the discussion related to Python. - Maybe the posts starting with "Going forward, do you think it is achievable, in the future, to write code in its Vectorized Form + Decorations in Julia and have performance of highly optimized (SIMD + Threading) C Code?" could be splitted into its own thread?

3 Likes

@swissr, I think it is related to Python.

Python (As maybe MATLAB as well) doing things a little differently.
They optimize Hot Spots in the language.

Julia is trying to be born fast.
The question is, trying to be overall fast, does it mean that also in hot spots be as fast as others?

MATLAB and Python just write things in C (What’s needed to be fast).
Julia is more like an intelligent framework to tell a compiler how to create efficient code form simple intuitive syntax.

If you address the OP question, it seems only if the the way Julia chose doing thing will end up being faster there is a logic in it.

3 Likes

The very simple answer is, if you use the -O3 flag, then the compiler already automatically will use SIMD when appropriate (try a simple loop. Make sure you’ve re-built the system image or made Julia from the source). As for adding threading, the answer looks like it will be some macro (and you know about this since you’ve commented in the issue):

But…

What I am trying to get across is that those things will actually be packages. There are already many discussions about how to move the linear algebra parts out to a package, and it looks like it will be done before 1.0 (it’s waiting on the implementation of a system for default packages). Even the basic math functions like sqrt: Julia currently uses a version of OpenLibm, but there are some very advanced experiments with Julia implementations like Libm.jl and Amal.jl for replacing all of this basic math with packages (some packages may be loaded by default, but they will still just be packages).

So what I am really trying to hammer home is that the idea that “no it’s not Julia if it’s a package, let’s only talk about ‘Julia’” as though there is some privileged Base Julia which rules above all others: that idea is very wrongheaded. I think it’s safe to say that very soon, all of what you think of as “basic math operations” will be “default packages”: just standard Julia packages which by default are loaded into the system image.

So yes, even your examples of what do not refer to packages, actually refer to what will be packages. That means I think it’s safe to say you cannot talk about Julia as though it’s isolated from its package ecosystem. I think at this point, the concept of Julia includes packages, as seen by how things which were considered essential enough to be in early Julia have now moved out into packages. In that sense, you can already do what you’re asking in Julia using ParallelAccelerator, and the equivalent to Python+Numba in Julia is actually Julia+ParallelAccelerator.

Yes, that question hijacks the thread and all responses after it should be a separate thread about the future of auto-optimizing (SIMD, multi-threading, etc.) Julia code. Auto-optimizing Julia code a la Numba is something related to “making Julia fast”, but in some sense that is more about making Julia match hardware, not the design of Julia itself and why it makes sense as a way to specify scientific programs.

6 Likes