How hard would it be to implement Numpy.jl, i.e. Numpy in Julia?

As @sdanisch said, I was comparing to what Pythran, Numba, etc. do now. The limiting factor here is Python semantics: you can’t fuse loops as an optimization unless the compiler (or transpiler) can prove that doing so won’t change the results. In practice, this can only be done for a small set of types and functions recognized by the compiler, which is why tools like Pythran and Numba are effectively limited to NumPy arrays and a particular set of library functions on those arrays, and get stymied when they see a call to arbitrary Python code. A Julia transpiler or backend would have zero advantage here.
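To make the "prove it won’t change the results" point concrete, here is a minimal sketch in plain Python (no NumPy). The `Traced` class and the `unfused`/`fused`/`run` helpers are hypothetical names invented for this illustration; the idea is only that a user-defined type can make the order of element-level operations observable, so the fused and unfused loops are not interchangeable in general.

```python
class Traced:
    """Hypothetical scalar whose arithmetic records each call in a shared log."""

    def __init__(self, value, log):
        self.value = value
        self.log = log

    def __mul__(self, k):
        self.log.append(f"mul {self.value}")
        return Traced(self.value * k, self.log)

    def __add__(self, k):
        self.log.append(f"add {self.value}")
        return Traced(self.value + k, self.log)


def unfused(xs):
    # Two passes with a temporary, like evaluating `tmp = xs * 2; tmp + 1`.
    tmp = [x * 2 for x in xs]
    return [t + 1 for t in tmp]


def fused(xs):
    # One pass, no temporary: what a loop-fusing compiler would like to emit.
    return [x * 2 + 1 for x in xs]


def run(kernel):
    # Run a kernel on fresh inputs and return the numeric results and the call log.
    log = []
    out = kernel([Traced(1, log), Traced(2, log)])
    return [t.value for t in out], log


values_a, log_a = run(unfused)
values_b, log_b = run(fused)

print(values_a == values_b)  # the numeric results agree
print(log_a)                 # all multiplications first, then all additions
print(log_b)                 # multiplication and addition interleaved per element
```

The numbers match, but the observable sequence of calls does not, so a compiler that cannot see inside `__mul__` and `__add__` has no license to fuse.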

In fact, I wrote a long blog post explaining Julia’s approach to loop fusion for vectorized operations, the challenges of this problem, and how we are able to do it for generic user-defined functions and types. But the advantage we have here is lost if the front-end code is Pythonic.

You can write type-generic Python code, but you can’t compile it to fast code in general, because the semantics of the language do not allow it. This is why Pythran, Numba, etc. can only do a good job on a very small subset of the language: only one container type (NumPy arrays), only NumPy scalar types (or specially annotated struct-like types using language extensions), and very limited polymorphism.
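As a rough illustration of what "the semantics do not allow it" can mean, the sketch below uses a hypothetical `Meters` class and a generic `total` function (both made up for this example). Python lets the meaning of `+` for an existing type be rebound while the program runs, so a compiler cannot bake in one definition of `acc + x` without guarding against this.

```python
def total(xs, start):
    # Type-generic accumulation: works for anything that supports `+`.
    acc = start
    for x in xs:
        acc = acc + x
    return acc


class Meters:
    """Hypothetical user-defined scalar type."""

    def __init__(self, n):
        self.n = n

    def __add__(self, other):
        return Meters(self.n + other.n)


before = total([Meters(1), Meters(2)], Meters(0)).n


def add_as_max(self, other):
    # A different meaning for `+`, installed at run time.
    return Meters(max(self.n, other.n))


# Rebinding the special method on the class changes what `acc + x` does
# inside `total`, even though `total` itself is untouched.
Meters.__add__ = add_as_max

after = total([Meters(1), Meters(2)], Meters(0)).n

print(before, after)
```

Restricting the input to a fixed set of array and scalar types is one way for a compiler to rule this kind of thing out.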

The question is how much information is easily accessible at compile time. Even though every program in every computer language is “fully described” in the sense of prescribing a deterministic sequence of actions (modulo stochastic algorithms and undefined behavior), there is often a lot of information that is nearly impossible to get in advance without actually running the code.
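A small sketch of information that only exists at run time: in the hypothetical `parse_number` helper below (a name invented for this example), the return type depends on the value of the input string, so the type of anything computed from its result cannot be known without running it.

```python
def parse_number(text):
    # Hypothetical helper: returns an int, a float, or None depending on the input.
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None


def describe(text):
    # Report the name of the runtime type produced for a given input.
    return type(parse_number(text)).__name__


kinds = [describe(s) for s in ["3", "3.5", "abc"]]
print(kinds)
```

Code like this is perfectly deterministic, yet a compiler looking only at `parse_number(s) + 1` cannot tell whether it needs integer addition, floating-point addition, or an error path.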

Basically, Julia is designed to provide more information at compile time than traditional dynamic languages. This follows from a number of properties of the language and standard library that Python doesn’t have: type-stable libraries, final concrete types, parameterized immutable types, and so on.

Realize that people have tried for years now to develop an effective general-purpose compiler for Python. It’s a hard problem! Why do you think that projects like Numba and Pythran have targeted such a small subset of the language?

The backend of a Python compiler is the easy part, and something that a Julia transpiler would add nothing to — the challenging part is the front-end analysis.
