Reluctance to switch to Julia; PyPy and Cython

Thomas_Benardo · July 15, 2018, 2:57pm

I’m quite reluctant to switch completely to Julia, the main reason being static compilation.
Python is fast enough with {fully-compatible} PyPy (close to Julia if given a chance to warm up).

{limited} Python can be statically compiled using shed-skin (faster than Julia, almost always).
{fully-compatible} Python can be compiled using Cython for an enormous speedup (faster than Julia if used wisely).
{limited} There is the RPython metacompiler, which is also very fast (comparable to Julia).

And if I can compile Python, which gives me the freedom to code normally (no obfuscation required) at the cost of a little extra overhead from the tools that speed it up, I think Julia isn’t worth it! So Julia is fine, but it cannot replace Python.

ChrisRackauckas · July 15, 2018, 4:31pm

Other than the fact that you have code obfuscation since you don’t have generics, a parametric type system, and compile time controls. Oh and you don’t get interprocedural optimizations. These are all of the things required to make nontrivial codes performant. As mentioned before, if you are writing simple scripts with standard floats these Python tools are fine, but at this point that’s the scientific computing of the past.

braamvandyk · July 15, 2018, 5:50pm

Another need for static compilation is to write add-ins for commercial applications. For example custom reactor models for process simulation packages.

Thomas_Benardo · July 16, 2018, 2:26pm

Actually, Cython is production-ready, and it is used in far more complex programs than ones that just crunch floats. I think I don’t need to list the good scientific computing libraries for Python; everyone knows Python has some of the best (and very thoroughly optimized). A smart programmer would not require interprocedural optimizations, because we can easily memoize the functions we think are the bottleneck. Apart from that, there really are no other optimizations the programmer can make without changing much code, but that is actually enough. The reason no more optimizations are required is that Cython hands the translated code to one of the most heavily optimizing compilers available today (gcc/clang/mingw), so we have a lot of "compile time controls". That compiler then inlines function calls, eliminates dead code, etc. Cython translates Python code to C with the programmer’s choice of types (you can choose dynamic as well as static typing). And Cython is 100% compatible with Python (actually it’s a superset of Python). It’s common to see both the compilation time and the runtime of a Cython+NumPy script beat Julia’s because of aggressive optimizations and freedom with types. But I like Julia more than Python, and I would like to switch to Julia completely if only static compilation were added. It surely would not do any harm; it would just add to Julia’s features!

PetrKryslUCSD · July 16, 2018, 2:47pm

I am really doubtful of claims of cython being in general faster than Julia. I have used cython in an FEA package, and the performance was still worse than MATLAB (for the same problem, obviously). Julia beats MATLAB handily in my applications.

StefanKarpinski · July 16, 2018, 3:23pm

These posts have a fairly strong whiff of somewhat blind Python fandom. Which is fine and all, but you did necro-post on a seven-month-dormant thread, so here are a couple of counter-points:

You mention both PyPy and Cython. However, these are not compatible as far as I’m aware. Which do you actually use? There is also Numba which is yet another incompatible approach to speeding up Python. They all have quite different pros and significant cons—and you have to choose between them so you cannot get the best of all worlds at once. A big part of the premise of Julia is to get the benefits of all of these approaches in a single system by default without compromise. If that’s not compelling enough, ok, but that’s what you get.
Last I was aware, PyPy sometimes gets to C speeds but more often is around 1/5 the performance of C, not due to any shortcoming in the PyPy implementation (which is very impressive), but due to the many inherently slow, unpredictable dynamic behaviors that are just part of the way Python works. See this presentation by Armin Ronacher (creator of Flask and PyPy contributor) on why it’s so hard to optimize Python because of the very design of the language.
PyPy is still not 100% compatible with NumPy and other numerical Python libraries. The whole PyPy/NumPy strategy is, as far as I’m aware, to just reimplement all of NumPy in RPython. That’s a huge burden, as witnessed by the fact that it’s still not done and they started around the time we announced Julia (2012). Worse still, it’s inherently only ever temporarily done: any time there’s a new version of NumPy released, the same work that went into the NumPy release needs to be repeated to support PyPy; and it’s not a straight port since NumPy is written in C while PyPy’s version is written in RPython. It’s also unclear how many people are using the whole NumPy/PyPy stack, so it’s hard to say how well maintained it is going to be.

This whole discussion of Python seems fairly off-topic for the original subject of this thread. The only connection I can see is that there is an extortion argument along the lines of “Python is great for me; unless Julia adds static compilation, I will use Python.” That’s cool—carry on with Python. We will certainly support fully static compilation at some point in the future as resources and priorities allow, but not because of this kind of threat. I’m going to go ahead and split this discussion off into a separate thread since it’s off-topic.

ChrisRackauckas · July 16, 2018, 5:03pm

No, memoization is not a fix-all here. For clarity, memoization is simply storing input->output mappings for future reuse. There are cases where this applies, but it’s not pervasive. If you want to speed up codes where you’re doing a lot of ODE simulations with the same options but different initial conditions, or solving an ML model like a random forest with a few top-level literal options, none of the internal parts will likely ever see exactly the same floating point calculations, and thus memoization wouldn’t speed anything up here (in fact it would slow things down, since it would build a useless table).
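The point about memoization can be illustrated with a minimal sketch using Python's standard `functools.lru_cache`. The function `step` and the input values are hypothetical stand-ins for an expensive inner computation, not code from any real solver; the sketch only counts cache hits and misses for repeated versus distinct floating-point inputs.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def step(u: float) -> float:
    """Hypothetical stand-in for an expensive inner computation."""
    return u * u + 1.0


# Case 1: the same argument is passed every time, so the table pays off.
for _ in range(1000):
    step(2.0)
repeat_info = step.cache_info()

step.cache_clear()

# Case 2: 1000 distinct initial conditions, as in a parameter sweep.
# No call ever repeats an earlier argument, so every lookup misses.
for i in range(1000):
    step(1.0 + i * 1e-3)
distinct_info = step.cache_info()

print(repeat_info)
print(distinct_info)
```

In the second case the cache grows to one entry per call and never returns a stored result, which is the "useless table" described above.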

I realized that fully explaining why what you’re saying makes no sense requires going into technical details and describing how the language features allow for a ton of optimizations and package-building utilities. This is too much for a forum post, so I’ll post a blog post soon.

Joshua_Bowles · July 16, 2018, 7:36pm

As a newcomer, maybe it’ll help if I share my story. (Also, this seems to have devolved into an opinion piece.) I entered the AI world with LISP and Prolog, and took part in and observed the rise of Python. I also had a front-row seat to the rise of Ruby on Rails.

I see many of the same environmental, cultural and tech forces at play around Julia, except wrt the math and science domains, which I also see practical AI (data science, computer vision, natural language processing) needing to borrow more and more from.

The rise of Python and Rails happened in spite of C++ and Java. I see Julia having a high probability of doing the same thing, ironically in spite of Python’s success. Julia seems to lower the barrier of entry into computational math by constructing a good type system and exposing all the low-level code to anyone.

I don’t choose Julia because it’s the next Python; it’s just the next step in tech progress, IMO. I could still do many things with Python, but I choose not to because I am betting that Julia is the future.

I’m actually much happier with static type systems and AOT compilation, but I understand the design decisions in Julia and I’m cool with them. I’m much happier solving my problems than I am using AOT, so I make the trade-off.
For me Julia is an investment in the future, and for that it’s worth the cost to switch. But I’ve also gained more insight into how programming languages work and broadened my understanding by learning Julia.

Hope that helps give another point of view and an example for why I personally switched, I’m sure I’m not the only one who feels this way.

Thomas_Benardo · July 17, 2018, 2:49am

Just to clarify the confusion, Cython compiles Python to C (Cython: C-Extensions for Python).
Naive Cython won’t optimize much; you have to tell it specifically where and what to optimize (it needs to be used wisely). See “The Performance of Python, Cython and C on a Vector” in the “Cython def, cdef and cpdef functions” 0.1.0 documentation. I was referring to Cython. And that’s outdated information: PyPy now supports most libraries once pip is installed. There was an incompatibility because of cffi, but NumPy and most other libraries have cffi bindings. Anyway, I understand that my arguments were praising Python too much, but I’m still a Python programmer who recently started using Julia. Annotating types in Cython for speedups is a pain: you need to know C, and you waste time you could have used elsewhere, whereas Julia does it without effort on my part. Just curious, does Julia beat LuaJIT? Also, if static compilation is not going to be added, I think an intermediate representation of the code would be beneficial, the way Python compiles .py to .pyc/.pyo only once, so that we can distribute the .pyc, which is bytecode. That would also help in securing our code. It would be an advantage for Julia too, since it wouldn’t have to parse the script each time.
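The compile-once-then-distribute workflow for .pyc files mentioned above can be sketched with the standard library alone. This is a minimal illustration, assuming a throwaway module name `demo_mod` and a helper `compile_and_load` that are made up for the example; it compiles a source file to bytecode with `py_compile`, deletes the source, and imports the module from the .pyc through `importlib.machinery.SourcelessFileLoader`.

```python
import importlib.util
import py_compile
import tempfile
from importlib.machinery import SourcelessFileLoader
from pathlib import Path


def compile_and_load(source: str, module_name: str = "demo_mod"):
    """Compile `source` to a .pyc once, then import it from the bytecode alone."""
    workdir = Path(tempfile.mkdtemp())
    src_path = workdir / f"{module_name}.py"
    pyc_path = workdir / f"{module_name}.pyc"
    src_path.write_text(source)

    # Parse and compile once; doraise=True turns compile errors into exceptions.
    py_compile.compile(str(src_path), cfile=str(pyc_path), doraise=True)

    # Remove the source to show that only the bytecode is needed from here on.
    src_path.unlink()

    loader = SourcelessFileLoader(module_name, str(pyc_path))
    spec = importlib.util.spec_from_loader(module_name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


mod = compile_and_load("def square(x):\n    return x * x\n")
result = mod.square(7)
print(result)
```

Bytecode of this kind can still be inspected with the standard `dis` module, so it hides the source text rather than protecting the logic.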

kristoffer.carlsson · July 17, 2018, 10:24am

Julia already does this with precompiled modules (.ji files).

Liso · July 17, 2018, 10:43am

Could new Pkg.jl (automatically) precompile installed packages?

kristoffer.carlsson · July 17, 2018, 10:49am

It has a command called precompile, which precompiles all packages in the project that need precompilation. So you can do add A B C; precompile. Some people have requested automatic precompilation just by writing add A, but I am not sure that is the way to go. It seems better to opt in to significant extra work.

Tamas_Papp · July 17, 2018, 10:52am

I find this fantastic, and also totally sufficient. I would not prefer automatic precompilation.

Diego_Javier_Zea · July 17, 2018, 10:59am

Can precompile do the precompilation of different packages in parallel? (I was thinking of something similar to vim-plug’s parallel installation.)

Liso · July 17, 2018, 11:31am

I like it too

But I’d prefer automatic precompilation.

Maybe some explanation of why not to do it automatically would be good too.

(My motivation is that plenty of examples are too annoying for beginners because they involve more obstacles than necessary.)

Thomas_Benardo · July 17, 2018, 11:39am

Well, precompiling can help my code to be secure enough. That has resolved my issue, I didn’t know Julia can precompile them. But if there are any future plans for adding static compilation, it would be good.

ssfrr · July 17, 2018, 1:30pm

Unless I’m misunderstanding, the user doesn’t opt-in to precompilation, it just happens when they run using A instead of add A, right?

kristoffer.carlsson · July 17, 2018, 1:35pm

Yes.

airpmb · July 18, 2018, 2:59am

+1 for optional automatic precompilation. +2 for automatic precompilation that happens seamlessly in the background, somehow.

+100 for automatic precompilation that is done on someone else’s machine, cached, and then sent to mine when I add, such that precompilation appears to magically take zero time!

StefanKarpinski · July 18, 2018, 4:20am

It may not be obvious to people but package precompilation is not like separate compilation of shared libraries in C or other compiled static languages. The validity of the precompiled files depends on the context in which the package is used—the precise version of Julia and of all packages that it depends on, directly or indirectly. That context may not be the same when you install a package version as when you use it. So automatic precompilation on package add runs a significant risk of compiling the package twice: once when you add it and again when you use it. And precompiling a package version when you add it to one environment does not mean it won’t need to be precompiled again when you use that same version from a different environment. So if you think that automatic precompilation on install eliminates precompilation on use, that’s just not the case. On the other hand, precompiling packages when they are used means that there was at least one time that the package was used in that context, which makes it much more likely that it will happen again and the compilation effort will not have been totally wasted. All of this is why there’s some reluctance to do this. We can certainly try it, but it may not work as well as people hope.
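The context dependence described above can be pictured with a toy model. The sketch below is not how Julia actually stores or validates its precompile files; the `cache_key` function, the version strings, and the package names A and B are all hypothetical. It only shows why a cache keyed on the full context (Julia version plus every resolved dependency version) can miss at use time even though the same package version was compiled at add time.

```python
import hashlib
import json


def cache_key(julia_version: str, package: str, resolved: dict) -> str:
    """Toy fingerprint of the full context a package is loaded in (hypothetical)."""
    payload = json.dumps(
        {"julia": julia_version, "package": package, "deps": resolved},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()[:12]


# Hypothetical environments: package A is at the same version in both,
# but its dependency B was resolved differently by the time A is used.
env_at_add = {"A": "1.2.0", "B": "0.5.1"}
env_at_use = {"A": "1.2.0", "B": "0.6.0"}

key_add = cache_key("1.0.0", "A", env_at_add)
key_use = cache_key("1.0.0", "A", env_at_use)
key_again = cache_key("1.0.0", "A", env_at_use)

# Work done at add time is wasted if the context changed before first use.
needs_recompile = key_add != key_use
# Compiling at use time matches the context that is likely to recur.
cache_hit = key_use == key_again

print(needs_recompile, cache_hit)
```

Under this model, compiling on add only pays off when the environment at use time is identical, which is the risk of double compilation the post describes.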
