No one person can perform a “kvetchfest”; it’s a collective activity in which everyone decides it’s a good time to complain about the same thing, in this case compiler latency, which is already well known to be a problem. Unfortunately I’m not really the one to say more about it, but I have asked @jeff.bezanson to weigh in. The nature of compiler work is that you don’t see much surface change, but that doesn’t mean a lot of hard work isn’t being done, or hasn’t been done already. Many packages have seen an order-of-magnitude reduction in startup time, but not all of them: these things are a trade-off, and some have instead gotten slightly slower.
Thanks for the update, I look forward to future progress. Just to clarify, I’m less worried about “compile time” than about the following two things:
- `using` time (the second time; precompilation time is not a concern). This effectively limits the “stack depth” of package dependencies, since at some point I’d rather wait a year for Julia v1.7 than 20 seconds every time. PackageCompiler.jl already solves this, though I haven’t figured out how to incorporate it into my workflow when packages change. Having a timeline would let me know whether to invest energy into learning PackageCompiler.jl or just wait it out.
- Type-inference bugs that trigger minute-long compiles or sometimes crashes (see for example this recent issue). This pushes tests over the time limit on Travis, or leads to hours of trying random changes to work around the bug. Unlike item 1, there is no clear workaround other than to wait, but of course Jeff has limited time.
I am not sure what to make of these as alternatives: improvements arrive with new releases (unless you follow, e.g., master, but then you already spend quite a bit of time compiling).
But given 3 minor releases a year, you are unlikely to get 1.7 within a year anyway.
Sorry, I was being facetious. The main question is “should I invest time in setting up PackageCompiler.jl or are improvements around the corner?”. It sounds like the answer is “invest time in PackageCompiler.jl”.
(I’d be happy to compile master regularly. Compile time is not the problem; it’s the waiting time after restarting Julia, which is a regular occurrence in package development.)
Just my 2 cents here: while compiler latency and REPL non-reactivity are closely coupled, I disagree that the complaints were targeted at compiler latency. The whole discussion is about: are there creative solutions, outside of ‘just’ making the compiler faster, to give a more reactive REPL? The line in “Why We Created Julia” is “We want it interactive and we want it compiled.”
The outline from Stefan’s post linked above is still broadly accurate. Our priority at the moment is multithreading, and as soon as that mostly works we will return to focusing on latency. Looking at the timeline of all this, I can’t help but agree that progress has been slower than I expected.
Latency is a difficult, multi-faceted issue. You are quite right to specify which particular kinds of latency matter to you, because there are a few separate, mostly-unrelated sources of it: (1) package loading (consisting mostly of method table merging, and a bit of re-compilation), (2) the general speed of type inference, (3) type inference bugs or quasi-bugs that cause it to run an exceptionally long time, (4) front end (parsing and lowering; not the biggest issue right now), and (5) LLVM optimizations. Again, these are all mostly unrelated and different packages or workflows can hit different ones.
While there have been a few modest commits to master that chip away at this, there is an iceberg underneath of things we have tried, experiments we have run, and of course more things we are planning to try. Some things we try don’t work, have no effect, or have a much smaller effect than hoped. Some give nice improvements at the expense of, e.g., worse type information. Because of this it is very hard to promise “X% improvement by Y date”. (Side note: in case anybody still doesn’t believe that `return_type` is a bad idea, using it means we can’t speed up the compiler without breaking your code.)
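To illustrate the point with a sketch (the exact inferred types here are version-dependent, and the downstream code is hypothetical): querying inference via `Base.promote_op` or `Core.Compiler.return_type` makes the compiler’s output part of your program’s observable behavior, so any change to inference can change what your program does:

```julia
# Sketch: why depending on inference results is fragile.
# The inferred return type of `f` called with an Int might be
# Union{Float64, Int64} today, but a smarter (or deliberately lazier)
# compiler could legitimately infer something wider or narrower.
f(x) = x > 0 ? x : 0.0

T = Base.promote_op(f, Int)
# If downstream code assumes a specific T (say, it allocates a Vector{T}
# and later asserts element types), then any change to inference can
# change T and break that code -- which is exactly how return_type
# constrains future compiler improvements.
```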
In the hopefully near future we will be trying things like multi-threading code generation, tiered compilation (running code in an interpreter first and gradually transitioning to the compiler), various changes to type inference, etc.
In the meantime, there are a couple of tricks for working around latency issues:
- Try running with `-O0` or `-O1`
- Try running with `--compile=min`
- Try applying this patch:
```diff
diff --git a/base/compiler/params.jl b/base/compiler/params.jl
index 8f87feb734..499b44a9f6 100644
--- a/base/compiler/params.jl
+++ b/base/compiler/params.jl
@@ -59,7 +59,7 @@ struct Params
     #=inlining, ipo_constant_propagation, aggressive_constant_propagation, inline_cost_threshold, inline_nonleaf_penalty,=#
     inlining, true, false, 100, 1000,
     #=inline_tupleret_bonus, max_methods, union_splitting, apply_union_enum=#
-    400, 4, 4, 8,
+    400, 1, 4, 8,
     #=tupletype_depth, tuple_splat=#
     3, 32)
 end
```
which can significantly cut inference time.
Perhaps an alternative take on this is “compiling absolutely everything instantly is really, really hard, so it should be as easy as possible to ‘keep’ compiled code”.
Perhaps it would make sense to prioritize PackageCompiler more? Things have certainly improved but the process of swapping system images is pretty fiddly right now and, at least in my experience, thiings often go wrong when trying to use PackageCompiler. It seems to me that if there were a really slick stdlib PackageCompiler, where people could simply call
load_image (or something similarly simple), the bulk of everyone’s plotting stuff would already be compiled anyway: you’d just put a
load_image into your
startup.jl and forget about it.
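A minimal sketch of what that could look like, assuming the hypothetical `load_image` helper described above (the name and API are invented here, not an existing PackageCompiler function):

```julia
# ~/.julia/config/startup.jl
# Hypothetical sketch: `load_image` does not exist today. The idea is
# that a stdlib PackageCompiler would locate the cached sysimage for the
# current environment and use it, falling back silently to the default
# image when no custom one has been built.
if isdefined(Base, :load_image)   # guard so startup.jl still works on plain Julia
    Base.load_image()             # swap in the precompiled plotting sysimage
end
```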
I don’t really know what I’m talking about, but making compilation blazing fast sounds like an incredibly difficult problem to me, period. Stashing compiled code, on the other hand, seems like it should be far easier.
In a blissful world, a `superprecompile` command in the package manager would run PackageCompiler on my current environment, and then whenever I switch environments, if there is a custom sysimage for that environment, it would be used automatically. Bonus points if I could configure the package manager to run `superprecompile` automatically whenever I make a change to my current environment (via `add` or something like that). The key would really be that the package manager manages these sysimages for me.
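A sketch of how that workflow might look at the REPL; `superprecompile` and the automatic sysimage selection are hypothetical, not existing Pkg features:

```julia-repl
(MyProject) pkg> add Plots DataFrames   # a normal environment change

(MyProject) pkg> superprecompile        # hypothetical: builds a sysimage
                                        # matching exactly this Manifest.toml
# ... PackageCompiler runs, writing e.g. ~/.julia/sysimages/MyProject.so ...

# On the next `julia --project=MyProject`, the package manager would
# notice the matching sysimage and start Julia with it automatically.
```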
I’m pretty positive that would go a long way to solve the main pain points I have right now with latency.
I am looking forward to further compiler improvements, but I am wondering if the issue is exacerbated by the setup that Plots.jl uses (backends loaded on demand). If so, I am not sure that solving this is a better use of time than, say, implementing a native Julia plotting library from scratch.
I’m all for making it easier to build and use system images, but many people incrementally develop and change package code, necessitating recompilation at some point. Better ways to cache native code are something we will look at, though.
An auto environment-to-sysimg converter is a great idea. Of course, the package manager is itself written in julia, so we’d either need to pass system images to julia manually, or else load a minimal system image, call the package manager to find the right real image, and then restart.
I think a combination of Revise and a compiled system image could be very powerful… You compile the diffs with Revise on startup, until that takes long enough to justify a system image rebuild
I do that as well, but, as @sdanisch pointed out, having a reasonable “base image” and using Revise is a procedure that works pretty well. If I’m developing something, and need to make lots of plots during the development, if I have a nice compiled image with all the useful plotting stuff in there, it makes it far less painful for me to develop… Even if I have to restart the REPL, it probably is going to take a lot less time for me to compile my package from the base image than if I had to compile it plus everything else I wanted to use. Indeed, Revise makes it so that I don’t have to leave the REPL very often (even now I do it more for paranoia than anything else).
I can’t speak for everyone, but the thing that presents a difficulty for me at the moment is lacking a reasonable base image with all the stuff that I’m using but not necessarily working on directly. Recompiling the packages I’m working on just isn’t all that bad.
Have you tried building a system image with `using` statements in `base/userimg.jl`?
Not recently; I’ve been using PackageCompiler. It mostly works nicely, and that’s kind of my point: we are almost there! I think it just needs to be a little easier to use (in particular, to manage system images somehow) and a little more stable, and it can become a regular part of everyone’s workflow.
I think a good PackageCompiler experience would also need community effort. Some complex packages need to test themselves with PackageCompiler. I recently made PyCall PackageCompiler-compatible (https://github.com/JuliaPy/PyCall.jl/pull/651), but it was not super easy since PyCall has a highly non-trivial `__init__`, etc. Packages with mutable global state are harder to make work with PackageCompiler. Packages that use `ccall` heavily also need testing with PackageCompiler (the bonus is that you may be able to catch `ccall` misuse: https://github.com/JuliaLang/julia/issues/31473). Also, I think such complex packages need CI to detect regressions caused by their own code, by PackageCompiler, or by Julia.
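For reference, the usual idiom for making such mutable state precompile-safe is to defer its initialization to `__init__`, which runs at load time in every new process rather than at precompile or sysimage-build time. This is a generic sketch of that idiom, not PyCall’s actual code:

```julia
module LibWrapper

# Sketch of the standard precompile-safe pattern (not PyCall's code).
# Runtime state (library handles, pointers, the current directory, ...)
# must not be computed while the module is being precompiled, or a stale
# value gets baked into the .ji file / sysimage. Instead, store it in a
# Ref and fill it in __init__.
const start_dir = Ref{String}("")

function __init__()
    start_dir[] = pwd()   # recomputed in each new Julia process
end

end # module
```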
From my point of view solving the latency problem for non-package developers first and worrying about package developers later would be an entirely reasonable strategy.
I consistently run into the biggest problems with this latency issue when I bring new users to julia. They come from R or Python, and they just don’t see why they should wait so long for their first plot. 95% of those users will never, ever develop a package. The ones that eventually will, will at that point have seen all the other benefits of julia and, I think, will be ok making a compromise on compile time. But I really strongly believe that the biggest problem right now is for “normal” users who just want to use julia to get some science done.
Here is another way to think about this: for package devs, how would the experience look on other platforms? As far as I can tell, even with the compile latencies, julia is by far one of the smoothest environments for developing packages. Heck, we are competing with other platforms where you have to run a C compiler if you want to produce a fast package. So from my point of view, some latency issues for package devs are really not the end of the world.
But for end users they are, because they can just do their plot and data analysis in R or Python, and it will most likely be faster for them.
And I think a clever combination of sysimages that are integrated into the package manager and operate on a per-environment level could essentially solve 90% of this for the “casual” user.
I think a reasonable approach could be to have in Project.toml (or the Manifest?) a list of packages that should be integrated into the system image. Then one could selectively add or remove packages from the list of packages to be compiled. Since the package manager knows when it makes an update, it could trigger a recompile so that nothing gets out of sync. When `dev`-ing a package, it would automatically be removed from the compile list.
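One way that list might be expressed, sketched as a hypothetical section of Project.toml (no such key exists in Pkg today; the table name and field are invented here):

```toml
# Project.toml (excerpt) -- the [sysimage] table is hypothetical,
# not an existing Pkg feature.
[sysimage]
# Packages to bake into this environment's custom system image;
# `pkg> dev SomePackage` would drop SomePackage from this list.
packages = ["Plots", "DataFrames", "CSV"]
```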
My only concern (but maybe that can be sorted out) is that currently not all packages are compilable. At least I always run into trouble when trying out PackageCompiler (e.g. on Gtk.jl).
Sure, that is all completely fine and true. I’m in no way arguing that users shouldn’t have bigger precompiled system images to reduce latency for their needs. I’m just responding to this post: Roadmap for a faster time-to-first-plot? which specifically called out different kinds of latency.
AFAICT, all you need to do is put `using Plots` (etc.) in `userimg.jl`, and perhaps add some tooling to make that even easier. That is of course a totally different kind of thing from making the compiler faster, and the two can be worked on in parallel.
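Concretely, a `base/userimg.jl` for a plotting-heavy setup might look like the sketch below (the package list is just an example; the file is only picked up when the system image is rebuilt, whether by rebuilding Julia or via PackageCompiler):

```julia
# base/userimg.jl -- example contents. Everything loaded here is baked
# into the system image when it is built, so these packages no longer
# pay `using` latency at runtime.
using Plots       # example packages; list whatever you use constantly
using DataFrames
using CSV
```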
When I was reading this post, this was not the conclusion I was expecting. Everything before this seemed to me to support tiered compilation. The thing is that even with fully pre-compiled packages and environments, it’s really hard (impossible, actually) to predict what a user is going to do, which means that you cannot realistically have it all compiled in advance. You could reasonably do that for an application that you can trace beforehand, but not for arbitrary user input. And especially a first-time user isn’t going to be in a position to run code and create a trace or even make an environment before wanting to plot something or load a data frame. As far as I can tell, the only strategy that will provide the experience you describe of interpreter-like latency for first-time users is using an interpreter and doing compilation in the background.
I might be wrong, of course. But my sense is that for packages like DataFrames.jl, various plotting packages, file-loading packages, etc., one can essentially (as the package author) specify which methods should be baked into a sysimage, and thereby cover a huge, huge number of use cases. Yes, it won’t work with arbitrary custom types, but heck, most users are loading CSV files with at most four different types. Same for plotting packages, etc. It won’t solve this problem completely, but I think it could go a very long way.