The outline from Stefan’s post linked above is still broadly accurate. Our priority at the moment is multithreading, and as soon as that mostly works we will return to focusing on latency. Looking at the timeline of all this, I can’t help but agree that progress has been slower than I expected.
Latency is a difficult, multi-faceted issue. You are quite right to specify which particular kinds of latency matter to you, because there are a few separate, mostly-unrelated sources of it: (1) package loading (consisting mostly of method table merging, and a bit of re-compilation), (2) the general speed of type inference, (3) type inference bugs or quasi-bugs that cause it to run for an exceptionally long time, (4) the front end (parsing and lowering; not the biggest issue right now), and (5) LLVM optimizations. Again, these are all mostly unrelated, and different packages or workflows can hit different ones.
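For a rough idea of how much compile-time latency a particular call carries, one option is to time it twice in a fresh session. This is only a sketch: `work` is an arbitrary stand-in for whatever call feels slow, and the measurement lumps inference and LLVM time together rather than separating the sources listed above.

```julia
# Arbitrary stand-in workload; substitute the call that feels slow to you.
work(n) = sum(sqrt(abs(x)) for x in randn(n))

# Time at top level, so compilation of `work` falls inside the timed region.
t_first  = @elapsed work(1000)   # mostly inference + LLVM, plus run time
t_second = @elapsed work(1000)   # already compiled: essentially run time only

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
println("approx. compile latency: ", t_first - t_second, " s")

# Package loading, source (1), is measured separately in a fresh session, e.g.
#   @time using SomePackage    # SomePackage is a placeholder name
```

If the first call is not much slower than the second, compiler latency is probably not what is hurting that workflow.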
While there have been a few modest commits to master that chip away at this, there is an iceberg underneath of things we have tried, experiments run, and of course more things we are planning to try. Some things we try don’t work, or have no effect, or have a much smaller effect than hoped. Some things give nice improvements, at the expense of e.g. worse type information. Because of this it is very hard to promise “X% improvement by Y date”. (Side note: in case anybody still doesn’t believe that return_type is a bad idea, using it means we can’t speed up the compiler without breaking your code.)
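To make the side note concrete, here is a small sketch of how code ends up depending on inference results. The function `pick` and the variable names are made up for illustration. The point is only that whatever `return_type` reports is an upper bound chosen by the compiler, and that bound may get wider or narrower as inference changes.

```julia
# A function whose result type depends on a run-time condition.
pick(x) = x > 0 ? 1 : 1.0

# Ask inference what `pick(::Int)` can return. The answer is whatever bound
# the current compiler derives; it is not a stable part of the language.
T = Core.Compiler.return_type(pick, Tuple{Int})
println("inferred: ", T)

# Fragile: the element type, which callers can observe, is now tied to
# how precise inference happens to be in this Julia version.
fragile = Vector{T}(undef, 0)

# More robust: let the element type follow the values actually produced.
robust = [pick(x) for x in (-1, 2)]
println("from values: ", eltype(robust))
```

If the compiler trades away some type precision for speed, `fragile` silently changes type while `robust` does not.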
In the hopefully near future we will be trying things like multi-threading code generation, tiered compilation (running code in an interpreter first and gradually transitioning to the compiler), various changes to type inference, etc.
In the meantime there are a couple tricks to try to work around latency issues:
- Try running with -O0 or -O1
- Try running with --compile=min
- Try applying this patch:
    diff --git a/base/compiler/params.jl b/base/compiler/params.jl
    index 8f87feb734..499b44a9f6 100644
    @@ -59,7 +59,7 @@ struct Params
                         #=inlining, ipo_constant_propagation, aggressive_constant_propagation, inline_cost_threshold, inline_nonleaf_penalty,=#
                         inlining, true, false, 100, 1000,
                         #=inline_tupleret_bonus, max_methods, union_splitting, apply_union_enum=#
    -                    400, 4, 4, 8,
    +                    400, 1, 4, 8,
which can significantly cut inference time. Going by the comment line above the changed values, the patch lowers max_methods from 4 to 1.
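The first two suggestions are easy to compare on your own workload. A minimal sketch, assuming the `code` string is replaced with something representative (for example a `using` line plus the first call you care about); the workload below is just a placeholder:

```julia
# Command for the currently running julia binary (with its default flags).
julia = Base.julia_cmd()

# Placeholder workload; replace with the code whose latency you care about.
code = "sum(sqrt(abs(x)) for x in randn(1000))"

# Flag sets to compare; the empty set is the default configuration.
flagsets = [String[], ["-O0"], ["-O1"], ["--compile=min"]]

results = Pair{String,Float64}[]
for flags in flagsets
    label = isempty(flags) ? "(default)" : join(flags, " ")
    # Each run starts a fresh process, so nothing is already compiled.
    t = @elapsed run(`$julia --startup-file=no $flags -e $code`)
    push!(results, label => t)
end

for (label, t) in results
    println(rpad(label, 16), round(t; digits = 2), " s")
end
```

Each timing includes process startup, so it is the differences between rows that matter. Keep in mind that lower optimization levels and --compile=min trade run-time speed for lower latency, so this mainly helps short or interactive workloads.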