Is there a way to completely skip internal type inference for functions?
If I have tons of high-level API functions in a package and only a couple of performance-critical functions, I would like the top part to behave like Python and the bottom part like C. In my view, every bit of effort the compiler spends on inferring anything about the top-level functions is wasted, because they are called so rarely. But that inference time is added to every CI run, every time you load the package, run a script with it, etc. That adds up.
I have heard about the following constructs, but it's hard to find information on how much they help with compile / inference time, because most people in the Julia community care about function speed from the second call onward, while I care mostly about one-shot speed (at least where that's appropriate):
- @nospecialize: this supposedly keeps a function from being specialized, and therefore compiled separately, for every combination of input argument types. But it never felt to me like it completely did away with compilation latency (see the sketch after this list).
- Base.Experimental.@optlevel, new in 1.5: I understand this lowers the optimization level so that compilation takes less time, but I want essentially no compile time at all.
- JuliaInterpreter: this sounds like it would come closest to what I mean by "Python at the top level", but I have rarely seen it used outside of the debugger, and it needs to be compiled itself, which offsets part of the advantage I'd gain from using it.
- Annotating return types: I think this currently doesn't help at all, but wouldn't it hypothetically be possible to skip internal inference entirely if the user constrains the return type and accepts that the function won't be inlinable?
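To make the list concrete, here is a minimal sketch of the first and last items; describe is a made-up stand-in for one of my high-level API functions:

```julia
# Hypothetical high-level API function, standing in for the "Python-like" part.
# @nospecialize asks the compiler not to specialize on the concrete type of
# `data`, so only one method instance is compiled. The ::String annotation
# constrains the return type, though today it doesn't let inference be skipped.
function describe(@nospecialize(data))::String
    return string(typeof(data), " with ", length(data), " elements")
end
```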
TLDR: How do I get the minimum time to using / first execution while gladly sacrificing as much runtime speed as needed?
This is too broad, as you note, because it will affect all code and not just what I deem performance-uncritical.
Somewhat related in 1.5 is the new per-module optimization level, Base.Experimental.@optlevel.
You probably overlooked that I mentioned @optlevel in the list of things I'm aware of. Still, as far as I understand, full type inference is still run with this macro, even if the code then goes through fewer LLVM optimization passes. Skipping inference altogether would be more interesting.
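For reference, my understanding of how it is used; a minimal sketch assuming Julia 1.5, with a made-up module and function:

```julia
module HighLevelStuff

# Compile everything in this module with minimal LLVM optimization.
# This only cheapens the LLVM passes; type inference still runs in full.
Base.Experimental.@optlevel 0

# Rarely called convenience function: inferred as usual, barely optimized.
average(data) = sum(data) / length(data)

end
```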
Have you tried SnoopCompile? It can help you set up scripts that save the results of inference by precompiling methods for specific input types. (It doesn't save LLVM time, unfortunately.) The effectiveness varies, but especially on 1.6-DEV it's getting increasingly useful. (There are Julia PRs queued that will make it even better in the future, and in general I expect steady progress throughout the Julia-1.6 development cycle.)
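The basic workflow looks something like this; a minimal sketch assuming SnoopCompile's @snoopi macro, with a hypothetical workload script:

```julia
using SnoopCompile

# Measure inference time while running a representative workload; record
# only methods that take at least 10 ms to infer.
inf_timing = @snoopi tmin=0.01 include("representative_workload.jl")

# Group the results by package and write out precompile(...) directives
# that each package can include to skip that inference work next time.
pc = SnoopCompile.parcel(inf_timing)
SnoopCompile.write("/tmp/precompile", pc)
```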
And yes, your idea to use JuliaInterpreter is good. You should be able to specify @interpret foo(...) after first push!ing any methods that require compilation onto JuliaInterpreter.compiled_methods. But as you say, you’ll pay the startup time for JuliaInterpreter. (It’s considerably faster on 1.6, but it still takes time.)
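Concretely, something like this, with hypothetical functions foo and hot_loop!:

```julia
using JuliaInterpreter

# Let the performance-critical method run compiled; `which` looks up the
# Method object for the given argument types.
push!(JuliaInterpreter.compiled_methods, which(hot_loop!, (Vector{Float64},)))

# Everything else in this call runs in the interpreter, skipping compilation.
@interpret foo(rand(1000))
```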
In the long run, I think a good vision is to move chunks of JuliaInterpreter into Base, or rather rewrite the interpreter that's already in Julia itself so that it can serve as the foundation of anything JuliaInterpreter needs to do. If you're interested in this question and want to help out, probably the most urgent piece of work is https://github.com/JuliaDebug/JuliaInterpreter.jl/pull/309, since there are several good reasons to try to increase its performance, and pioneering the new approach outside Base seems to make the most sense.
I have not really used SnoopCompile, but I have seen many mentions of it here and elsewhere. It seems very useful, but I dislike any solution that requires a lot of setup. This is definitely a side effect of constantly recompiling Makie.jl and friends while developing for that ecosystem, because that is one scenario in which the amortization of compile latency through runtime savings doesn't work. Another one, as mentioned, is CI and script-like use. Makie's CI takes around 18 minutes on GitLab, which is quite horrible.
> a good vision is to move chunks of JuliaInterpreter into Base, or rather rewrite the interpreter that's already in Julia itself so that it can serve as the foundation of anything JuliaInterpreter needs to do
I think this is where the real solution lies. The interpreter would probably have to be baked into the default Julia sysimage so that it's not a question of preparing a specific setup. (As a side note, I think PackageCompiler is the epitome of a solution with too much up-front effort: so much work to reduce start times by a couple of seconds, and impossible to maintain in a fast-changing environment.) Then developers could mark whole sections of their code with something like an @interpret_this macro. I will look into the issue you linked!