Bad performance for dispatch-heavy code

I’m trying to write a compiler in Julia, and I’m getting very bad performance. Because it’s a multi-stage compiler, most of the code is the same function implemented for many different AST node types.

I made sure most of the structs and methods deal with concrete types, i.e. not abstract, but that doesn’t seem to help.

I ran the profiler, and it looks like the vast majority of the time is spent inside Julia’s type inference. Here’s the profiler output:

julia> Profile.print(format=:flat, sortedby=:count)
... (removed the top, because it's a huge list) ...
  1761 C:\code\preql\src\main.jl                                                                                                    461 eval_ast(::main.State, ::ast.Projection)
  1844 .\compiler\optimize.jl                                                                                                       169 optimize(::Core.Compiler.OptimizationState, ::Any)
  1899 .\compiler\typeinfer.jl                                                                                                       33 typeinf(::Core.Compiler.InferenceState)
  3039 .\compiler\typeinfer.jl                                                                                                      568 typeinf_ext(::Core.MethodInstance, ::Core.Compiler.Params)
  3050 .\compiler\typeinfer.jl                                                                                                      599 typeinf_ext(::Core.MethodInstance, ::UInt64)
  3316 C:\code\preql\src\main.jl                                                                                                    516 eval_pql(::main.State, ::String)
  3729 .\compiler\abstractinterpretation.jl                                                                                        1160 typeinf_local(::Core.Compiler.InferenceState)
  6096 .\Base.jl                                                                                                                     31 include(::Module, ::String)
  6096 .\boot.jl                                                                                                                    328 include
  6096 .\client.jl                                                                                                                  295 exec_options(::Base.JLOptions)
  6096 .\loading.jl                                                                                                                1094 include_relative(::Module, ::String)
  6097 .\client.jl                                                                                                                  464 _start()
 11926 .\compiler\abstractinterpretation.jl                                                                                        1174 typeinf_local(::Core.Compiler.InferenceState)
 13326 .\compiler\typeinfer.jl                                                                                                      482 typeinf_edge(::Method, ::Any, ::Core.SimpleVector, ::Core.Compi...
 13484 .\compiler\abstractinterpretation.jl                                                                                         376 abstract_call_method(::Method, ::Any, ::Core.SimpleVector, ::Bo...
 13506 .\compiler\abstractinterpretation.jl                                                                                          93 abstract_call_gf_by_type(::Any, ::Array{Any,1}, ::Any, ::Core.C...
 14882 .\compiler\abstractinterpretation.jl                                                                                         818 abstract_call(::Any, ::Array{Any,1}, ::Array{Any,1}, ::Array{An...
 15089 .\compiler\abstractinterpretation.jl                                                                                         847 abstract_eval_call(::Array{Any,1}, ::Array{Any,1}, ::Array{Any,...
 15090 .\compiler\abstractinterpretation.jl                                                                                         608 abstract_call(::Any, ::Array{Any,1}, ::Array{Any,1}, ::Array{An...
 15625 .\compiler\abstractinterpretation.jl                                                                                         917 abstract_eval(::Any, ::Array{Any,1}, ::Core.Compiler.InferenceS...
 15694 .\compiler\abstractinterpretation.jl                                                                                        1230 typeinf_nocycle(::Core.Compiler.InferenceState)
 15712 .\compiler\typeinfer.jl                                                                                                       12 typeinf(::Core.Compiler.InferenceState)

Why is so much of the time spent in type inference? What approach can I take to fix it?

Again, I’m mostly using concrete (non-abstract) types, and I’m not even using parametric types or any kind of fancy type-level programming. Most of what I’m doing could, I believe, be easily replicated in a standard OO language.

Is it something I’m doing, or is this a known issue in Julia?



You can start julia with --trace-compile=stderr to see some of the methods that are getting compiled. Maybe that can give you an idea of what is happening.
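As a concrete sketch (the script name here is a placeholder), redirecting the trace to a file makes the list easier to inspect after the run:

```shell
# Run the script and capture every method-compilation event to a log file.
# Each line of compiled.log is a precompile(...) signature that you could
# later paste into your package to force ahead-of-time compilation.
julia --trace-compile=stderr my_script.jl 2> compiled.log
```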

It would be easier to say something if there were some runnable code. Otherwise it is mostly guesswork.


Once your compiler is compiled, it shouldn’t need a lot of inference. Are you getting this even on the 2nd or later invocation in the same session?


Well, it’s definitely mostly about compilation. I tried running the same code twice in the same session, and the first run took 15 seconds, and the second run took only 1 second.

I ran it with --trace-compile=stderr, and got a very long and confusing list of functions. Is there any way to know how long each one took to compile? And which line triggered its compilation? What can I do to speed up the initial compilation?

@snoopi from SnoopCompile will tell you the top-level calls into inference (which is the main cost of compilation). You can try explicitly precompiling specific calls; see, e.g., the linked example, which gets called during module definition.
I should say that precompilation doesn’t always “work” because of current limitations in the serialization format of *.ji files, but that is expected to change someday.
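A minimal sketch of the @snoopi workflow, assuming SnoopCompile is installed; `init_state` and `eval_pql` stand in for whatever your actual entry point is:

```julia
using SnoopCompile

# Measure inference time while running a representative workload.
# `tmin = 0.01` filters out calls that cost less than 10 ms of inference.
inf_timing = @snoopi tmin=0.01 begin
    state = init_state()
    eval_pql(state, "some representative query")
end

# inf_timing is a Vector of (inference_time, MethodInstance) pairs,
# sorted so the most expensive inference targets come last.
```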

There’s also PackageCompiler, but that may not be so helpful for a package that you’re working on extensively (you have to build Julia itself along with your package, and that is slow).

Reducing compile time is at the top of the priority list now (Compiler work priorities).


Thanks, I’ll look into SnoopCompile.

I’m not sure I see the point of precompilation… unless there is a way to cache it? (i.e. store and load it from a file).

Reducing compile time is at the top of the priority list now

That’s good news! Is there any timeline for it? Anything that’s planned for the near future?

I think that for my workflow, caching compiled methods would be a huge productivity boost. Doing so automatically, even just keyed by the module hash (so nothing too fancy), would solve most of my performance issues.

I’m not sure I see the point of precompilation… unless there is a way to cache it? (i.e. store and load it from a file).

Yes. When you make changes to your compiler package and start a new Julia session and type using MyAwesomeCompiler, do you see a message like Precompiling MyAwesomeCompiler? That’s writing a *.ji file in your ~/.julia/compiled directory.

I see. So please excuse my newbie question, but why does calling

$ julia MyMediocreCompiler.jl

and then immediately

$ julia MyMediocreCompiler.jl

have the same performance? Can’t it load the *.ji file and avoid most of the compilation?

It’s really hard to guess exactly what you’re doing. If you’re invoking it from the Linux prompt as a single file, I’m guessing you’re defining all the code and then the test/demo/whatever case in the same file. In that case no *.ji file is created. (Do you see one in the ~/.julia/compiled directory?)

If you’re doing something as sophisticated as writing a compiler, you should do something as trivial as designing your compiler as a stand-alone “package,” even if the only person who will ever use it is you. Use PkgTemplates to create a skeleton. Once it’s a package, you load the code with using MyMediocreCompiler. That part can get precompiled. Then include("test/compiler_demos.jl") is not precompiled, but it will be faster the second time around.
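The PkgTemplates step might look like this (a sketch assuming a recent PkgTemplates; the user name is a placeholder):

```julia
using PkgTemplates

# Create a package skeleton: Project.toml, src/, test/, etc.
t = Template(; user="your-github-name")  # placeholder GitHub user name
t("MyMediocreCompiler")                  # older PkgTemplates versions use
                                         # generate(t, "MyMediocreCompiler")
```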

You probably also want to switch to a workflow where you leave the Julia session open and use Revise, so you can make changes and run your test script again.
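The Revise-based loop might look like this (a sketch, assuming Revise is installed and the package is on your load path):

```julia
using Revise                       # must be loaded before the package it tracks
using MyMediocreCompiler           # Revise now watches the package's source files

include("test/compiler_demos.jl")  # run the demos...
# ...edit src/*.jl, then simply re-run the include: method redefinitions are
# picked up automatically, without restarting Julia or recompiling everything.
```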


Yes, sorry, I didn’t give enough information.

I’m running julia my_test.jl

Which in it has

using main: init_state, clone_state, parse_file, exec_pql, eval_pql, exec, dump_table

(no include)

I can find main.ji in the precompiled library, right now it says it was last written an hour ago.

The test package is very light on logic, so it seems like Julia is ignoring the precompiled code?

I tried using Revise, but it doesn’t suit my purposes. I need to make changes to structs sometimes!

I would be fine if there was a way to fully recompile an entire module and all its dependencies within a session, but I didn’t find a way to do it, other than manually re-include-ing everything in the right order.

Then you can cache inference results in the file. Caching happens when the package is built, i.e., at the final end that closes the module. Only code that has already been called (in a clean Julia session) makes it into the *.ji file, which is typically almost nothing. If I’ve defined

foo(x) = 2x

in my package, how is the compiler to know you’re planning on calling it for x::Float16? That’s where precompile comes in: it forces compilation before close-of-module-definition.
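For the foo example, forcing compilation at module-definition time is just a matter of adding precompile calls before the closing end (a minimal sketch; the module name is a placeholder):

```julia
module MyMediocreCompiler

foo(x) = 2x

# Force inference now (and caching into the *.ji file when the package is
# precompiled) for the argument types we expect callers to actually use.
precompile(foo, (Float16,))
precompile(foo, (Int,))

end # module
```

precompile returns true if the signature could be compiled, so a failing directive is easy to spot.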

I tried using Revise, but it doesn’t suit my purposes. I need to make changes to structs sometimes!

True. But sometimes you don’t, right?


I think I understand. So Julia is just missing the functionality to append to these *.ji files after they’ve been closed (at run time)?

That’s interesting. It feels like an easy problem to fix, and a fix that would really benefit everyone, not just newcomers like me.

Do you have any idea if this is something they’re aiming for? And if not, why?

True. But sometimes you don’t, right?

I’m wary of partial reloads, but maybe I’ll give it a try.

Thank you, btw, for taking the time to reply, and being so helpful and prompt. I appreciate it!


Totally agreed about the huge potential benefits. As for the way forward: why don’t you check out the diff and then use it to start getting to know src/dump.c? I could use some help!