What is included in the output of `--trace-compile`?

I’m playing around a bit with the new --trace-compile-timing flag in Julia 1.12, and while I am excited about this feature, I have a couple of questions for which I can’t seem to find an easy answer.

The setup is this:
I opened the folder of one of my packages (MPSKit.jl in this case), which has been precompiled, and ran a single function, just to see what the output was like. So basically:

julia --startup-file=no --project --trace-compile=output.log --trace-compile-timing -e 'using MPSKit; InfiniteMPS(2, 12)'

This is a function in which a lot is going on, and since there is currently no precompilation setup for MPSKit, nor for its largest dependency (TensorKit.jl), I expect this to take some time and want to know where to focus my efforts to reduce it.

However, when I inspected the file, it seems like a bunch of functions that I know are in the call stacks are not showing up. In particular, there don’t seem to be many functions from TensorKit or TensorOperations included, so I am wondering what actually ends up in there and what doesn’t. Does this have something to do with inlining? Are functions that don’t take up enough time left off the list? Is there any specific documentation about this kind of behavior that I could read?

Additionally, I was wondering how to read the timings. Are these numbers the time it takes to compile that function and all of its callees (excluding what has already been precompiled), or just the time for the function itself? If the former, is there a way to obtain a more “tree-like” output, or should I look into SnoopCompile.jl and friends?

Thanks in advance for any pointers and help!

I think it makes sense that inlined functions wouldn’t show up. In the extreme, a function that is inlined may be completely eliminated in subsequent compiler transformations, so you don’t get the function itself compiled as a stand-alone thing; when you later call that function directly, compilation has to start from scratch. The same holds in reverse: even if you have already compiled a function once, when it shows up in some other code and is inlined and transformed there, the original compiled code cannot, at least in general, be reused.
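
One way to sanity-check whether a small callee was inlined is to look at the optimized IR of the caller. Here is a sketch with toy functions (similar to the foo/bar examples below; the exact printing depends on the Julia version):

$ julia -e 'using InteractiveUtils; foo(x) = x + 1; bar(x) = foo(x) * 2; display(@code_typed bar(2))'

The printed IR for bar should contain only the add_int/mul_int intrinsics and no call to foo, which is what inlining looks like in this case.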

Those are just educated guesses though. I’m also curious to hear a more authoritative answer.

--trace-compile-timing works the same way as --trace-compile: you get one line of output for each time compilation was triggered, and no output for compilation that happens as a direct result of that. The time includes the time for all the type inference and compilation that was necessary before the method was ready to call.

You can probably predict what will happen here:

$ julia --trace-compile=stderr --trace-compile-timing -e 'foo(x) = x+1; bar(x) = foo(x)*2; foo(1); bar(2)'
#=    1.6 ms =# precompile(Tuple{typeof(Main.foo), Int64})
#=    1.2 ms =# precompile(Tuple{typeof(Main.bar), Int64})

In the next example, foo gets inlined into bar, so even though bar calls foo, the standalone version of foo doesn’t get compiled until it is called explicitly afterwards:

$ julia --trace-compile=stderr --trace-compile-timing -e 'foo(x) = x+1; bar(x) = foo(x)*2; bar(2); foo(1)'
#=    1.7 ms =# precompile(Tuple{typeof(Main.bar), Int64})
#=    1.4 ms =# precompile(Tuple{typeof(Main.foo), Int64})

If I mark foo as @noinline, compiling bar also causes the standalone version of foo to get compiled, so when I call foo later it doesn’t produce a line of output:

$ julia --trace-compile=stderr --trace-compile-timing -e '@noinline foo(x) = x+1; bar(x) = foo(x)*2; bar(2); foo(1)'
#=    2.5 ms =# precompile(Tuple{typeof(Main.bar), Int64})

If you want to see the time spent in a tree, SnoopCompile.jl is what you want. If you want a really detailed look at exactly what Julia was doing during compilation, you can build Julia with profiling support.
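
For the tree-like view, here is a minimal SnoopCompile sketch (assuming a reasonably recent SnoopCompile that provides @snoop_inference; this captures type-inference time only, not LLVM codegen, and API details may differ between versions):

using SnoopCompileCore
using MPSKit
# collect per-frame inference timings for the first call in this session
tinf = @snoop_inference InfiniteMPS(2, 12)
# load the analysis tools only after the data has been collected
using SnoopCompile
# `tinf` is a tree of caller/callee inference timings;
# `flatten(tinf)` turns it into a flat list you can sort and inspect,
# and `flamegraph(tinf)` gives something viewable with ProfileView.jl
flatten(tinf)

Run this in a fresh session, so that the compilation you care about actually happens while the snoop is active.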
