Start-up performance, types and compiler's type inference

I’ve been trying to reduce the start-up time of a small OpenGL program (ModernGL.jl and GLFW).

I clearly saw that the compilation part was introducing significant overhead, much more than the actual computations. In fact, I profiled the application and saw that type inference was the main hotspot.

I read https://docs.julialang.org/en/v1/manual/modules/#Summary-of-module-usage-1 and I reduced the time from 8 seconds to 5.5. After that, I reduced it even more by using julia -O0 to reach just 4 seconds.

However, I’d love to reduce this even further.

I profiled the application again and I still see some references to typeinfer.jl. I guess that annotating more function parameters with types could reduce the time spent on inference, but I’m not sure. I’m also quite lost about which patterns I should follow and which anti-patterns I should avoid to reduce this.

Note: I’m using Julia 1.0.0

Thank you!

Hard to say more without seeing the code or a minimal example. That said, I am under the impression that finalizing Julia 1.0 semantics was a priority until its release, and non-breaking compiler improvements are work in progress that will follow continuously, so maybe just waiting for them is a reasonable strategy.

No, I don’t think it will.
Julia can’t use the type constraints on called functions
to help the type inference system (except maybe as a heuristic).
This is in contrast to, say, F# (and probably many other languages, but F# is one I know can do this).
I don’t think Julia can do it,
because calling a function that doesn’t have a matching method isn’t illegal Julia code;
it is just code that will throw a MethodError.
You can write code that depends on MethodErrors being thrown
and caught
(and indeed, for some really dynamic code, I have done that, though I wouldn’t recommend it).
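
To make the point about MethodError concrete, here is a minimal sketch. The names only_int and call_or_fallback are made up for illustration; the only thing taken from Julia itself is that a call with no matching method throws a MethodError at run time, which can be caught like any other exception.

```julia
# Hypothetical function with a single method that only accepts Int.
only_int(x::Int) = 3

# Calling only_int with a String is still legal code: it fails at run time
# with a MethodError instead of being rejected up front.
function call_or_fallback(x)
    try
        return only_int(x)
    catch err
        # Only swallow MethodError; rethrow anything else.
        err isa MethodError || rethrow()
        return :no_method
    end
end

call_or_fallback(1)      # returns 3
call_or_fallback("one")  # returns :no_method
```

Because a program like this is valid, the compiler cannot treat the x::Int annotation on the callee as a promise about what callers pass in.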

Anyway, the following shows that it doesn’t change things much.
Notice the same allocations for the first call of f1 and f2:

julia> f1(x::Int) = 3
f1 (generic function with 1 method)

julia> @time f1(1)
  0.000158 seconds (381 allocations: 25.891 KiB)
3

julia> @time f1(1)
  0.000003 seconds (4 allocations: 160 bytes)
3

julia> f2(x) = 3
f2 (generic function with 1 method)

julia> @time f2(1)
  0.000193 seconds (381 allocations: 25.891 KiB)
3

Some rigor

We can collect data on this using:

julia> function makeandtime_free()
       fname = Base.gensym()
       fval = rand()
       @eval $fname(x) = $fval
       @eval(@elapsed $fname(1))
       end
makeandtime_free (generic function with 1 method)

julia> function makeandtime_constrained()
       fname = Base.gensym()
       fval = rand()
       @eval $fname(x::Int) = $fval
       @eval(@elapsed $fname(1))
       end
makeandtime_constrained (generic function with 1 method)

Call each of those in a map and you have a bunch of data.

Now hypothesis testing can tell us whether there is a statistically significant difference between the two.
I am not a statistician,

but I think we want the Unequal Variance t-Test:

Perform an unequal variance two-sample t-test of the null hypothesis that x and y come from distributions with equal means against the alternative hypothesis that the distributions have different means.
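
For anyone wondering what that test actually computes, here is a rough sketch of the statistic by hand (the unequal variance t-test is also known as Welch's t-test). The function name welch_t and the toy data are my own, not part of HypothesisTests; it only uses mean and var from the Statistics standard library and does not compute the p-value.

```julia
using Statistics  # standard library: mean, var

# Welch's t-statistic with Welch-Satterthwaite degrees of freedom.
# welch_t is a hypothetical helper, not a HypothesisTests function.
function welch_t(x, y)
    nx, ny = length(x), length(y)
    a = var(x) / nx          # squared standard error of mean(x)
    b = var(y) / ny          # squared standard error of mean(y)
    se = sqrt(a + b)         # standard error of the mean difference
    t = (mean(x) - mean(y)) / se
    df = (a + b)^2 / (a^2 / (nx - 1) + b^2 / (ny - 1))
    return (t = t, df = df, se = se)
end

# Toy data, just to exercise the formula.
r = welch_t([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

The t-statistic is simply the point estimate (difference of means) divided by the empirical standard error, so those three numbers in a test report should be consistent with each other.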


julia> using HypothesisTests

julia> constrained_times = map(x->makeandtime_constrained(), 1:1000);

julia> free_times = map(x->makeandtime_free(), 1:1000);

julia> UnequalVarianceTTest(constrained_times, free_times)
Two sample t-test (unequal variance)
------------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -3.947500500000003e-5
    95% confidence interval: (-0.0001, 0.0)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.1626

Details:
    number of observations:   [1000,1000]
    t-statistic:              -1.3969401508247017
    degrees of freedom:       1928.9743996051652
    empirical standard error: 2.8258193435628185e-5

So that says that we cannot reject the null hypothesis.
This does not prove the two are identical, but the data are consistent with the constrained and the unconstrained function having, on average, the same first-run time (and thus the same inference time).
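
Another way to look at it, as a sketch rather than a benchmark: ask the compiler what it infers for a constrained and an unconstrained definition when both are called with an Int. The function names below are made up, and Base.return_types is an unexported helper, so treat it as a debugging aid and not a stable API.

```julia
# Hypothetical pair of functions that differ only in the annotation.
inc_constrained(x::Int) = x + 1
inc_free(x) = x + 1

# Inference runs on the concrete argument types of the call,
# so both queries report Int as the inferred return type.
constrained_types = Base.return_types(inc_constrained, (Int,))
free_types = Base.return_types(inc_free, (Int,))
```

In both cases Julia specializes on the argument type actually passed, so the annotation affects which method is selected but gives inference no less work to do.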


Thank you so much for such a detailed response!

I see that setting the types may not help at all. But then, what can I do to mitigate compiling time?

I’m writing an end-user application, and even though some latency would be ok, it wouldn’t be ok to wait one or two seconds each time the user clicks a button for the first time, or having to wait for 30 seconds to start it.

I already sense that debugging is slower due to the simple fact that compilation takes time.

Of course, I understand Julia’s trade-offs, and, in particular, the ability to allow scripting while keeping high performance is the main reason I’m trying to use Julia instead of a more mature language.

However, in most languages there are similar problems and there is always some kind of guidance or mitigation. For example, in C++ a proper build system won’t need to re-compile everything every time you make a change.

Thank you again!

For debugging module code, Revise.jl is great. It allows you to make changes without recompiling everything (to some extent a drop-in replacement for a “proper build system”). Regarding interactivity problems, you may try the latest ahead-of-time compilation effort for Julia: PackageCompiler.jl. I haven’t used it myself and it seems to still have rough edges, but it has seen successful use, for instance to remove compilation time from Makie.jl.


Cool! Thanks! I’ll take a look at them!