I have some code for a research project of mine that takes quite a long time to compile. Here is a recent example, timing the same call twice in a fresh session:
@time generateMnli(Pk, bd22, 0.0)
@time generateMnli(Pk, bd22, 0.0)
434.282438 seconds (502.49 M allocations: 23.516 GiB, 3.86% gc time)
0.393256 seconds (990.98 k allocations: 57.412 MiB, 4.19% gc time)
The code is quite lengthy, so a longish compilation time is not unreasonable. However, the current situation makes it very difficult for me to recommend the code to other people. I would like to either significantly reduce the compilation time or find a way to pay this compile cost only once (for example, when the package is first installed).
The first bit of the code looks like this:
function generateMnli(Pk::Function, BiasDict22::Dict{String,F}, f::Real) where {F<:Real}
M = OffsetArray{Array{Float64}}(undef, 0:4, 0:8, 0:8)
b1 = BiasDict22["b1"]
bη = BiasDict22["bη"]
b2 = BiasDict22["b2"]
bK² = BiasDict22["bK2"]
bδη = BiasDict22["bδη"]
bη² = BiasDict22["bη2"]
bKKpara = BiasDict22["bKK∥"]
bΠ2para = BiasDict22["bΠ2∥"]
r, xi20 = ξ(Pk,2,0)
_, xi00 = ξ(Pk,0,0)
_, xi1m1 = ξ(Pk,1,-1)
where ξ is a wrapper around a function from another package that returns two arrays of Float64 (so each variable like xi00 is an array of length 1024 or so). This wrapper function is used in quite a few other places, and those don't run into compile problems. This function takes so long to compile because of statements like the following:
M[0,0,2] = @. (32*(3*f*bη - 5*bΠ2para)*(147*b1*f + 182*bKKpara + 147*f*bδη + 6*f^2*bη + 6*f^2*bη² +
84*bΠ2para)*xi20^2)/972405. - (32*(3*f*bη - 5*bΠ2para)*(7*bKKpara + 4*f^2*bη + 4*f^2*bη² +
7*bΠ2para)*xi40^2)/108045. + xi20*((32*(7*bKKpara + 3*(14*b1*f + 14*f*bδη - 8*f^2*bη - 8*f^2*bη² -
7*bΠ2para))*(3*f*bη - 5*bΠ2para)*xi00)/99225. + (32*(28*bKKpara - 3*(49*b1*f + 49*f*bδη - 38*f^2*bη -
38*f^2*bη² - 42*bΠ2para))*(3*f*bη - 5*bΠ2para)*xi40)/540225.)
The above is one of the shorter ones, and there are around 400 lines of such statements in the function. Of course this is probably beyond any reasonable use case, so it's understandable that the compiler is having issues, but I'm unsure how to even find out what I could change to help it compile faster. There are various things I could try: take the dictionary values as arguments instead of the dictionary, calculate the ξ functions outside the function and take them as inputs, generate the Pk function inside this function to avoid having a function as an input, or split each term up into its own function. However, I'm unsure which of these have even a chance of working.
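To make the last two ideas concrete, here is a minimal sketch of what pulling the values out of the dictionary once and splitting terms into small functions might look like. The names term_a, term_b and generate_terms are hypothetical, and the expressions are shortened placeholders rather than the real terms. The scalar prefactors are computed outside the broadcast so that each @. expression stays small; whether this actually helps compile time here is an assumption I have not verified.

```julia
# Hypothetical sketch: function names and expressions are placeholders,
# not the real terms from generateMnli.

# Each term lives in its own small function, so the compiler sees several
# short broadcast expressions instead of one enormous method body.
function term_a(f, bη, bΠ2para, xi00, xi20)
    c = 32 * (3 * f * bη - 5 * bΠ2para) / 99225    # scalar prefactor, computed once
    return @. c * xi20 * xi00
end

function term_b(f, bη, bΠ2para, xi20)
    c = 32 * (3 * f * bη - 5 * bΠ2para) / 972405   # scalar prefactor, computed once
    return @. c * xi20^2
end

# Driver: read the Dict once, then pass concrete values to the small kernels.
function generate_terms(bias::Dict{String,Float64}, f::Float64,
                        xi00::Vector{Float64}, xi20::Vector{Float64})
    bη = bias["bη"]
    bΠ2para = bias["bΠ2∥"]
    return term_a(f, bη, bΠ2para, xi00, xi20) .+ term_b(f, bη, bΠ2para, xi20)
end

# Toy inputs standing in for the real ξ arrays.
bias = Dict("bη" => 0.5, "bΠ2∥" => 0.1)
xi00 = fill(1.0, 4)
xi20 = fill(2.0, 4)
result = generate_terms(bias, 0.0, xi00, xi20)
```

The driver keeps the same kind of interface as the original (a dictionary plus f), so callers would not need to change if only the internals were split up this way.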
Below is the result of @code_warntype generateMnli(Pk, bd22, 0.0) on a version reduced to essentially what is shown in this post.
Another thing that would resolve my issues is being able to pay the compilation cost once and only once. I have tried adding various precompile calls, but none of them seem to work. I have briefly looked at PackageCompiler.jl, and it seems like it would resolve the issue locally, but ideally other people would be able to just download my package without needing to set up a system image just for this code. Is something like precompile usable for this case, or should I just focus my time on reducing compilation time?
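As far as I understand it, a precompile directive for a function that takes another function as input would look roughly like the sketch below. Everything in it (the module MnliSketch, heavy, default_Pk) is a hypothetical stand-in, not my actual package. The point it illustrates is that the signature passed to precompile has to use the concrete type of the function argument (typeof(default_Pk)), because each function has its own type and the abstract type Function does not describe the method instance that a real call compiles.

```julia
module MnliSketch   # hypothetical package module, for illustration only

# Stand-in for the expensive function; the real one would be generateMnli.
function heavy(Pk::Function, bias::Dict{String,Float64}, f::Float64)
    return Pk(f) + bias["b1"]
end

# A named function shipped with the package has a concrete type that can be
# written into the precompile signature.
default_Pk(k) = k^2

# precompile returns a Bool saying whether compilation for this exact
# signature succeeded. In a real package this line sits at module top level
# so that it runs while the package is being precompiled.
const PRECOMPILED = precompile(heavy, (typeof(default_Pk), Dict{String,Float64}, Float64))

end # module

value = MnliSketch.heavy(MnliSketch.default_Pk, Dict("b1" => 1.0), 2.0)
```

One caveat with this approach: a Pk that users define themselves (for example an anonymous function or an interpolation object) has a different type, so a directive written for default_Pk would not cover it. Also, as I understand it, Julia versions before 1.9 cached only the inferred code and not the native code, so part of the cost would still be paid on first call there.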
Any help would be greatly appreciated, even just ways to check the compilation process and find which lines are causing the issue (I think code_warntype might do this but I have no idea how to read the output).
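One crude way to see which statements are expensive would be to move a single statement into its own function and compare the first call against the second, as in the sketch below. Here toy is a hypothetical placeholder for one of the long @. expressions; the difference between the two timings is only a rough estimate of the compile cost, since it also includes noise from garbage collection and the like.

```julia
# Hypothetical stand-in for one of the long broadcast statements.
toy(x) = @. 3 * x^2 + 2 * x + 1

x = rand(1024)

t_first = @elapsed toy(x)     # first call: compilation plus execution
t_second = @elapsed toy(x)    # second call: execution only

# Rough estimate of the compile cost of this one statement, in seconds.
compile_estimate = t_first - t_second
println("first: ", t_first, " s, second: ", t_second, " s, estimate: ", compile_estimate, " s")
```

Repeating this for a handful of representative statements (in a fresh session each time, so nothing is already compiled) would show whether the cost is spread evenly or dominated by a few of the largest expressions.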