Thanks. After this session is over (I'll be traveling after lunch) I will open an issue trying to explain all that I've tried and the results. You should now be able to automatically install GMT (unless you are also bitten by the libnetcdf/libcurl fight).
I just share this script with people; IMO it should be in the docs in some form.
using SnoopCompile

invalidations = @snoopr begin
    # Your code goes here!
    using OrdinaryDiffEq

    function lorenz(du, u, p, t)
        du[1] = 10.0(u[2] - u[1])
        du[2] = u[1] * (28.0 - u[3]) - u[2]
        du[3] = u[1] * u[2] - (8 / 3) * u[3]
    end

    u0 = [1.0; 0.0; 0.0]
    tspan = (0.0, 100.0)
    prob = ODEProblem{true,false}(lorenz, u0, tspan)
    alg = Rodas5()
    tinf = solve(prob, alg)
end;
trees = SnoopCompile.invalidation_trees(invalidations);

@show length(SnoopCompile.uinvalidated(invalidations)) # show total invalidations

show(trees[end]) # show the most invalidated method

# Count number of children (number of invalidations per invalidated method)
n_invalidations = map(trees) do methinvs
    SnoopCompile.countchildren(methinvs)
end

import Plots
Plots.plot(
    1:length(trees),
    n_invalidations;
    markershape=:circle,
    xlabel="i-th method invalidation",
    label="Number of children per method invalidations",
)
Just replace the OrdinaryDiffEq code with the code you want analyzed, and this will spit out an informative plot of the invalidation distribution, plus what the biggest invalidating method is.
The script errors when there are no invalidations
julia> include("using_SC.jl")
length(SnoopCompile.uinvalidated(invalidations)) = 0
ERROR: LoadError: BoundsError: attempt to access 0-element Vector{SnoopCompile.MethodInvalidations} at index [0]
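A quick way to make the script robust to that case is to guard the indexing before inspecting and plotting - a minimal sketch:

# Skip the tree inspection when nothing was invalidated
if isempty(trees)
    println("No invalidations recorded - nothing to inspect.")
else
    show(trees[end])   # the most invalidated method
end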
Use Julia v1.9.0-beta2 is fast - #17 by tim.holy instead. It shouldn't error, and it will also make your TODO list much, much smaller.
Getting an error when trying this.
Using an environment with julia-1.9.0-beta2 and only the three relevant packages installed.
julia> using SnoopCompileCore
julia> invs = @snoopr using ControlSystems;
julia> using SnoopCompile
julia> trees = invalidation_trees(invs)
ERROR: KeyError: key MethodInstance for LoopVectorization.avx_config_val(::Val{(false, 0, 0, 0, false, 0x0000000000000001, 1, true)}, ::Static.StaticInt{4}) not found
Stacktrace:
[1] getindex(h::Dict{Union{Int32, Core.MethodInstance}, Union{Tuple{Any, Vector{Any}}, SnoopCompile.InstanceNode}}, key::Core.MethodInstance)
@ Base ./dict.jl:484
[2] invalidation_trees(list::Vector{Any}; exclude_corecompiler::Bool)
@ SnoopCompile ~/.julia/packages/SnoopCompile/Q8zUg/src/invalidations.jl:403
[3] invalidation_trees(list::Vector{Any})
@ SnoopCompile ~/.julia/packages/SnoopCompile/Q8zUg/src/invalidations.jl:294
[4] top-level scope
@ REPL[4]:1
(testinv) pkg> st
Status `/tmp/testinv/Project.toml`
[a6e380b2] ControlSystems v1.5.3
[aa65fe97] SnoopCompile v2.9.5
[e2b509da] SnoopCompileCore v2.9.0
Sorry about that. I don't yet understand the error, but in the meantime there's a real fix available if you use SnoopCompile 2.9.6 or higher (originally a hack-workaround branch: pkg> add SnoopCompile#teh/circumvent_err). That should at least allow you to keep moving forward, with the understanding that there are no guarantees that the output is comprehensive (and if you see stuff that makes no sense, it would be worth suspecting a bug).
I'll try to develop a clearer understanding in the next few days, but I wanted you to have something to play with.
The error has been fixed (correctly). You can get the fix with SnoopCompile 2.9.6 or higher. In case anybody is using the teh/circumvent_err branch, I'll try to remember to leave it lying around for a week or so, but it's better to upgrade to the latest release.
Thought I'd leave this here as well: I wanted to get a feel for how the user experience of the "original" TTFX - i.e., time-to-first-plot using Plots.jl - has evolved over the years, so I wrote a little benchmark script that times starting Julia and running using Plots; display(scatter(rand(1_000))) for all Julia versions since 0.7.
Here's what the mean and minimum times look like across versions:
Initially I had included 0.6.4 as well, but with such an old Julia version one has to also fall back onto a very old Plots version, which errors out on the display call. An alternative script with savefig instead of display suggests that 0.6 had similar TTFP to 0.7-1.2.
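The savefig variant of the benchmark body would look something like this (the file name here is illustrative):

# Write the plot to a file instead of displaying it, which errors on very old Plots versions
using Plots
p = scatter(rand(1_000))
savefig(p, "ttfp_test.png")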
I think the plot makes clear the scale of the achievement here - while I'm sure there's more to come as the ecosystem starts to make the best use of Julia's new abilities, it seems clear that this is the first time we've seen a dramatic qualitative shift in the sort of very basic experience that the modal user might have, without resorting to any tricks like PackageCompiler.
Some more boring details for those interested below:
Details
The benchmarks were run on an i7 Windows laptop; versioninfo() is:
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, tigerlake)
Threads: 1 on 8 virtual cores
As Plots.jl has dropped compatibility with older Julia versions over time, going back in Julia versions also means going back to older Plots versions, so the benchmarks don't just capture differences in precompilation and other language features, but also changes in Plots itself.
The following table shows the Plots version used for every Julia version, as well as the average, minimum, and standard deviation of the 20 samples taken:
 Row │ version  Plots_version  average  minimum  std_dev
     │ String   String         Float64  Float64  Float64
─────┼───────────────────────────────────────────────────
   1 │ 0.6.4    0.17.4            11.5     10.7      0.9
   2 │ 0.7.0    0.19.3            21.5     19.8      1.3
   3 │ 1.0.5    1.4.3             18.5     18.1      0.5
   4 │ 1.1.1    1.4.3             21.3     18.3      4.4
   5 │ 1.2.0    1.4.4             22.8     21.4      1.5
   6 │ 1.3.1    1.6.12            27.3     26.2      0.7
   7 │ 1.4.2    1.6.12            35.0     30.0      5.4
   8 │ 1.5.4    1.24.3            35.7     31.9      3.3
   9 │ 1.6.5    1.27.6            24.6     21.7      2.4
  10 │ 1.7.3    1.37.2            24.3     22.6      2.0
  11 │ 1.8.3    1.38.0            20.6     17.7      2.8
  12 │ 1.9.0b   1.38.0-dev         7.6      4.1      1.1
Note that 1.9.0b uses a branch of Plots with a fix from Kristoffer that addresses an issue with the usage of cached code on Windows.
The whole benchmark code run is:
using DataFrames, ProgressMeter

path = "path/to/Julias"

julias = Dict(
    "0.6.4" => "Julia-0.6.4/bin/julia.exe",
    "0.7.0" => "Julia-0.7.0/bin/julia.exe",
    "1.0.5" => "Julia-1.0.5/bin/julia.exe",
    "1.1.1" => "Julia-1.1.1/bin/julia.exe",
    "1.2.0" => "Programs/Julia-1.2.0/bin/julia.exe",
    "1.3.1" => "Julia-1.3.1/bin/julia.exe",
    "1.4.2" => "Programs/Julia/Julia-1.4.2/bin/julia.exe",
    "1.5.4" => "Programs/Julia 1.5.4/bin/julia.exe",
    "1.6.5" => "Programs/Julia-1.6.5/bin/julia.exe",
    "1.7.3" => "Programs/Julia-1.7.3/bin/julia.exe",
    "1.8.3" => "Programs/Julia-1.8.3/bin/julia.exe",
    "1.9.0b" => "Programs/Julia-1.9.0-beta2/bin/julia.exe",
)

jdf = sort(DataFrame(version = collect(keys(julias)), location = collect(values(julias))), :version)

# Check versions
for r ∈ eachrow(jdf)
    println("Julia version: ", r.version)
    if r.version == "0.6.4"
        run(`$(normpath(path, r.location)) -e "println(Pkg.installed()[\"Plots\"])"`)
    else
        run(`$(normpath(path, r.location)) -e "import Pkg; println(Pkg.installed()[\"Plots\"])"`)
    end
    println()
end

n = 20
jdf.times = [zeros(n) for _ ∈ 1:nrow(jdf)]

# Run code
for r ∈ eachrow(jdf)
    println("Julia version: ", r.version)
    pn = normpath(path, r.location)
    @showprogress for i ∈ 1:n
        if r.version == "0.6.4"
            r.times[i] = @elapsed run(`$pn -e "using Plots; scatter(rand(1_000))"`)
        else
            r.times[i] = @elapsed run(`$pn -e "using Plots; display(scatter(rand(1_000)))"`)
        end
    end
    println()
end

using Plots, Statistics

jdf.average = mean.(jdf.times)
jdf.standard_dev = std.(jdf.times)

bar(jdf.version[2:end], jdf.average[2:end], label = "",
    linewidth = 0.1, ylabel = "Time (seconds)", xlabel = "Julia version",
    yticks = 0:5:50,
    title = "Mean/minimum time for 20 samples\nusing Plots; display(scatter(rand(1_000)))")
scatter!((1:nrow(jdf)-1) .- 0.5, minimum.(jdf.times)[2:end], label = "Minimum", markerstrokewidth = 0.1)

jdf.Plots_version = ["0.17.4", "0.19.3", "1.4.3", "1.4.3", "1.4.4", "1.6.12",
                     "1.6.12", "1.24.3", "1.27.6", "1.37.2", "1.38.0", "1.38.0-dev"]

select(jdf, :version, :Plots_version,
    :average => (x -> round.(x, digits = 1)) => :average,
    :times => (x -> round.(minimum.(x), digits = 1)) => :minimum,
    :times => (x -> round.(std.(x), digits = 1)) => :std_dev)
That's really cool! If you have the time and willingness, would you be able to check DifferentialEquations?
Here's a one-liner that I believe should work even on ancient versions:
using DifferentialEquations; f(u,p,t) = 1.01*u; solve(ODEProblem(f,1/2,(0.0,1.0)), Tsit5(), abstol=1e-8)
Just wanted to say thanks to everybody who worked on this. On DFTK, with a simple SnoopPrecompile.@precompile_all_calls before a standard workflow in the package, TTFX went from 40s to 8s on my machine, and from 80s to 2s on somebody else's (not sure why the discrepancy…), which is pretty amazing. Precompilation costs did go through the roof, which is unfortunate, but we added an environment variable so precompilation can be disabled for developers.
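For anyone wanting to replicate this, a minimal sketch of the pattern (the workload function and the environment-variable name below are illustrative, not DFTK's actual code):

using SnoopPrecompile

if get(ENV, "MYPKG_DISABLE_PRECOMPILE", "0") != "1"   # hypothetical opt-out for developers
    @precompile_all_calls begin
        # a small but representative workload exercising the package's main entry points
        run_standard_workflow()   # hypothetical helper standing in for the real workflow
    end
end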
Well, that was quite the Odyssey… Best to do these kinds of experiments with packages that take less than 10 minutes to install and precompile (I think on 0.6.4 it took 20 minutes just to download all the dependencies). I had also forgotten how annoying precompilation was before parallel precompilation and progress bars - just staring at a terminal with nothing happening for minutes on end isn't fun.
And after all that, the results are… not exactly overwhelming:
Note that I couldn't get the package to precompile on 1.4 and 1.5, so I skipped those Julia versions.
Here we are covering 4 major DiffEq releases (from 4.5 to 7.6), so I assume we are comparing across some pretty major changes to the package itself, and during installation I noticed that later versions pull in a lot of heavy dependencies like LoopVectorization. On 1.9-beta2 it looks like virtually all the time is spent in using DifferentialEquations, so maybe there's not a lot to be gained on the actual first solve - but I don't really know anything about DifferentialEquations, so I'm not the right person to make sense of these results!
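If anyone wants to dig into where that load time goes, Julia 1.8+ can break it down per dependency - a minimal sketch:

using InteractiveUtils   # @time_imports lives here (available by default at the REPL)
@time_imports using DifferentialEquations   # prints the load time of each dependency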
Here are the package versions used and summary statistics of the 20 samples for each Julia version:
 Row │ version  DiffEq_version  average  minimum  std_dev
     │ String   String          Float64  Float64  Float64
─────┼────────────────────────────────────────────────────
   1 │ 0.6.4    4.5.0               7.1      6.8      0.4
   2 │ 0.7.0    5.2.1              14.3     11.1      1.8
   3 │ 1.0.5    6.10.1             25.9     20.5      2.4
   4 │ 1.1.1    6.10.1             31.2     29.1      2.0
   5 │ 1.2.0    6.11.0             32.1     24.5      4.7
   6 │ 1.3.1    6.18.0             32.3     26.7      4.7
   7 │ 1.6.5    7.6.0              12.8     11.4      1.4
   8 │ 1.7.3    7.6.0              10.4      9.5      0.7
   9 │ 1.8.3    7.6.0              17.7     17.3      0.4
  10 │ 1.9.0b   7.6.0              10.7      9.9      0.8
Full code
using DataFrames, ProgressMeter

path = "path/to/julias"

julias = Dict(
    "0.6.4" => "Julia-0.6.4/bin/julia.exe",
    "0.7.0" => "Julia-0.7.0/bin/julia.exe",
    "1.0.5" => "Julia-1.0.5/bin/julia.exe",
    "1.1.1" => "Julia-1.1.1/bin/julia.exe",
    "1.2.0" => "Programs/Julia-1.2.0/bin/julia.exe",
    "1.3.1" => "Julia-1.3.1/bin/julia.exe",
    #"1.4.2" => "Programs/Julia/Julia-1.4.2/bin/julia.exe",  # DiffEq failed to precompile
    #"1.5.4" => "Programs/Julia 1.5.4/bin/julia.exe",        # DiffEq failed to precompile
    "1.6.5" => "Programs/Julia-1.6.5/bin/julia.exe",
    "1.7.3" => "Programs/Julia-1.7.3/bin/julia.exe",
    "1.8.3" => "Programs/Julia-1.8.3/bin/julia.exe",
    "1.9.0b" => "Programs/Julia-1.9.0-beta2/bin/julia.exe",
)

jdf = sort(DataFrame(version = collect(keys(julias)), location = collect(values(julias))), :version)

# Check versions
for r ∈ eachrow(jdf)
    println("Julia version: ", r.version)
    if r.version == "0.6.4"
        run(`$(normpath(path, r.location)) -e "println(Pkg.installed()[\"DifferentialEquations\"])"`)
    else
        run(`$(normpath(path, r.location)) -e "import Pkg; println(Pkg.installed()[\"DifferentialEquations\"])"`)
    end
    println()
end

n = 20
jdf.times = [zeros(n) for _ ∈ 1:nrow(jdf)]

# Run code
for r ∈ eachrow(jdf)
    println("Julia version: ", r.version)
    pn = normpath(path, r.location)
    @showprogress for i ∈ 1:n
        r.times[i] = @elapsed run(`$pn -e "using DifferentialEquations; f(u,p,t) = 1.01*u; solve(ODEProblem(f,1/2,(0.0,1.0)), Tsit5(), abstol=1e-8)"`)
    end
    println()
end

using Plots, Statistics

jdf.average = mean.(jdf.times)
jdf.standard_dev = std.(jdf.times)

bar(jdf.version, jdf.average, label = "",
    linewidth = 0.1, ylabel = "Time (seconds)", xlabel = "Julia version",
    yticks = 0:5:50,
    title = "Mean/minimum time for 20 samples\nTime-to-first-DiffEq solve")
scatter!((1:nrow(jdf)) .- 0.5, minimum.(jdf.times), label = "Minimum", markerstrokewidth = 0.1)

jdf.DiffEq_version = ["4.5.0", "5.2.1", "6.10.1", "6.10.1", "6.11.0", "6.18.0",
                      "7.6.0", "7.6.0", "7.6.0", "7.6.0"]

print(select(jdf, :version, :DiffEq_version,
    :average => (x -> round.(x, digits = 1)) => :average,
    :times => (x -> round.(minimum.(x), digits = 1)) => :minimum,
    :times => (x -> round.(std.(x), digits = 1)) => :std_dev))
We should also keep in mind that the amazing new pkgimage caching of native code solves only "time to first X after things are already imported, when invalidations are fixed by package developers". Developers are separately solving the invalidation issues (which had started to pay off even before pkgimages). However, there is still a lot of work that can be done on the "import time" side of things. The plots above show "import time + time to first X after import", which can be confusing because:
- pkgimages (and fixing invalidations) turn "time to first X after import" from many seconds into milliseconds
- pkgimages do not change "import-only time" much
Nonetheless, this is an amazing step forward, with many other exciting ideas being rumored in offhand comments on GitHub.
This is a very cool graph. I think this subject should have its own blog post in addition to being mentioned in the highlights post, and plots like this would be excellent for showing the trajectory over time. I can add some interpretation here:
- 0.7 versus 1.0: The 1.0 release deleted a lot of deprecation code, making it significantly leaner and faster than 0.7.
- 1.0 through 1.5: The compiler got smarter and generated faster code, but at the expense of worsening compile times. Some compensatory latency reductions were done in this era, but they were outpaced by improved compiler output taking more time (and LLVM getting slower).
- 1.6 through 1.8: Incremental improvements to compile time due to focus on reducing latency in both Julia and LLVM. We were always acutely aware of this, and people started shaming LLVM for its compile times; LLVM added compile time benchmarks and now tracks them and has been improving somewhat.
- 1.9: Machine code caching is transformative at significantly reducing TTFP. Now that it works, more can be done here: packages can precompile more now that itβs more worthwhile to do so; fixing invalidations now has larger proportional effect; more effort can be made on reducing the remaining bottlenecks in load time.
Notably, the latter two phases retain and even build on the runtime performance improvements gained in the previous era, so it's a win-win: much improved load latency and much improved runtime performance.
Oh, and we might see big improvements in load/compile time from the newly landed "weak dependencies" functionality, which can considerably reduce the number of dependencies packages have, by allowing them to convert some regular deps to weak deps so that you don't have to pay for them all the time on the off chance that you might use them. That requires more of an ecosystem-wide movement, though. (A sketch of the mechanism is shown below.)
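To make that concrete: a package lists the optional dependency under [weakdeps] in its Project.toml and maps an extension module to it under [extensions]; the extension code then lives in ext/ and loads only when the weak dep does. A sketch, with made-up package and function names for illustration:

# ext/MyPackagePlotsExt.jl -- hypothetical extension module: it is loaded (and can
# be precompiled) only when both MyPackage and Plots are present in the session,
# so MyPackage itself no longer pays for Plots at `using` time.
module MyPackagePlotsExt

using MyPackage, Plots

# add a Plots-specific method to a function owned by MyPackage (illustrative)
MyPackage.visualize(r::MyPackage.Result) = Plots.plot(r.values)

end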
I'd also add all the method invalidation hunting that happened during this period, done in a concerted manner across the entire ecosystem, with new package versions that depend on work in Julia itself. The best compile-time improvements are compilations that don't need to happen!
I think that's quite interesting, thanks! (I should have told you to use only OrdinaryDiffEq.jl, which would have had way fewer dependencies, sorry about that!)
I promise I'll stop now, but one thing I was interested in was a simple "data sciency" task - reading a CSV file into a DataFrame. This is another classic where traditionally Julia's performance was around best-in-class (with multithreaded CSV parsing being competitive with the best parsers in other languages, and DataFrames doing very well on the h2o benchmarks when they were still around), but the benchmarks always felt a bit disingenuous: of course it's great to be able to read a csv in 3 seconds that might take other languages 10 seconds, but that's not much help if you add on 10+ seconds of TTFX, given that one frequently only wants to read a single csv file, once.
Happily, this problem is also mostly solved in 1.9:
Here I'm reading a 1,000,000-row, 6-column (one string, five float) csv file and turning it into a DataFrame (which is DataFrame(CSV.File("testcsv.csv")) for the most part, except for Julia 0.6, when CSV still had DataFrames as a dependency and the API was CSV.read("testcsv.csv")).
The file I've picked here is small enough that any improvements in parsing speed in CSV itself over time are unlikely to matter - on my machine the file is read in less than 0.5s using the latest CSV version (and 0.1s when starting Julia with more threads, although that seems to increase the TTFX by a second or so, so on a one-shot benchmark the gains are negated - the benchmarks are run single-threaded).
I haven't tested using vs. executing the actual DataFrame(CSV.File()) call separately across versions, but on 1.9b it seems the <4s breaks down into a bit over 2s for using CSV, DataFrames and around 1.5s for reading the csv and creating the DataFrame, which again, for all intents and purposes, feels quite instantaneous for loading a csv file.
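For what it's worth, splitting the two measurements in a fresh session is straightforward - a sketch:

# In a fresh Julia session: time package loading and the first read separately.
# `@eval` defers evaluation so `using` works inside `@elapsed` and names resolve at runtime.
t_load  = @elapsed @eval using CSV, DataFrames
t_first = @elapsed @eval DataFrame(CSV.File("testcsv.csv"))
println("using: ", t_load, "s; first read: ", t_first, "s")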
More details on package versions used and timings, as well as benchmark code run, in the section below.
Details
Detailed results by Julia version:
 Row │ version  CSV_version  DataFrames_version  average  minimum  std_dev
     │ String   String       String              Float64  Float64  Float64
─────┼─────────────────────────────────────────────────────────────────────
   1 │ 0.6.4    0.2.5        0.11.7                 10.4     10.1      0.4
   2 │ 0.7.0    0.4.3        0.17.1                 10.8     10.5      0.4
   3 │ 1.0.5    0.8.5        1.3.6                  14.0     12.4      0.8
   4 │ 1.1.1    0.8.5        1.3.6                  14.8     13.7      0.9
   5 │ 1.2.0    0.8.5        1.3.6                  15.2     13.3      1.4
   6 │ 1.3.1    0.10.4       1.3.6                  20.5     18.6      2.1
   7 │ 1.4.2    0.10.4       1.3.6                  27.2     22.2      2.6
   8 │ 1.5.4    0.10.4       1.3.6                  22.0     19.8      0.7
   9 │ 1.6.5    0.10.9       1.3.3                  12.3     11.6      0.4
  10 │ 1.7.3    0.10.9       1.3.4                  13.7     12.3      0.6
  11 │ 1.8.3    0.10.9       1.4.4                  17.0     16.4      0.3
  12 │ 1.9.0b   0.10.9       1.4.4                   4.1      3.7      0.3
Benchmark code run:
using DataFrames, ProgressMeter

path = "path/to/julias"

julias = Dict(
    "0.6.4" => "Julia-0.6.4/bin/julia.exe",
    "0.7.0" => "Julia-0.7.0/bin/julia.exe",
    "1.0.5" => "Julia-1.0.5/bin/julia.exe",
    "1.1.1" => "Julia-1.1.1/bin/julia.exe",
    "1.2.0" => "Programs/Julia-1.2.0/bin/julia.exe",
    "1.3.1" => "Julia-1.3.1/bin/julia.exe",
    "1.4.2" => "Programs/Julia/Julia-1.4.2/bin/julia.exe",
    "1.5.4" => "Programs/Julia 1.5.4/bin/julia.exe",
    "1.6.5" => "Programs/Julia-1.6.5/bin/julia.exe",
    "1.7.3" => "Programs/Julia-1.7.3/bin/julia.exe",
    "1.8.3" => "Programs/Julia-1.8.3/bin/julia.exe",
    "1.9.0b" => "Programs/Julia-1.9.0-beta2/bin/julia.exe",
)

jdf = sort(DataFrame(version = collect(keys(julias)), location = collect(values(julias))), :version)

# Check versions
jdf.CSV_version .= ""
jdf.DataFrames_version .= ""
for r ∈ eachrow(jdf)
    println("Julia version: ", r.version)
    pn = normpath(path, r.location)
    if r.version == "0.6.4"
        r.CSV_version = readchomp(`$pn -e "println(Pkg.installed()[\"CSV\"])"`)
        r.DataFrames_version = readchomp(`$pn -e "println(Pkg.installed()[\"DataFrames\"])"`)
    else
        r.CSV_version = readchomp(`$pn -e "import Pkg; println(Pkg.installed()[\"CSV\"])"`)
        r.DataFrames_version = readchomp(`$pn -e "import Pkg; println(Pkg.installed()[\"DataFrames\"])"`)
    end
    println()
end

n = 20
jdf.times = [zeros(n) for _ ∈ 1:nrow(jdf)]
testcsv = "\"/Documents/testcsv.csv\""

# Run code
for r ∈ eachrow(jdf)
    println("Julia version: ", r.version)
    pn = normpath(path, r.location)
    @showprogress for i ∈ 1:n
        if r.version == "0.6.4"
            r.times[i] = @elapsed run(`$pn -e "using DataFrames, CSV; CSV.read($testcsv);"`)
        else
            r.times[i] = @elapsed run(`$pn -e "using DataFrames, CSV; DataFrame(CSV.File($testcsv));"`)
        end
    end
    println()
end

using Plots, Statistics

jdf.average = mean.(jdf.times)
jdf.standard_dev = std.(jdf.times)

bar(jdf.version, jdf.average, label = "",
    linewidth = 0.1, ylabel = "Time (seconds)", xlabel = "Julia version",
    yticks = 0:5:50,
    title = "Mean/minimum time for 20 samples\nusing CSV, DataFrames; DataFrame(CSV.File(...))",
    color = 5)
scatter!((1:nrow(jdf)) .- 0.5, minimum.(jdf.times), label = "Minimum", markerstrokewidth = 0.01,
         markersize = 6, color = 6)

print(select(jdf, :version, :CSV_version, :DataFrames_version,
    :average => (x -> round.(x, digits = 1)) => :average,
    :times => (x -> round.(minimum.(x), digits = 1)) => :minimum,
    :times => (x -> round.(std.(x), digits = 1)) => :std_dev))
I'm hoping this is the right place to share individual package improvements. There has been a long-standing issue at Bessels.jl about long compile times, which is a big disadvantage for basic math functions compared to wrapped C/Fortran libraries (TTFX was sometimes >10x slower).
Luckily, all of this time was spent on LLVM code generation and there are no invalidations, so this should be a good test case.
Bessels load details
# Version 1.8.2 (2022-09-29)
julia> @time @eval using Bessels
0.021906 seconds (44.90 k allocations: 4.828 MiB)
julia> @time @eval besselj(1.2, 8.2)
0.422299 seconds (52.75 k allocations: 2.216 MiB, 99.93% compilation time)
0.21735942469289246
# Version 1.9.0-beta2 (2022-12-29)
julia> @time @eval using Bessels
0.018595 seconds (22.22 k allocations: 2.496 MiB)
julia> @time @eval besselj(1.2, 8.2)
0.000137 seconds (46 allocations: 2.266 KiB)
0.21735942469289246
Package load times are slightly better (a few samples suggest between 5-10%) on 1.9.0-beta2, while TTFX is 3,000x faster (0.4s → 0.0001s). Of course, compared to some of the use cases where TTFP can take tens of seconds, the user doesn't feel the effect as much, but this now makes the initial call of a native Julia function faster than the wrapped C/Fortran version (of course this isn't exactly a fair comparison, as the latter could undoubtedly be improved).
Special Functions comparison
# Version 1.9.0-beta2 (2022-12-29)
julia> @time @eval using SpecialFunctions
0.102223 seconds (181.37 k allocations: 13.283 MiB, 1.77% compilation time)
julia> @time @eval besselj(1.2, 8.2)
0.055451 seconds (85.39 k allocations: 5.645 MiB, 130.88% compilation time)
0.2173594246928925
What does @eval actually do here? Is it necessary?