A visual log of loading and precompilation times for many packages over a variety of Julia versions

At the Julia Ecosystem Benchmarks Explorer you can find detailed benchmarks of loading times for many packages over many different versions of Julia, all interactively explorable.

This is the culmination of:

It will be updated daily. Here are a few examples:

I will not have the bandwidth to make improvements to it in the near future, but do not hesitate to submit patches.


Very cool!

I have a few questions:


All good points; I should add this to the docs.

No. For each historic date I check out the General registry at its state on that date and set it as the source of truth in the active Julia depot, so all version resolutions happen as if it were that date. I manually put bounds on which dates correspond to which Julia versions, so a Julia version is never tested against a General registry state that did not exist during that version’s lifetime.
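
Roughly, that pinning step might look like the following sketch (the paths, helper name, and benchmark script are hypothetical, not the actual harness): check out a local General registry clone at its last commit on or before the target date, then point a fresh depot at it and keep Pkg offline so it cannot update the registry.

using Dates

# Hypothetical helper (illustrative paths and names, not the actual harness):
# pin a local clone of the General registry to its state on `date`.
function checkout_registry_at(registry_dir, date::Date)
    before = "$(date)T23:59:59"   # last second of the target day
    commit = readchomp(`git -C $registry_dir rev-list -1 --before=$before HEAD`)
    run(`git -C $registry_dir checkout --quiet $commit`)
    return commit
end

checkout_registry_at("General", Date(2021, 6, 1))

# A fresh depot that only sees the pinned registry; the benchmarked Julia
# process is then started with this depot and with Pkg in offline mode,
# so version resolution behaves as it would have on that date.
depot = mktempdir()
mkpath(joinpath(depot, "registries"))
symlink(abspath("General"), joinpath(depot, "registries", "General"))
run(addenv(`julia --startup-file=no benchmark_script.jl`,   # benchmark_script.jl is hypothetical
           "JULIA_DEPOT_PATH" => depot, "JULIA_PKG_OFFLINE" => "true"))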

I use juliaup, so I never manually bisected or compiled Julia. There are many failures for many packages throughout this dataset, but they are simply not visualized; I did not investigate exactly why the failures happen.

All the benchmarks run on a lab server in the back of my office. The historical data (about 1000 General registry snapshots) took a week or two. Now every night another benchmark run is executed (for lts, nightly, alpha, and release, as defined by juliaup).

That particular server is used for other tasks as well, most of them not computationally intensive. But the benchmarks run at high process priority with reserved resources at night, so hopefully the noise is not too bad.
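
As a rough sketch of such a nightly run (the runner script, priority, and core pinning are assumptions; the channel names are the juliaup ones mentioned above):

# Hypothetical nightly loop: one benchmark run per juliaup channel, at raised
# priority and pinned to fixed cores to reduce noise from other server tasks
# (negative nice values require elevated privileges).
for channel in ("lts", "release", "alpha", "nightly")
    run(`nice -n -10 taskset -c 0-15 julia +$channel --startup-file=no run_benchmarks.jl`)
end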


Aha, I thought the x-axis was a sweep of commits on the Julia master branch, not checkout dates of the registry. OK.

Thankfully, for any future entries under the “nightly” label, they will be both. It seemed too difficult to make that happen for the historic entries as well.


It’s an interesting question how such data can be interpreted when packages get updated over time. You’d assume that new features might add loading time, while authors also try to battle TTFX. The daily nightly measurements starting now should be useful for tracking the behavior of Julia itself over time, but in the historic data it seems difficult to make assessments. I’ll have to play around a little.

It tracks the end-user experience: “if you used CairoMakie a year ago and thought it was fast, it’s slower now”.


Sure, but I’d want to know whose fault it is :grinning_face_with_smiling_eyes:


Here’s an alternative visualization. I thought it might make sense to normalize, because the packages and workloads are so different. So I took the median of a given metric within each package/workload/Julia-version group, then rescaled those medians within each package/workload group by dividing by the maximum value. Each package thus has values going up to 1, with the relative proportions between them staying intact. The thick dots are then means over all those values for a given Julia version; the small dots are the individual values making up those means.

There’s still considerable variation, and I’m not sure whether another normalization scheme would be better, but at least this reduces the impact of the workloads’ different orders of magnitude. Maybe scaling by a reference version, like the latest release, would also work (a rough sketch of that idea follows after the code).

using CSV
using Chain
using CairoMakie
using AlgebraOfGraphics
using DataFrames
using DataFrameMacros
using StatsBase
using Downloads
using SwarmMakie # ]add SwarmMakie#jkrumbiegel-patch-1

##

# download the raw benchmark data and read it into a DataFrame
df = @chain begin
    "https://raw.githubusercontent.com/JuliaEcosystemBenchmarks/julia-ecosystem-benchmarks/refs/heads/jeb_logs/data/Julia-TTFX-Snippets/ttfx_snippets_data.csv"
    Downloads.download
    CSV.read(DataFrame)
end

# the measured metrics, one facet per metric in the figure below
variables = ["precompile_time", "precompile_cpu", "precompile_resident", "loading_time", "task_time", "task_cpu", "task_resident"]

# build one plot specification per metric and sum them into a single faceted figure
spec = sum(variables) do var
    stat = median
    stat_var = "$(stat)_$var"                      # e.g. "median_loading_time"
    stat_var_norm = "$(stat_var)_normalized"
    stat_var_norm_mean = "$(stat_var_norm)_mean"
    @chain begin
        df
        @groupby :package_name :julia_version :task_name
        @combine stat_var = median({var})                                   # median per package/version/task
        # keep only proper version numbers, not the rolling juliaup channels
        @subset :julia_version ∉ ("alpha", "release", "lts", "nightly")
        @groupby :package_name :task_name
        # rescale each package/task group so its largest median becomes 1
        @transform stat_var_norm = @bycol {stat_var} ./ maximum({stat_var})
        # per julia version, average the normalized medians over all packages and tasks
        @aside combined = @combine (@groupby _ :julia_version) mean({stat_var_norm})
        (
            # thick dots: per-version means of the normalized medians
            data(combined) *
                (
                    mapping(:julia_version, stat_var_norm_mean => "") *
                    visual(Scatter)
                ) +
            # small dots: the individual normalized medians, drawn as a beeswarm
            data(_) *
                mapping(:julia_version, stat_var_norm => "normalized median") *
                visual(Beeswarm, markersize = 4, alpha = 0.5)
        ) *
            mapping(color = :julia_version => (s -> match(r"\d\.\d+", s).match) => "minor version") *
            mapping(layout = direct(var))
    end
end |> draw(; axis = (; width = 200, height = 200, xticklabelsvisible = false))
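
And a rough sketch of the reference-version normalization idea, using plain DataFrames for brevity; the reference version string is an assumption and must match one of the :julia_version values in the data:

# Hypothetical alternative: normalize each package/task group by a fixed
# reference version instead of by the group maximum.
refversion = "1.10.4"   # assumed; must be a version string present in the data

meds = combine(groupby(df, [:package_name, :julia_version, :task_name]),
               :loading_time => median => :median_loading_time)

relative = transform(groupby(meds, [:package_name, :task_name]),
    [:median_loading_time, :julia_version] =>
        ((m, v) -> m ./ only(m[v .== refversion])) => :loading_time_relative)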