Taking TTFX seriously: Can we make common packages faster to load and use

Inspired by this post on TTFX with CSV/DataFrames, I made a quick attempt at a function to run script files repeatedly.

using Statistics

"""
    ttfx(; code="sleep(1)", N=10, args = "", preview=false)

Compute the time to first X.

`ttfx` runs `code` `N` times, each in a fresh Julia process, to
estimate the total startup cost of loading certain packages or
calling certain functions. `ttfx()` with no arguments simply runs
`sleep(1)` and may be used to estimate the base Julia runtime cost.

`code` can either be a short snippet that will run with 
the `-e` switch or a file containing the script to be run.

`args` can be used to set Julia runtime options, e.g. `"-t 8"`.

`preview = true` will return the final command without running it.

Returns a tuple of the median time in seconds and the vector of all `N` times.

"""
function ttfx(; code="sleep(1)", N=10, args = "", preview=false)
    # If running a short snippet and not a file, add a -e
    if !isfile(code)
        code = "-e '$code'"
    end

    # Interpolating the whole string into a `cmd` literal would pass it as a
    # single argument, so parse it with `Base.shell_parse` and build the
    # command with `Base.cmd_gen` instead
    ex, = Base.shell_parse("julia $args $code")
    julia_cmd = Base.cmd_gen(eval(ex))

    # Return only the command that would have been run
    preview && return julia_cmd
    
    # Run the command N times
    times = Vector{Float64}(undef, N)
    for i = 1:N
        times[i] = @elapsed run(`$julia_cmd`)
    end

    return median(times), times
end

# Run the default timing with sleep
t = ttfx()

# Run `using CSV` 15 times in the current project with CSV.jl installed and 8 threads
t = ttfx(code="using CSV", N=15, args="-t 8 --project=@.")
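
If the `shell_parse`/`eval` workaround feels fragile (for example, it breaks when the snippet itself contains a single quote), here is a rough sketch of another way to build the command. `build_cmd` is a hypothetical helper name, not part of the function above, and `Base.shell_split` and `Base.julia_cmd` are unexported Base functions, so treat this as an assumption-laden sketch rather than a drop-in replacement.

```julia
# Hypothetical helper (not from the original post): build the julia command
# without going through `eval`.
function build_cmd(code::AbstractString; args::AbstractString="")
    julia = Base.julia_cmd()        # Cmd for the julia binary that is currently running
    opts = Base.shell_split(args)   # "-t 8 --project=@." -> ["-t", "8", "--project=@."]
    # A vector interpolated into a command literal expands to separate
    # arguments; a string is passed as a single argument, so `code` needs
    # no extra quoting even if it contains spaces or quotes.
    return isfile(code) ? `$julia $opts $code` : `$julia $opts -e $code`
end

cmd = build_cmd("using Statistics"; args="-t 2")
```

One difference to be aware of: this reuses the running Julia binary (plus whatever flags `Base.julia_cmd()` carries over) instead of whichever `julia` is first on the `PATH`.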

For me, `ttfx()` takes a median time of ~1.17 seconds. One second of that is the `sleep(1)` itself, so the baseline Julia runtime cost is about 0.17 seconds on my machine. `using CSV` on my computer takes a median time of 3.4 seconds.
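
To spell out the arithmetic behind those numbers, here is a small sketch. It assumes the startup overhead is the same in both runs, which is only approximate because the CSV run also passed `-t 8 --project=@.`; the variable names are just for illustration.

```julia
baseline_total = 1.17   # median of ttfx(), which includes the 1 s sleep
sleep_time     = 1.0    # the sleep(1) inside the default snippet
csv_total      = 3.4    # median of ttfx(code="using CSV", ...)

# Bare Julia startup cost, about 0.17 s
startup = baseline_total - sleep_time

# Time attributable to `using CSV` itself, about 3.23 s,
# assuming startup costs the same in both runs
csv_load = csv_total - startup
```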

Feel free to edit and expand this (perhaps into the @ctime macro that was suggested). Code suggestions welcome!
