[ANN] TrackingTimers.jl

Repo: https://github.com/ericphanson/TrackingTimers.jl

TrackingTimers.jl is a very simple package that provides an easy way to “instrument” a function so that whenever it is called timing data (from @timed) for that call is recorded in a table.

There is just one export, a TrackingTimer, which acts as a Tables.jl-compatible row table as well as a callable type to instrument a function. For example,

julia> using TrackingTimers

julia> t = TrackingTimer() # create the timer object
TrackingTimer: 0.54 s since creation (0% measured).
No entries.

julia> instrumented_sum = t(sum) # instrument a function
(::TrackingTimers.InstrumentedFunction{typeof(sum)}) (generic function with 1 method)

julia> instrumented_sum(1:100) # call the instrumented function
5050

julia> t # uses PrettyTables to `show` the result
TrackingTimer: 0.62 s since creation (0% measured).
 name   time   gctime  n_allocs  allocs   thread ID  proc ID
─────────────────────────────────────────────────────────────
 sum   0.00 s      0%         0  0 bytes          1        1

This acts like a very simple version of TimerOutputs.jl, which keeps track of nested timers (instead of just logging all calls into a flat table like TrackingTimers does). The simplicity comes with a key benefit, however, that concurrency is well-supported.

Instrumented functions can be called by multiple threads or processes simultanenously, and the table will remain coherent. It is implemented by simply push!ing results into a Channel when the instrumented function is called. When it comes time to process the results, the Tables.rows method (used in show, for example) takes results from the channel (protecting access via a lock) and populates a simple vector-of-namedtuples row table.

Please see the README for more examples. (You might note a failing CI badge; I think this is a flakiness of CI with threading on GitHub actions-- the tests with Transducers.jl seem to sporadically hang. But I haven’t been able to reproduce the hangs locally).

11 Likes

As another example, a bit closer to how I have been using TrackingTimers:

julia> using TrackingTimers, DataFrames, Statistics, LinearAlgebra

julia> datasets = [ DataFrame(:x => [ rand(BigFloat, 100) for _ = 1:1000]) for _ = 1:100];

julia> t = TrackingTimer()
TrackingTimer: 0.00 s since creation (0% measured).
No entries.

julia> results = [ select(data, :x => t(ByRow(sum)) => :sum, :x => t(ByRow(mean)) => :mean, :x => t(ByRow(norm)) => :norm) for data in datasets];

julia> t
TrackingTimer: 5.21 s since creation (99% measured).
                  name                     time   gctime  n_allocs    allocs    th ⋯
────────────────────────────────────────────────────────────────────────────────────
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.57 s     94%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.41 s     92%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.29 s     91%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.20 s     88%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.08 s     68%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.07 s     66%    608001  32.433 MiB     ⋯
 ByRow{typeof(mean)}(Statistics.mean)     0.06 s     90%    202001  10.796 MiB     ⋯
 ByRow{typeof(mean)}(Statistics.mean)     0.06 s     90%    202001  10.796 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.04 s      0%    608001  32.433 MiB     ⋯
 ByRow{typeof(norm)}(LinearAlgebra.norm)  0.03 s      0%    608001  32.433 MiB     ⋯
                    ⋮                       ⋮       ⋮        ⋮          ⋮          ⋱
                                                      2 columns and 279 rows omitted

julia> combine(groupby(DataFrame(t), :name), :time => mean => :average_time, :bytes => mean => :average_allocs_in_bytes)
3×3 DataFrame
 Row │ name                               average_time  average_allocs_in_bytes
     │ String                             Float64       Float64
─────┼──────────────────────────────────────────────────────────────────────────
   1 │ ByRow{typeof(sum)}(sum)              0.00236519           120128.0
   2 │ ByRow{typeof(mean)}(Statistics.m…    0.00813896                1.13201e7
   3 │ ByRow{typeof(norm)}(LinearAlgebr…    0.0410357                 3.40081e7

The example is completely artificial (and for something like this I probably wouldn’t make 100 different dataframes, but there are workflows where that kind of thing comes up), but the idea is to show one way to use it with DataFrames and how getting a table lets you do more processing of the results; for example, we could look at a histogram of GC time.

1 Like