Repo: https://github.com/ericphanson/TrackingTimers.jl
TrackingTimers.jl is a very simple package that provides an easy way to “instrument” a function so that whenever it is called timing data (from @timed
) for that call is recorded in a table.
There is just one export, a TrackingTimer
, which acts as a Tables.jl-compatible row table as well as a callable type to instrument a function. For example,
julia> using TrackingTimers
julia> t = TrackingTimer() # create the timer object
TrackingTimer: 0.54 s since creation (0% measured).
No entries.
julia> instrumented_sum = t(sum) # instrument a function
(::TrackingTimers.InstrumentedFunction{typeof(sum)}) (generic function with 1 method)
julia> instrumented_sum(1:100) # call the instrumented function
5050
julia> t # uses PrettyTables to `show` the result
TrackingTimer: 0.62 s since creation (0% measured).
name time gctime n_allocs allocs thread ID proc ID
─────────────────────────────────────────────────────────────
sum 0.00 s 0% 0 0 bytes 1 1
This acts like a very simple version of TimerOutputs.jl, which keeps track of nested timers (instead of just logging all calls into a flat table like TrackingTimers does). The simplicity comes with a key benefit, however, that concurrency is well-supported.
Instrumented functions can be called by multiple threads or processes simultanenously, and the table will remain coherent. It is implemented by simply push!
ing results into a Channel
when the instrumented function is called. When it comes time to process the results, the Tables.rows
method (used in show
, for example) takes results from the channel (protecting access via a lock) and populates a simple vector-of-namedtuples row table.
Please see the README for more examples. (You might note a failing CI badge; I think this is a flakiness of CI with threading on GitHub actions-- the tests with Transducers.jl seem to sporadically hang. But I haven’t been able to reproduce the hangs locally).
11 Likes
As another example, a bit closer to how I have been using TrackingTimers:
julia> using TrackingTimers, DataFrames, Statistics, LinearAlgebra
julia> datasets = [ DataFrame(:x => [ rand(BigFloat, 100) for _ = 1:1000]) for _ = 1:100];
julia> t = TrackingTimer()
TrackingTimer: 0.00 s since creation (0% measured).
No entries.
julia> results = [ select(data, :x => t(ByRow(sum)) => :sum, :x => t(ByRow(mean)) => :mean, :x => t(ByRow(norm)) => :norm) for data in datasets];
julia> t
TrackingTimer: 5.21 s since creation (99% measured).
name time gctime n_allocs allocs th ⋯
────────────────────────────────────────────────────────────────────────────────────
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.57 s 94% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.41 s 92% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.29 s 91% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.20 s 88% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.08 s 68% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.07 s 66% 608001 32.433 MiB ⋯
ByRow{typeof(mean)}(Statistics.mean) 0.06 s 90% 202001 10.796 MiB ⋯
ByRow{typeof(mean)}(Statistics.mean) 0.06 s 90% 202001 10.796 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.04 s 0% 608001 32.433 MiB ⋯
ByRow{typeof(norm)}(LinearAlgebra.norm) 0.03 s 0% 608001 32.433 MiB ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋱
2 columns and 279 rows omitted
julia> combine(groupby(DataFrame(t), :name), :time => mean => :average_time, :bytes => mean => :average_allocs_in_bytes)
3×3 DataFrame
Row │ name average_time average_allocs_in_bytes
│ String Float64 Float64
─────┼──────────────────────────────────────────────────────────────────────────
1 │ ByRow{typeof(sum)}(sum) 0.00236519 120128.0
2 │ ByRow{typeof(mean)}(Statistics.m… 0.00813896 1.13201e7
3 │ ByRow{typeof(norm)}(LinearAlgebr… 0.0410357 3.40081e7
The example is completely artificial (and for something like this I probably wouldn’t make 100 different dataframes, but there are workflows where that kind of thing comes up), but the idea is to show one way to use it with DataFrames and how getting a table lets you do more processing of the results; for example, we could look at a histogram of GC time.
1 Like