[ANN] ContextTracking.jl - writing context-aware applications

Hi everyone,

Today, I would like to announce ContextTracking.jl, a package that helps you keep track of contextual information during program execution.

A quick example

Suppose that you have 4 functions: A calls B, B calls C, and C calls D. Normally, if you have gathered some data in A and want to access it in D, you would have to pass the data downstream via function arguments. With ContextTracking, you just access it directly in D.

Unlike global variables, context information is kept only for the lifetime of the execution call chain. The data is maintained in a stack structure: when a function returns, its data is cleaned up and removed.

How is it useful?

Just taking a use case description from the project’s README:

Suppose that we are processing a web request. We may want to create a correlation id to keep track of the request and include the request id whenever we write anything to the log file during any part of the processing of that request.

It may seem somewhat redundant to log the same data multiple times, but it is invaluable when debugging production problems. Imagine that two users hit the same web service at the same time. In the log file, their entries could be interleaved, and it would be quite confusing without the context.
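That correlation-id pattern might be sketched like this. The function names `handle_request` and `query_database`, the `:request_id` key, and the use of the `UUIDs` stdlib to generate an id are all illustrative assumptions, not part of the package; only `@ctx`, `@memo`, and `context` come from ContextTracking:

```julia
using ContextTracking
using UUIDs  # stdlib; used here to mint a hypothetical correlation id

# Hypothetical entry point: memoize an id once at the top of the request.
@ctx function handle_request()
    @memo request_id = string(uuid4())
    return query_database()
end

# Deeper in the call chain, no arguments were passed, yet the id is
# available for every log line written while handling this request.
@ctx function query_database()
    c = context()
    @info "running query" c.data[:request_id]
    return c.data[:request_id]
end
```

Any log statement along the chain can attach the same id, so interleaved entries from concurrent requests can be told apart.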

As context data is stored in a stack structure, you naturally gain more “knowledge” as you go deeper into the execution stack. Then, you naturally “forget” those details when the execution stack unwinds. With this design, you can memoize just the knowledge that is most valuable in the log file.
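The gain-then-forget behavior can be sketched with two nested functions (`outer`, `inner`, and the keys `:a`/`:b` are made-up names for illustration):

```julia
using ContextTracking

@ctx function outer()
    @memo a = 1
    inner_saw_a = inner()
    # `inner` memoized :b, but the stack unwound when it returned,
    # so :b has been "forgotten" here.
    b_still_here = haskey(context().data, :b)
    return (inner_saw_a, b_still_here)
end

@ctx function inner()
    @memo b = 2
    # Data memoized by the caller is visible deeper in the stack.
    return haskey(context().data, :a)
end
```

Calling `outer()` should show that `inner` can see `:a`, while `:b` does not survive `inner` returning.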

Demo

using ContextTracking

@ctx function foo()
    @memo x = 1
    bar()
end

@ctx function bar()
    c = context()
    @info "context data" c.data
end

Result:

julia> foo()
┌ Info: context data
│   c.data =
│    Dict{Any,Any} with 2 entries:
│      :_ContextPath_ => [:foo, :bar]
└      :x             => 1

Love this. I’ve needed it quite a few times. Thank you.

Any ideas/stats on the cost of using (or not using) this, in terms of time and memory, for a few examples?

Maybe relevant in this context (no pun intended :slight_smile: ):

https://discourse.julialang.org/t/propagation-of-available-assigned-worker-ids-in-hierarchical-computations


It’s not about exactly the same thing, obviously, but related in spirit.

Memory utilization should scale with how much data you want to track.

From a performance perspective, you may want to avoid using @ctx on a hot function that is called in a tight loop, because of the overhead of maintaining the stack. However, you can still use the context function to access previously recorded data.
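That advice might look like this in practice. `process_items`, `process_one`, and the `:job_id` key are hypothetical names; the point is only that the hot inner function carries no `@ctx` annotation yet can still read the context:

```julia
using ContextTracking

@ctx function process_items(items)
    @memo job_id = "job-1"
    # The hot inner function is NOT annotated with @ctx, so there is no
    # per-call save/restore of the context stack inside the loop.
    return sum(process_one, items)
end

function process_one(x)
    # Plain function: no stack maintenance, but previously memoized
    # data is still accessible via context().
    jid = context().data[:job_id]
    return x + length(jid)
end
```

Here only `process_items` pays the `@ctx` bookkeeping cost once, while `process_one` reads the memoized value on every iteration for free.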

I have used this package in a production setting for a data engineering process, and memory/performance has not been a concern for my use case. The context data that I need to track is minuscule compared to the data that I need to process, so I haven’t done much testing of the memory/performance overhead.

I would be delighted to hear whether it works well for other use cases, if you want to give it a try! :slight_smile:
