Is there any existing framework or set of macros for building mutable container types that can be easily logged into tabular data? Here's a dumb example:
using DataFrames
srand(1)  # Julia 0.6; on Julia 1.0 and later use `using Random; Random.seed!(1)`
mutable struct Person
    age::Int64
    id::Int64
end
function update!(person::Person)
    person.age += 1
end
function run_simulation(timesteps::Int64, num_people::Int64)
    # One row per (person, timestep); ages are filled in as the simulation runs.
    results = DataFrame(person_id = repeat(1:num_people, inner=timesteps),
                        timestep = repeat(1:timesteps, outer=num_people),
                        person_age = repeat([0], inner=num_people*timesteps))
    all_people = [Person(rand(0:80), i) for i in 1:num_people]
    for n in 1:num_people
        for t in 1:timesteps
            update!(all_people[n])
            # Row index computed by hand from the (person, timestep) layout.
            results[(n-1) * timesteps + t, :person_age] = all_people[n].age
        end
    end
    results
end
julia> run_simulation(5,3)
15×3 DataFrames.DataFrame
│ Row │ person_id │ timestep │ person_age │
├─────┼───────────┼──────────┼────────────┤
│ 1   │ 1         │ 1        │ 36         │
│ 2   │ 1         │ 2        │ 37         │
│ 3   │ 1         │ 3        │ 38         │
│ 4   │ 1         │ 4        │ 39         │
│ 5   │ 1         │ 5        │ 40         │
│ 6   │ 2         │ 1        │ 66         │
│ 7   │ 2         │ 2        │ 67         │
│ 8   │ 2         │ 3        │ 68         │
│ 9   │ 2         │ 4        │ 69         │
│ 10  │ 2         │ 5        │ 70         │
│ 11  │ 3         │ 1        │ 78         │
│ 12  │ 3         │ 2        │ 79         │
│ 13  │ 3         │ 3        │ 80         │
│ 14  │ 3         │ 4        │ 81         │
│ 15  │ 3         │ 5        │ 82         │
So this works, but for more complicated examples doing the indexing by hand will be annoying. With Query.jl, DataFramesMeta, or a similar package one can slice more elegantly, but it is still a lot of work to express the slicing condition, which has to be constructed manually. It ends up being something like: where table.key_variable_one == this_element.key_variable_one && table.key_variable_two == this_element.key_variable_two, etc., set all the value variables in the table to this_element's value variables.
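To make the manual approach concrete, here is a minimal sketch of that key-matching update written generically, so the condition is built from a list of key names instead of being typed out each time. It uses only Base Julia, with a NamedTuple of vectors standing in for the table; `update_matching!`, `keycols`, and `valuecols` are names made up for this illustration, not part of any package.

```julia
# Hypothetical helper: overwrite the value columns of every row whose key
# columns match the corresponding fields of `elem`.
function update_matching!(table, elem, keycols, valuecols)
    # Start with every row selected, then narrow by one key column at a time.
    mask = trues(length(first(table)))
    for k in keycols
        mask .&= getproperty(table, k) .== getproperty(elem, k)
    end
    # Copy each value field of `elem` into the matching rows.
    for v in valuecols
        getproperty(table, v)[mask] .= getproperty(elem, v)
    end
    return table
end

# A small column table: a NamedTuple of equal-length vectors.
table = (person_id  = [1, 1, 2, 2],
         timestep   = [1, 2, 1, 2],
         person_age = zeros(Int, 4))

# The element to log: it carries both key fields and value fields.
elem = (person_id = 2, timestep = 1, person_age = 67)

update_matching!(table, elem, (:person_id, :timestep), (:person_age,))
```

The same boolean mask could be used to index a DataFrame, but the key and value column lists still have to be supplied by hand for every record type, which is the part I'd like a framework to handle.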
I would guess others have already encountered this problem, and that there is some package that lets you annotate a record type with macros marking which fields are key variables and which are value variables, and then automatically logs the variables in tabular form correctly?
A solution might look like:
@loggable mutable struct Person
    @idvar person_id::Int64
    @valuevar person_age::Int64
end
...
results = @logging DataFrame Person timestep=1:timesteps
for ...
    @logging for timestep ...
        person = all_people[n]
        update!(person)
        @log results person
    end
end
That is, the framework would know how to unpack the value variables of the Person object into a row of results matched by the key variables and the timestep, where results is set up to have one additional column to track, in this case, the time step as defined by the phrase timestep=1:timesteps.
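For comparison, the unpacking behaviour described here can be approximated without any macros. The following is only a rough sketch in Base Julia, not an existing package: `idvars`, `valuevars`, `logrow`, and `run_logged_simulation` are hypothetical names, the two trait functions stand in for the imagined @idvar / @valuevar annotations, and the starting ages are fixed rather than random so the result is easy to check.

```julia
mutable struct Person
    person_id::Int64
    person_age::Int64
end

# Hypothetical trait functions playing the role of @idvar / @valuevar.
idvars(::Type{Person}) = (:person_id,)
valuevars(::Type{Person}) = (:person_age,)

# Age the person by one step and return it so calls can be chained.
function update!(person::Person)
    person.person_age += 1
    return person
end

# Build one row: key fields, then extra context (e.g. timestep), then value fields.
function logrow(x; context...)
    T = typeof(x)
    ids  = NamedTuple{idvars(T)}(map(f -> getfield(x, f), idvars(T)))
    vals = NamedTuple{valuevars(T)}(map(f -> getfield(x, f), valuevars(T)))
    return merge(ids, (; context...), vals)
end

function run_logged_simulation(timesteps::Int, num_people::Int)
    # Deterministic starting ages (10, 20, ...) purely for illustration.
    people = [Person(i, 10i) for i in 1:num_people]
    # One row per (person, timestep); no manual row indexing is needed.
    return [logrow(update!(p); timestep = t) for p in people for t in 1:timesteps]
end

rows = run_logged_simulation(2, 3)
```

The result is a plain vector of named tuples, one per logged state, which can be handed to the DataFrame constructor to get a table. Rows are appended rather than matched against a preallocated table, so the key-matching step disappears, at the cost of not knowing the table size up front.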
This doesn't seem like all that original an idea, so I'd be surprised if someone hasn't already come up with a methodology for it?