I have the following structure:
- A `DataFrame` containing parameters and results from numeric simulations.
- A list of `Dict`s, where each `Dict` contains keys equivalent to some of the columns of the `DataFrame` and the corresponding values.
- A mapping of sorts that assigns some `result` to each case in the list of `Dict`s. This mapping is hard-coded, so I manually assign a value for each case:

    using DataFrames

    df = DataFrame(:a => [1,2,3,1], :b => [5,6,7,8], :c => [9,10,11,9])
    cases = [Dict(:a => 1, :c => 9), Dict(:a => 3, :c => 11)]
    mapping = Dict(case => value for (case, value) in zip(cases, ["result1", "result2"]))

What I would like to have in the end is:

    df_final = DataFrame(:a => [1,2,3,1], :b => [5,6,7,8], :c => [9,10,11,9],
                         :result => ["result1", missing, "result2", "result1"])

    4×4 DataFrame
     Row │ a      b      c      result
         │ Int64  Int64  Int64  String?
    ─────┼──────────────────────────────
       1 │     1      5      9  result1
       2 │     2      6     10  missing
       3 │     3      7     11  result2
       4 │     1      8      9  result1

So essentially I would like to loop through the list of cases, find all rows of `df` where all parameters are identical to the current case (this will be several rows), and then set the value of the `result` column accordingly. It would be nice if I didn't have to explicitly spell out every parameter to check.
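
To make the intended operation concrete, here is a minimal sketch of that loop that reads the parameters to compare from each case's keys. The helper name `matches_case` is hypothetical, and the sketch assumes the parameter columns contain no `missing` values (a comparison with `missing` would not yield a plain `Bool`).

```julia
using DataFrames

df = DataFrame(:a => [1,2,3,1], :b => [5,6,7,8], :c => [9,10,11,9])
cases = [Dict(:a => 1, :c => 9), Dict(:a => 3, :c => 11)]
mapping = Dict(case => value for (case, value) in zip(cases, ["result1", "result2"]))

# Hypothetical helper: true if the row agrees with the case on every key of the case.
# Assumes the compared columns contain no `missing` values.
matches_case(row, case) = all(row[k] == v for (k, v) in case)

# Start with an all-missing column that can also hold strings.
df.result = Vector{Union{Missing,String}}(missing, nrow(df))

for (case, value) in mapping
    # Boolean mask over the rows of df that match the current case.
    mask = [matches_case(row, case) for row in eachrow(df)]
    df.result[mask] .= value
end
```

Rows that match no case keep `missing`, and no column name has to be written out by hand.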
A very ugly hack to achieve this would be

    for (case, value) in mapping
        # rows where every key of the case matches the corresponding column
        rows = vec(all(hcat((df[!, k] .== case[k] for k in keys(case))...), dims=2))
        @view(df[rows, :]).result .= value
    end

but I would much rather get there with
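I could also try to create a `GroupedDataFrame` first, based on the keys of the cases:

    # keys(cases) would return the indices of the vector, so take the keys of one case;
    # this assumes every case uses the same keys
    gdf = groupby(df, collect(keys(first(cases))))
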
But then I still have to map each group to its corresponding `result`, if it exists, which runs in the opposite direction to the loop over cases.
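
For what it's worth, the grouped approach does not necessarily require iterating over the groups: a `GroupedDataFrame` can be indexed by a tuple of grouping-column values, so the loop can still run over the cases. The following is only a sketch under the assumption that all cases share the same keys; the variable names `cols`, `key`, and `group` are illustrative.

```julia
using DataFrames

df = DataFrame(:a => [1,2,3,1], :b => [5,6,7,8], :c => [9,10,11,9])
cases = [Dict(:a => 1, :c => 9), Dict(:a => 3, :c => 11)]
mapping = Dict(case => value for (case, value) in zip(cases, ["result1", "result2"]))

# Assumption: every case uses the same set of keys.
cols = collect(keys(first(cases)))

# Create the target column before grouping; unmatched rows stay missing.
df.result = Vector{Union{Missing,String}}(missing, nrow(df))
gdf = groupby(df, cols)

for (case, value) in mapping
    # Case values arranged in the same order as the grouping columns.
    key = Tuple(case[k] for k in cols)
    if haskey(gdf, key)
        group = gdf[key]            # a view into the rows of df for this group
        group[:, :result] .= value  # writes through to the parent df
    end
end
```

Since each group is a view of `df`, the assignment updates `df` in place, and cases without any matching rows are skipped by the `haskey` check.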
Does anyone have advice on how to best achieve this?