Let’s say I’ve got this dataset, where I’m interested in Weight Gain (WG) as a function of calories consumed at Breakfast (B), Lunch (L), and Dinner (D):
4×6 DataFrame
 Row │ Participant  Day    B      L      D      WG
     │ String       Int64  Int64  Int64  Int64  Float64
─────┼──────────────────────────────────────────────────
   1 │ Bob              1     65    231    345      0.5
   2 │ Bob              2    300    777    674     -0.3
   3 │ Mary             1    100    856    321      2.0
   4 │ Mary             2    555    845    656      1.0
(etc)
I know (for the sake of example) that calories consumed at breakfast might be more readily absorbed than those consumed at lunch or dinner, and that this effect is not participant-specific. I also know that each Participant may have a different general absorption rate a_p, which affects all intakes. So I’d like to fit something like this:
WG = a_p * (b * B + l * L + d * D)
(maybe with an intercept, but that looks mathematically irrelevant?)
For N participants, that is N + 3 coefficients to fit. How should I fit this in Julia / GLM? Since the equation is linear in a_p when b, l, d are held constant, I can fit a_p. Likewise, I can fit b, l, d holding a_p constant. So unless I’m mistaken, I could fit these two coefficient sets alternately until convergence. Doing this with the \ operator looks like a hassle. @formula looks appealing, but I’m a total newbie with GLM.jl, and a bit overwhelmed. I can see how I could build a dataframe where I’ve already done the a_p * B multiplication in order to fit b, l, d, but ugh, is there anything simpler? Any help / pointer / @formula code appreciated.
(Side note: I know that this problem is underspecified, so throw in a constraint that b+l+d=1 if that’s better)
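
For concreteness, here is a rough sketch of the alternating fit described above, written with plain LinearAlgebra rather than GLM.jl. The function name fit_alternating, the integer participant index pid, and the iters / tol defaults are all made up for illustration. Imposing b+l+d=1 by simply rescaling after each step is an assumption on my part, not necessarily the right way to handle the constraint.

```julia
using LinearAlgebra  # stdlib: dot, norm, and least-squares methods for `\`

# Hypothetical helper (not part of GLM.jl): alternating least squares for
#   WG = a_p * (b*B + l*L + d*D)
# X   : n×3 matrix with columns B, L, D
# wg  : length-n vector of weight gains
# pid : length-n vector of participant indices in 1:N
function fit_alternating(X, wg, pid; iters = 200, tol = 1e-10)
    N = maximum(pid)
    a = ones(N)          # starting guess for the absorption rates a_p
    c = fill(1 / 3, 3)   # starting guess for (b, l, d)
    for _ in 1:iters
        c_old = copy(c)
        # Hold a_p fixed: scale each row of X by its a_p, then solve for (b, l, d).
        c = (a[pid] .* X) \ wg
        # Assumed normalization: rescale so that b + l + d = 1 (breaks if the sum is 0).
        c ./= sum(c)
        # Hold (b, l, d) fixed: each a_p is a one-parameter least-squares fit.
        z = X * c
        for p in 1:N
            rows = pid .== p
            a[p] = dot(z[rows], wg[rows]) / dot(z[rows], z[rows])
        end
        # Stop once (b, l, d) no longer changes between iterations.
        norm(c - c_old) < tol && break
    end
    return a, c
end

# The four rows shown above (Bob = 1, Mary = 2); far too few for a meaningful fit.
X   = Float64[65 231 345; 300 777 674; 100 856 321; 555 845 656]
wg  = [0.5, -0.3, 2.0, 1.0]
pid = [1, 1, 2, 2]
a, c = fit_alternating(X, wg, pid)
```

Because the rescaling of (b, l, d) is absorbed by the a_p refit in the same iteration, the squared error should not increase from one full iteration to the next, but I have not checked how quickly (or whether) this converges on real data.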