Sum statistic over a day

I have a DataFrame like this:

│ Row │ start               │ finish              │ cals    │ DOW   │
│     │ DateTime            │ DateTime            │ Float64 │ Int64 │
├─────┼─────────────────────┼─────────────────────┼─────────┼───────┤
│ 1   │ 2017-07-19T16:58:00 │ 2017-07-19T17:58:00 │ 0.073   │ 3     │
│ 2   │ 2017-07-19T18:08:00 │ 2017-07-19T18:09:00 │ 0.163   │ 3     │
│ 3   │ 2017-07-19T18:09:00 │ 2017-07-19T18:18:00 │ 0.057   │ 3     │
│ 4   │ 2017-07-19T18:18:00 │ 2017-07-19T18:19:00 │ 0.443   │ 3     │
│ 5   │ 2017-07-19T18:19:00 │ 2017-07-19T18:20:00 │ 1.086   │ 3     │

The :cals column holds calories burned, and the :DOW column holds the day of the week.
What I would like to do is sum the calories for each calendar day, and then group by day of the week.

After summing, the new DataFrame might look like this:

│ Row │ day        │ cals    │ DOW   │
│     │ DateTime   │ Float64 │ Int64 │
├─────┼────────────┼─────────┼───────┤
│ 1   │ 2017-07-19 │ 532     │ 3     │
│ 2   │ 2017-07-20 │ 234     │ 4     │
│ 3   │ 2017-07-21 │ 765     │ 5     │
│ 4   │ 2017-07-22 │ 123     │ 6     │
│ 5   │ 2017-07-23 │ 567     │ 7     │

However, I’m getting stuck on how to sum up the calories over each individual day.
Does anyone have any suggestions?

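using Dates, DataFrames, CategoricalArrays

# Two weeks of hourly records as stand-in data
data = DataFrame(start = DateTime(2000,1,1,0):Hour(1):DateTime(2000,1,14,23))
data.finish = data.start .+ Hour(1)
data.cals = 1:size(data, 1)
data.day = Date.(data.start)
# Total calories per calendar day
# (`by` is no longer available in current DataFrames.jl; groupby + combine replaces it)
output = combine(groupby(data, :day), :cals => sum => :cals)
# Label each day with its weekday name, then total per day of the week
output.dow = categorical(dayname.(dayofweek.(output.day)))
output = combine(groupby(output, :dow), :cals => sum => :cals)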

You can adjust the code to sum directly by day of week or just the calendar day.
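To make those two variations concrete, here is a minimal sketch, assuming a recent DataFrames.jl release where `groupby` and `combine` are available. The stand-in data and the names `per_day` and `per_dow` are made up for illustration; the `:DOW` column follows the numeric convention from the question (1 = Monday through 7 = Sunday, as returned by `dayofweek`).

```julia
using Dates, DataFrames

# Hypothetical stand-in data: one record per hour for two weeks
data = DataFrame(start = collect(DateTime(2000,1,1,0):Hour(1):DateTime(2000,1,14,23)))
data.cals = collect(1.0:nrow(data))

# Derive the grouping keys from the start timestamp
data.day = Date.(data.start)
data.DOW = dayofweek.(data.start)   # 1 = Monday, ..., 7 = Sunday

# One row per calendar day; DOW is constant within a day, so using it as a
# second key simply carries it through to the result
per_day = combine(groupby(data, [:day, :DOW]), :cals => sum => :cals)

# One row per day of the week, summed directly from the raw records
per_dow = combine(groupby(data, :DOW), :cals => sum => :cals)
```

`per_day` has the shape asked for in the question (day, cals, DOW), while `per_dow` skips the intermediate per-day table. The row order of grouped results is not something to rely on, so sort explicitly (for example with `sort!(per_dow, :DOW)`) if order matters.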


Ohhhh, this makes a lot of sense.

Thanks for the help!!