# Counting Value Occurences in Time Range in DataFrames

Hello everyone,

say, I construct a data frame like this:

``````using Dates
using DataFrames

datevec = collect(Date(2020, 02, 22):Day(1):Date(2020, 03, 03))
valuevec = ["A", "B", "A", "A", "C", "C", "A", "C", "A", "A", "B"]
df = DataFrame(d=datevec, v=valuevec)
``````

What I would like to do now is to count how often “A”, “B” and “C” occur each month. Is there any elegant way to do this?

Thomas

Here’s a solution:

``````# Add a column for the month
df.m = month.(df.d)

combine(groupby(df, [:m, :v]), nrow)
``````

or without changing the original dataframe:

``````df2 = transform(df, :d => ByRow(month) => :m)

combine(groupby(df2, [:m, :v]), nrow)
``````

Result:

``````5×3 DataFrame
Row │ m      v       nrow
│ Int64  String  Int64
─────┼──────────────────────
1 │     2  A           4
2 │     2  B           1
3 │     2  C           3
4 │     3  A           2
5 │     3  B           1
``````
1 Like

If you want to preserve dates, you can use `trunc` instead of `month`

``````julia> df2 = transform(df, :d => (x -> trunc.(x, Dates.Month)) => :m);

julia> combine(groupby(df2, [:m, :v]), nrow)
5×3 DataFrame
Row │ m           v       nrow
│ Date        String  Int64
─────┼───────────────────────────
1 │ 2020-02-01  A           4
2 │ 2020-02-01  B           1
3 │ 2020-02-01  C           3
4 │ 2020-03-01  A           2
5 │ 2020-03-01  B           1
``````
1 Like

Hello everyone and thanks for your advice. It now works as intented!