Awesome, thanks so much, this is super helpful!
So if I wanted to keep it in a DataFrame format, my only option would be to create an additional column, e.g.
df.yearmonth = yearmonth.(df.date)
or
transform!(df, :date => (x -> yearmonth.(x)) => :yearmonth)
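For anyone following along, here is a minimal, self-contained sketch of both options. The date range is taken from the output further down in this post; the `ByRow` variant and the column name `yearmonth_byrow` are my own additions for comparison, not something suggested earlier in the thread.

```julia
using DataFrames, Dates

# 12 consecutive days, matching the dates shown in the output in this post
df = DataFrame(date = collect(Date(2024, 3, 25):Day(1):Date(2024, 4, 5)))

# Option 1: broadcast yearmonth over the column and assign the result
df.yearmonth = yearmonth.(df.date)

# Option 2: transform! with a parenthesised anonymous function;
# the function receives the whole :date column, so it broadcasts inside
transform!(df, :date => (x -> yearmonth.(x)) => :yearmonth)

# Option 3 (assumed alternative): ByRow applies yearmonth to each element,
# which avoids writing the anonymous function at all
transform!(df, :date => ByRow(yearmonth) => :yearmonth_byrow)
```

All three should give the same tuples per row; the `ByRow` form also sidesteps the parenthesisation question entirely because there is no `->` in the expression.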
Just as a quick follow-up, if you have a moment: I believe I get a different result without the ( ) in the expression, i.e.
transform!(df, :date => x -> yearmonth.(x) => :yearmonth)
My interpretation is that without the ( ), Julia takes the entire :date column as x for each row of the DataFrame. Hence, with the ( ):
julia> transform!(df, :date => (x -> yearmonth.(x)) => :yearmonth)
12×2 DataFrame
 Row │ date        yearmonth
     │ Date        Tuple…
─────┼───────────────────────
   1 │ 2024-03-25  (2024, 3)
   2 │ 2024-03-26  (2024, 3)
   3 │ 2024-03-27  (2024, 3)
   4 │ 2024-03-28  (2024, 3)
   5 │ 2024-03-29  (2024, 3)
   6 │ 2024-03-30  (2024, 3)
   7 │ 2024-03-31  (2024, 3)
   8 │ 2024-04-01  (2024, 4)
   9 │ 2024-04-02  (2024, 4)
  10 │ 2024-04-03  (2024, 4)
  11 │ 2024-04-04  (2024, 4)
  12 │ 2024-04-05  (2024, 4)
whereas without the ( )
julia> transform!(df, :date => x -> yearmonth.(x) => :yearmonth)
12×2 DataFrame
 Row │ date        date_function
     │ Date        Pair…
─────┼───────────────────────────────────────────────
   1 │ 2024-03-25  [(2024, 3), (2024, 3), (2024, 3)…
   2 │ 2024-03-26  [(2024, 3), (2024, 3), (2024, 3)…
   3 │ 2024-03-27  [(2024, 3), (2024, 3), (2024, 3)…
   4 │ 2024-03-28  [(2024, 3), (2024, 3), (2024, 3)…
   5 │ 2024-03-29  [(2024, 3), (2024, 3), (2024, 3)…
   6 │ 2024-03-30  [(2024, 3), (2024, 3), (2024, 3)…
   7 │ 2024-03-31  [(2024, 3), (2024, 3), (2024, 3)…
   8 │ 2024-04-01  [(2024, 3), (2024, 3), (2024, 3)…
   9 │ 2024-04-02  [(2024, 3), (2024, 3), (2024, 3)…
  10 │ 2024-04-03  [(2024, 3), (2024, 3), (2024, 3)…
  11 │ 2024-04-04  [(2024, 3), (2024, 3), (2024, 3)…
  12 │ 2024-04-05  [(2024, 3), (2024, 3), (2024, 3)…
Assuming my interpretation is correct (please feel free to correct me if I'm wrong), I'm still a little confused as to why date_function is assigned as the name of the new column when the ( ) are removed.
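In case it helps narrow things down, below is a small sketch I put together to inspect the two expressions outside of DataFrames, using only `Dates` and plain `Pair`s. The variable names (`with_parens`, `without_parens`, `result`) are my own, and I'm only guessing that this grouping is what lies behind the `Pair…` element type and the auto-generated column name.

```julia
using Dates

dates = collect(Date(2024, 3, 25):Day(1):Date(2024, 4, 5))

# With parentheses the anonymous function ends at the closing paren,
# so the expression is source => (function => target_name)
with_parens = :date => (x -> yearmonth.(x)) => :yearmonth

# Without parentheses the body of the anonymous function runs to the end,
# so the expression is source => function, and the function returns a Pair
without_parens = :date => x -> yearmonth.(x) => :yearmonth

println(with_parens.second isa Pair)         # function => :yearmonth
println(without_parens.second isa Function)  # no target name in the outer Pair

# Calling the unparenthesised function on the whole column returns one Pair:
# (vector of tuples) => :yearmonth
result = without_parens.second(dates)
println(result isa Pair)
println(result.second)
```

If this reading is right, the second form never hands `transform!` a target name at all, which would at least be consistent with a default name being built from the source column and the function.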