df2 = @where(df1, yearmonth.(:date_column) in yms)
This gives me some error about booleans. I've tried other syntaxes (not using DataFramesMeta), ANDed greater-than/less-than comparisons, and so on, to no avail. Just a lot of opaque-to-me errors.
My next stop is to make a column of booleans to filter on, but before then I wanted to get some input. I'm finding the DateTime thing generally kind of awkward, with a lot of things like averaging or plotting durations requiring extra work.
Finally, I am using timeseries, but for these parts of the process I don't have a homogeneous (single-type) array, which timeseries requires.
julia> using DataFrames, Dates
julia> dates = [Date(2019,4),Date(2019,5),Date(2019,6)];
julia> yms = [yearmonth(d) for d in dates];
julia> df = DataFrame(a = rand(4), b = [Date(2019,7), dates...]);
julia> df[in(yms).(yearmonth.(df.b)), :]
3×2 DataFrame
│ Row │ a         │ b          │
│     │ Float64   │ Date       │
├─────┼───────────┼────────────┤
│ 1   │ 0.896528  │ 2019-04-01 │
│ 2   │ 0.711173  │ 2019-05-01 │
│ 3   │ 0.0872949 │ 2019-06-01 │
To explain (as I've been stumped by this and the documentation isn't great): calling in(yms) creates a Fix2 function object, which can then be called with a single argument to check whether that argument is in yms.
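As a rough illustration of the single-argument form of in, here is a minimal sketch that only needs the Dates standard library. The name is_in_yms is made up for this example, and yms is rebuilt here so the snippet runs on its own:

```julia
using Dates

# Same (year, month) tuples as in the REPL session above: April to June 2019.
yms = [yearmonth(Date(2019, m)) for m in 4:6]

# in(yms) fixes the second argument of `in`, giving a one-argument function.
is_in_yms = in(yms)          # hypothetical name, for illustration only

is_in_yms((2019, 5))         # true: May 2019 is in yms
is_in_yms((2019, 7))         # false: July 2019 is not
```

So is_in_yms(x) behaves like x in yms, which is what makes it convenient to broadcast.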
This Fix2 object is then broadcast over column df.b, which has itself been transformed by a broadcasted call to yearmonth; the result is a BitArray that serves as your boolean mask for row selection.
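To make the mask itself visible, here is a small sketch using a plain vector b in place of the column df.b (so it does not need DataFrames). The variable names mask and whole_vector_check are just for this example:

```julia
using Dates

dates = [Date(2019, 4), Date(2019, 5), Date(2019, 6)]
yms = yearmonth.(dates)

# Stand-in for df.b: one date outside yms, followed by three inside it.
b = [Date(2019, 7); dates]

# Broadcast yearmonth over the dates, then broadcast the Fix2 over the result.
mask = in(yms).(yearmonth.(b))      # one Bool per element of b

# Without the broadcast, `in` asks whether the whole vector is an element of yms.
whole_vector_check = yearmonth.(b) in yms   # a single Bool, not a mask
```

I suspect the second form is roughly what the original attempt ran into: a single Bool comes back where a per-row mask is expected, which would fit an error about booleans.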
Is it worth adding this somewhere to the docs or your tutorial? Apologies if I've missed it, but this was a question I asked a while back myself and I feel like I've answered a few similar ones here and on Slack since.