I have a dataframe formed by reading in a CSV file as df3 = CSV.read("data//weight_filtered.csv", DataFrame; dateformat="yyyy-mm-dd").
 Row │ Date        Data
     │ Date        Float32
─────┼─────────────────────
   1 │ 2022-07-01    201.1
   2 │ 2022-07-08    202.1
   3 │ 2022-07-15    203.0
   4 │ 2022-07-22    200.0
   5 │ 2022-07-29    200.8
   6 │ 2022-07-05    202.8
This yielded the dataframe above. I would like to filter this by dates less than, say, 2022-07-22. Having floundered around with different searches and forums, I haven’t had any luck or managed to find a way myself, apart from editing the original CSV file. I’m sure I’m missing something simple, but any help would be welcome.
Julia 1.9.3 with CSV v0.10.11
Try
filter(df3) do row
    row.Date >= Date("2022-07-22")
end
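The snippet above keeps rows on or after the cutoff. Since the question asks for dates less than 2022-07-22, here is a minimal sketch with the comparison flipped. The sample data is copied from the question; the names cutoff and earlier are only illustrative, not from the original posts.

```julia
using DataFrames, Dates

# Sample data copied from the question (Float32 column type kept).
df3 = DataFrame(
    Date = Date.(["2022-07-01", "2022-07-08", "2022-07-15",
                  "2022-07-22", "2022-07-29", "2022-07-05"]),
    Data = Float32[201.1, 202.1, 203.0, 200.0, 200.8, 202.8],
)

# Hypothetical name for the threshold date.
cutoff = Date(2022, 7, 22)

# Keep only rows dated strictly before the cutoff.
# filter returns a new DataFrame and leaves df3 untouched.
earlier = filter(row -> row.Date < cutoff, df3)
```

Swapping < for <= would also keep the row dated exactly 2022-07-22.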
julia> df
6×2 DataFrame
 Row │ date        data
     │ Date        Float64
─────┼─────────────────────
   1 │ 2022-07-01    201.1
   2 │ 2022-07-08    202.1
   3 │ 2022-07-15    203.0
   4 │ 2022-07-22    200.0
   5 │ 2022-07-29    200.8
   6 │ 2022-07-05    202.8
julia> df[df.date .>= df[4, 1], :]
2×2 DataFrame
 Row │ date        data
     │ Date        Float64
─────┼─────────────────────
   1 │ 2022-07-22    200.0
   2 │ 2022-07-29    200.8
subset(df,:date=>x-> x .>= Date("2022-07-22"))
subset(df3, "Date" => ByRow(<=(Date(2022,7,22))))
I think DataFramesMeta.jl also has some helper tools to remove the ByRow call.
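As a rough sketch of that idea: DataFramesMeta.jl provides the @rsubset macro, which applies the condition row by row so no explicit ByRow wrapper is needed. The small df3 below is made-up stand-in data with a Date column, and result is an illustrative name; this assumes DataFramesMeta.jl is installed.

```julia
using DataFrames, DataFramesMeta, Dates

# Stand-in data: weekly dates from 2022-07-01 to 2022-07-29 (not the original CSV).
df3 = DataFrame(Date = Date(2022, 7, 1):Day(7):Date(2022, 7, 29))

# Row-wise condition; intended to match subset(df3, "Date" => ByRow(<=(Date(2022, 7, 22)))).
result = @rsubset(df3, :Date <= Date(2022, 7, 22))
```

Inside the macro, :Date refers to the column, while Date(2022, 7, 22) is still the ordinary Dates constructor.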
Here’s how I’d use filter! (the ! indicates it modifies your existing DataFrame):
julia> using DataFrames, Dates
julia> df = DataFrame(x = Date(2000):Day(1):Date(2000, 1, 10))
10×1 DataFrame
 Row │ x
     │ Date
─────┼────────────
   1 │ 2000-01-01
   2 │ 2000-01-02
   3 │ 2000-01-03
   4 │ 2000-01-04
   5 │ 2000-01-05
   6 │ 2000-01-06
   7 │ 2000-01-07
   8 │ 2000-01-08
   9 │ 2000-01-09
  10 │ 2000-01-10
julia> filter!(:x => >(Date(2000, 1, 5)), df)
5×1 DataFrame
 Row │ x
     │ Date
─────┼────────────
   1 │ 2000-01-06
   2 │ 2000-01-07
   3 │ 2000-01-08
   4 │ 2000-01-09
   5 │ 2000-01-10
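To make the difference between the mutating and non-mutating forms concrete, here is a small sketch using the same toy data. The names kept and rows_after_filter are only illustrative.

```julia
using DataFrames, Dates

df = DataFrame(x = Date(2000):Day(1):Date(2000, 1, 10))

# Non-mutating: returns a new DataFrame, df keeps all of its rows.
kept = filter(:x => >(Date(2000, 1, 5)), df)
rows_after_filter = nrow(df)

# Mutating: removes the non-matching rows from df itself.
filter!(:x => >(Date(2000, 1, 5)), df)
```

If the original data is still needed later, the non-mutating filter is the safer choice.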
Hi Nils, Thanks for the prompt reply which works beautifully. Jeff
Hi Tyler, Thanks for the prompt reply and solution, Jeff
Hi rocco, Thanks for the solution, Jeff
Hi Florian, Thanks for the speedy solution, Jeff