The expression has only one placeholder ‘?’, but two parameters ‘startdate’ and ‘enddate’, of which only ‘startdate’ will be used. So it should return rows with the start date if there are any. Try
SQLWhereExpression(“date >= ?1 AND date <= ?2”, startdate, enddate),
ok, so I just threw it in TidierDB since I was having some issues setting up the above method, but I was able to correctly filter by dates with sql.
DBInterface.execute(db, "select user_id, category, date from read_csv(
'https://raw.githubusercontent.com/amhg/parentModule/refs/heads/main/output2.csv',
ignore_errors=true) where date >= '2024-10-03' AND date <= '2024-10-27'") |>DataFrame
#returns 13 rows shown below
Another option (with the code and sql output are below via duckdb) is to use STRPTIME to explicitly convert… ymd converts the sql to STRPTIME('2024-10-03', '%Y-%m-%d') which might be the solution to your output issue. The STRPTIME conversion is not necessary in TidierDB either, so i am not sure why the issue is arising for you. The CSV does appear to be a bit buggy to read as i had to use ignore_errors=true
using TidierDB
db = connect(duckdb())
data = db_table(db, "read_csv(
'https://raw.githubusercontent.com/amhg/parentModule/refs/heads/main/output2.csv',
ignore_errors=true
)")
@chain t(data) begin
@select(user_id, category, date)
@filter(date >= ymd("2024-10-03"))
@arrange(date)
@aside @show_query _
@collect
end
WITH cte_1 AS (
SELECT *
FROM read_csv(
'https://raw.githubusercontent.com/amhg/parentModule/refs/heads/main/output2.csv',
delim=',',
header=true,
ignore_errors=true
)
WHERE date >= STRPTIME('2024-10-03', '%Y-%m-%d'))
SELECT user_id, category, date
FROM cte_1
ORDER BY date ASC
20×3 DataFrame
Row │ user_id category date
│ Int64 String Date
─────┼─────────────────────────────────
1 │ 3 personal 2024-10-05
2 │ 3 hobby 2024-10-06
3 │ 4 work 2024-10-06
4 │ 4 shopping 2024-10-06
5 │ 3 learning 2024-10-09
6 │ 3 work 2024-10-10
7 │ 3 learning 2024-10-16
8 │ 3 work 2024-10-16
9 │ 3 other 2024-10-17
10 │ 4 accounting 2024-10-18
11 │ 4 personal 2024-10-21
12 │ 3 accounting 2024-10-21
13 │ 3 family 2024-10-22
14 │ 3 work 2024-10-28
15 │ 3 other 2024-10-30
16 │ 4 learning 2024-10-30
17 │ 3 hobby 2024-10-31
18 │ 3 errands 2024-10-31
19 │ 3 personal 2024-11-02
20 │ 3 shopping 2024-11-06
@chain t(data) begin
@select(user_id, category, date)
@filter(date >= ymd("2024-10-03") && date <= ymd("2024-10-27"))
@arrange(date)
@aside @show_query _
@collect
end
13×3 DataFrame
Row │ user_id category date
│ Int64 String Date
─────┼─────────────────────────────────
1 │ 3 personal 2024-10-05
2 │ 3 hobby 2024-10-06
3 │ 4 work 2024-10-06
4 │ 4 shopping 2024-10-06
5 │ 3 learning 2024-10-09
6 │ 3 work 2024-10-10
7 │ 3 learning 2024-10-16
8 │ 3 work 2024-10-16
9 │ 3 other 2024-10-17
10 │ 4 accounting 2024-10-18
11 │ 4 personal 2024-10-21
12 │ 3 accounting 2024-10-21
13 │ 3 family 2024-10-22
may I ask you how do you specify the SQL Driver when executing the command:
DBInterface.execute(db, "select user_id, category, date from read_csv(
'https://raw.githubusercontent.com/amhg/parentModule/refs/heads/main/output2.csv',
ignore_errors=true) where date >= '2024-10-03' AND date <= '2024-10-27'") |>DataFrame