Conditional left join 2 dataframes when none of the columns are common

Nice catch! I only used closed intervals myself and didn’t even think of testing with open/halfopen. Indeed, they don’t work yet, but easy to add - will likely do it in a few days.

Please post when your code has been updated. I like the flexijoin solution the best so far and don’t want to retort to other solutions.

In the meantime, you may want to try this expression

leftjoin((tbl1, tbl2), by_pred(:id, ∈, r -> Interval(r.id1, r.id2-eps(r.id2/1.))))

# or if r.id2 are integers

leftjoin((tbl1, tbl2), by_pred(:id, ∈, r -> Interval(r.id1, r.id2-1)))


My actual data are ZonedDateTime in miliseconds and moving them back and forth by a bit can screw things up. I doubt it will be correct.

So (apart from units of measure) your numbers are integers !?
therefore {x € Z: x <n} = {x € Z: x <= n-1}, or not?

I understand but I am not willing to take that chance because I have been burnt in the past with ZonedDateTimes in c# and R and other languages as well where subtracting TimeSpan from a date which would lead to the time being after daylight savings was applied v/s before giving me incorrect answers. The corner cases are quite a lot and I do not wish to introduce another set of issues when dealing with a big data set containing millions/billions of rows.

here is an example in c# that has burnt me badly in the past: c# - Why doesn’t subtracting two local DateTime values appear to account for Daylight Saving Time? - Stack Overflow

You would think the DateTime object in c# would understand time zones when doing math but that’s not the case.

not worth the risk for me to try and subtract and then spend days figuring how did I screw things up

@vtomar: added in FlexiJoins@0.1.4, this version should already be available for installation.

1 Like