[ANN] FlexiJoins.jl: fresh take on joining datasets

Update v0.1.28

A brief announcement of FlexiJoins updates since the last published version. The changes are relatively minor but can be important in certain cases.

  • Cleaned up dependencies, most notably removing Static.jl, which caused version conflicts with other packages.
  • Added an isapprox join predicate with an atol tolerance. It’s just a more convenient alternative to interval inclusion predicates.
  • DataFrames support: AbstractDataFrames are now accepted as well, as suggested by @bkamins.
  • Some performance optimizations.
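
To illustrate why the isapprox predicate is only a convenience over interval inclusion, here is a minimal sketch in plain Base Julia. It deliberately makes no FlexiJoins calls; the arrays `left` and `right` and the tolerance value are made up for the example. When a positive `atol` is passed, `isapprox` defaults `rtol` to zero, so the check reduces to `abs(a - b) <= atol`, which is the same condition as `a` lying in the closed interval `[b - atol, b + atol]` (up to floating-point rounding at the boundaries).

```julia
# Hypothetical sample data, chosen to be exactly representable in binary floating point.
atol = 0.5
left = [1.0, 2.0, 3.5]
right = [1.25, 4.0, 10.0]

# Pairs matched by approximate equality with an absolute tolerance.
approx_pairs = [(a, b) for a in left for b in right if isapprox(a, b; atol=atol)]

# The same pairs expressed as an interval inclusion condition on `a`.
interval_pairs = [(a, b) for a in left for b in right if b - atol <= a <= b + atol]

# Both formulations select the same pairs for this data.
@assert approx_pairs == interval_pairs
```

This brute-force comparison over all pairs only demonstrates the equivalence of the two conditions; it says nothing about how the package evaluates either predicate internally.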

Also:

  • New benchmarks prompted by discussion with @bkamins (Performance vs DataFrames.jl (#3) · Issues · Alexander Plavin / FlexiJoins.jl · GitLab) indicate that FlexiJoins isn’t always as performant as native DataFrames joins. The latter handle quite a lot of “special cases” explicitly in a more efficient way. I see no fundamental issues with implementing similar optimizations within the FlexiJoins framework and interface, but these aren’t in my plans for the near future.
  • Published two new companion packages with a similar design approach: FlexiGroups, a group-by for a wide range of datasets, and FlexiMaps, a set of extensions of the map function convenient for data manipulation.