[ANN] FlexiJoins.jl: fresh take on joining datasets

aplavin · October 20, 2022, 12:52pm

Update v0.1.28

A brief announcement of FlexiJoins updates since the last published version. The changes are relatively minor but can be important in certain cases.

Cleaned up dependencies, most notably removing Static.jl, which caused version conflicts with other packages.
Added an ≈ (isapprox) join predicate with an atol tolerance. It's just a more convenient alternative to interval inclusion predicates.
DataFrames support: AbstractDataFrames are now accepted as well, as suggested by @bkamins.
Some performance optimizations.
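
To illustrate why the ≈ predicate is only a convenience over interval inclusion, here is a minimal sketch in plain Julia. It deliberately does not call FlexiJoins: the `naive_join` helper, the sample tables, and the `atol` value are all made up for illustration, and the nested loop stands in for whatever the package does internally.

```julia
# Plain-Julia illustration only; `naive_join` and the data are hypothetical,
# not part of the FlexiJoins API.
left  = [(name = "a", t = 1.0), (name = "b", t = 2.25), (name = "c", t = 7.0)]
right = [(id = 1, t = 1.2), (id = 2, t = 2.0), (id = 3, t = 9.0)]
atol  = 0.5

# Naive nested-loop inner join: keep every (l, r) pair for which `pred` holds.
naive_join(pred, L, R) = [(l, r) for l in L for r in R if pred(l, r)]

# Approximate-equality condition with an absolute tolerance
# (with atol > 0, isapprox uses a default rtol of 0).
approx_pairs = naive_join((l, r) -> isapprox(l.t, r.t; atol), left, right)

# The same condition spelled out as interval inclusion:
# l.t must lie within [r.t - atol, r.t + atol].
interval_pairs = naive_join((l, r) -> r.t - atol <= l.t <= r.t + atol, left, right)

# Both formulations select the same pairs.
@assert approx_pairs == interval_pairs
```

The isapprox form states the intent directly, while the interval form needs the bounds constructed by hand for each row.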

Also:

New benchmarks prompted by a discussion with @bkamins (Performance vs DataFrames.jl (#3) · Issues · Alexander Plavin / FlexiJoins.jl · GitLab) indicate that FlexiJoins isn't always as performant as the native DataFrames joins. The latter handle quite a lot of "special cases" explicitly and more efficiently. I see no fundamental obstacles to implementing similar optimizations within the FlexiJoins framework and interface, but they aren't in my plans for the near future.
Published two new companion packages with a similar design approach: FlexiGroups, a group-by for a wide range of datasets, and FlexiMaps, extensions of the map function that are convenient for data manipulation.
