FlexiJoins.jl is a fresh take on joining tabular or non-tabular datasets in Julia: Alexander Plavin / FlexiJoins.jl · GitLab.
From simple joins by key, to asof joins, to merging catalogs of terrestrial or celestial coordinates: FlexiJoins handles all of these use cases. The package is registered in General.
I’m not aware of any other similarly general implementation, either in Julia or in Python. At the same time, it’s only 366 lines of code!
Defining features that make the package flexible:
- Wide range of join conditions: by key (so-called equi-join), by distance, by predicate, the closest match (asof join)
- All kinds of joins, as in inner/left/right/outer
- Results can either be a flat list, or grouped by the left/right side
- Various dataset types transparently supported (not all Tables work, though)
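To make the join conditions listed above concrete, here is a minimal sketch in plain Julia of what each one means. It is a naive nested-loop illustration of the semantics only, not how FlexiJoins implements them; the helper `naive_innerjoin` and the sample data are hypothetical.

```julia
# Naive O(n*m) inner join: keep every (left, right) pair for which `cond` holds.
# Hypothetical helper for illustration; not part of FlexiJoins.
naive_innerjoin(cond, left, right) =
    [(l, r) for l in left for r in right if cond(l, r)]

# Made-up sample data: plain vectors of NamedTuples.
objects = [(name="A", ref_time=2), (name="B", ref_time=5)]
measurements = [(name="A", time=0), (name="A", time=4),
                (name="B", time=1), (name="C", time=3)]

# By key (equi-join): the names must be equal.
by_name = naive_innerjoin((o, m) -> o.name == m.name, objects, measurements)

# By distance: measurement time within 3 units of the reference time.
nearby = naive_innerjoin((o, m) -> abs(o.ref_time - m.time) <= 3, objects, measurements)

# By predicate: reference time strictly before the measurement time.
later = naive_innerjoin((o, m) -> o.ref_time < m.time, objects, measurements)

# Closest match (asof-like): for each object, the single nearest measurement in time.
closest_match = [
    (o, measurements[argmin([abs(o.ref_time - m.time) for m in measurements])])
    for o in objects
]
```

A dedicated join package avoids the quadratic loop, for example by hashing keys or sorting, while the matching pairs stay the same.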
With all these features, FlexiJoins is designed to be easy-to-use and fast:
- Uniform interface to all functionality
- Performance close to other, less general, solutions: see benchmarks
- Extensible in terms of both new join conditions and more specialized algorithms
Usage examples showcasing main features:
innerjoin((objects, measurements), by_key(:name))
leftjoin((O=objects, M=measurements), by_key(x -> x.name); groupby=:O)
innerjoin((M1=measurements, M2=measurements), by_key(:name) & by_distance(:time, Euclidean(), <=(3)))
innerjoin(
    (O=objects, M=measurements),
    by_key(:name) & by_pred(:ref_time, <, :time);
    multi=(M=closest,)
)
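The snippets above assume that `objects` and `measurements` already exist. A minimal sketch of a setup for trying the first two calls could look like this, assuming FlexiJoins is installed and that plain vectors of NamedTuples are among the supported dataset types; the sample values are made up.

```julia
using FlexiJoins  # assumes the package has been added from the General registry

# Made-up sample data with the fields used in the examples above.
objects = [(name="A", ref_time=2), (name="B", ref_time=-5), (name="D", ref_time=1)]
measurements = [(name="A", time=0), (name="A", time=1),
                (name="B", time=2), (name="C", time=3)]

# Flat result: one entry per matching (object, measurement) pair.
flat = innerjoin((objects, measurements), by_key(:name))

# Grouped result: matches collected per element of the left side, named O here.
grouped = leftjoin((O=objects, M=measurements), by_key(x -> x.name); groupby=:O)

foreach(println, flat)
foreach(println, grouped)
```

The `by_distance` example additionally needs `Euclidean` in scope, which is defined in Distances.jl.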
Documentation with explanations and more examples is available as a Pluto notebook. Docstrings also exist, but are pretty minimal for now.
I’ve been building FlexiJoins piece by piece for some time, based on what I needed. The interface and underlying implementation have proven to be flexible and extensible enough, but comments and suggestions are welcome.