FlexiJoins.jl is a fresh take on joining tabular or non-tabular datasets in Julia. Repository: Alexander Plavin / FlexiJoins.jl on GitLab.
From simple joins by key, to asof joins, to merging catalogs of terrestrial or celestial coordinates –
FlexiJoins supports any use case. The package is registered in
I’m not aware of any other similarly general implementation, either in Julia or in Python. At the same time, it’s only 366 lines of code!
Defining features that make the package flexible:
- Wide range of join conditions: by key (so-called equi-join), by distance, by predicate, the closest match (asof join)
- All kinds of joins, as in inner/left/right/outer
- Results can either be a flat list, or grouped by the left/right side
- Various dataset types transparently supported (not all
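
To make "flat" versus "grouped" and inner versus left concrete, here is a small sketch in plain Julia that does not use FlexiJoins at all. The `objects` and `measurements` vectors and their fields are made-up illustration data, and the comprehensions only mimic the idea of an equi-join by `name`; they are not how the package computes or returns its results.

```julia
# Hypothetical sample data: vectors of NamedTuples
objects = [(name="A", ref_time=2), (name="B", ref_time=-5), (name="D", ref_time=1)]
measurements = [(name="A", time=0), (name="A", time=3), (name="B", time=5), (name="C", time=4)]

# Flat inner equi-join: one entry per matching (object, measurement) pair
flat_inner = [(O=o, M=m) for o in objects for m in measurements if o.name == m.name]

# Left join grouped by the left side: one entry per object,
# holding all of its matches (possibly none)
grouped_left = [(O=o, M=[m for m in measurements if m.name == o.name]) for o in objects]

println(length(flat_inner))                    # 3
println([length(g.M) for g in grouped_left])   # [2, 1, 0]
```

Object `D` has no measurements, so it drops out of the inner join but stays in the grouped left join with an empty group.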
With all these features, FlexiJoins is designed to be easy-to-use and fast:
- Uniform interface to all functionality
- Performance close to other, less general, solutions: see benchmarks
- Extensible in terms of both new join conditions and more specialized algorithms
Usage examples showcasing main features:
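
    # inner join by the `name` field
    innerjoin((objects, measurements), by_key(:name))

    # left join by name, with results grouped by the left side (`O`)
    leftjoin((O=objects, M=measurements), by_key(x -> x.name); groupby=:O)

    # self-join: same name, and times within 3 of each other
    innerjoin((M1=measurements, M2=measurements), by_key(:name) & by_distance(:time, Euclidean(), <=(3)))

    # same name and ref_time < time, keeping only the closest measurement per object
    innerjoin(
        (O=objects, M=measurements),
        by_key(:name) & by_pred(:ref_time, <, :time);
        multi=(M=closest,)
    )
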
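
For readers unfamiliar with asof joins, the following plain-Julia sketch spells out what the last call is meant to express, as I read it: for each object, take the measurement with the same `name` whose `time` is after `ref_time` and nearest to it. The data and the helper `first_after` are hypothetical, and the exact tie-breaking and result type in FlexiJoins may differ from this loop.

```julia
# Hypothetical sample data: vectors of NamedTuples
objects = [(name="A", ref_time=2), (name="B", ref_time=-5), (name="D", ref_time=1)]
measurements = [(name="A", time=0), (name="A", time=3), (name="A", time=7),
                (name="B", time=5), (name="C", time=4)]

# Hypothetical helper: the same-name measurement with the smallest time
# strictly greater than o.ref_time, or `nothing` if there is none.
# Uses argmin(f, itr), which needs Julia 1.7 or newer.
function first_after(o, ms)
    candidates = [m for m in ms if m.name == o.name && o.ref_time < m.time]
    isempty(candidates) && return nothing
    return argmin(m -> m.time - o.ref_time, candidates)
end

# Inner asof join: objects without any candidate are dropped
asof_inner = []
for o in objects
    m = first_after(o, measurements)
    m === nothing || push!(asof_inner, (O=o, M=m))
end

println([(p.O.name, p.M.time) for p in asof_inner])   # [("A", 3), ("B", 5)]
```

This brute-force loop is quadratic in the input sizes and is only meant to pin down the semantics, not to suggest an efficient algorithm.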
Documentation with explanations and more examples is available as a Pluto notebook. Docstrings also exist, but are pretty minimal for now.
I’ve been building FlexiJoins piece by piece for some time, based on what I needed. The interface and underlying implementation have proven to be flexible and extensible enough, but comments and suggestions are welcome.