Direct interface to Polars Rust library

dmbates · May 23, 2023, 2:38pm

I was thinking of the benchmarks mentioned in recent posts on https://discourse.julialang.org/t/the-state-of-dataframes-jl-h2o-benchmark/ when I mentioned DuckDB. The latest version of the benchmarks was run by the DuckDB folks to show the speed of their most recent release. The results also show that Polars is very fast on these benchmarks. I mentioned DuckDB only to provide motivation, through their benchmarks, for interfacing to Polars.

It is certainly possible to use PyCall or PythonCall and the Python bindings for Polars to do the types of summaries shown in the benchmarks, but I don’t know whether that route causes large objects to be copied. I believe that the existing Arrow.Table and Arrow.write functions in Arrow.jl can provide zero-copy exchange of Arrow tables between Julia and Rust, but I don’t know much about the Rust end of that.
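To make the Julia half of that concrete, here is a minimal sketch using only Arrow.write and Arrow.Table from Arrow.jl. The table contents and the names `tbl`, `bytes` and `at` are made up for illustration, and the sketch only shows a round trip through an in-memory Arrow buffer. It does not touch Rust or Polars at all, and how such a buffer would be handed to the Rust side is exactly the part I can't speak to.

```julia
using Arrow

# Hypothetical example data; any Tables.jl-compatible source would do.
tbl = (id = [1, 2, 3], x = [0.5, 1.5, 2.5])

# Serialize the table to the Arrow IPC streaming format in memory.
io = IOBuffer()
Arrow.write(io, tbl)
bytes = take!(io)

# Arrow.Table wraps the byte buffer. As far as I understand, uncompressed
# primitive columns are exposed as views into `bytes` rather than being
# copied into freshly allocated vectors.
at = Arrow.Table(bytes)

# Columns behave like ordinary AbstractVectors.
total = sum(at.x)
```

The same `bytes` could in principle be consumed by any Arrow implementation, which is what makes Arrow attractive as the exchange format here.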

The point of interfacing to Polars is not to replace DataFrames.jl but rather to enhance it for cases of large, complicated joins and summaries.
