I was thinking of the benchmarks discussed in recent posts on https://discourse.julialang.org/t/the-state-of-dataframes-jl-h2o-benchmark/ when I mentioned DuckDB. The latest round of those benchmarks was run by the DuckDB folks to show the speed of their most recent version. The results also show that Polars is very fast on the same tasks. I mentioned DuckDB only to provide motivation, through their benchmarks, for interfacing to Polars.
It is certainly possible to use PyCall or PythonCall with the Python bindings for Polars to compute the kinds of summaries shown in the benchmarks, but I don’t know whether that route causes copies of large objects to be made. I believe the existing Arrow.Table and Arrow.write functions in Arrow.jl can support zero-copy exchange of Arrow tables between Julia and Rust, but I don’t know much about the Rust end of that.
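To make the Julia half of that concrete, here is a minimal sketch of a round trip through Arrow.jl. The toy table, its column names, and the temporary file path are made up for illustration. Note that `Arrow.write` does serialize the data once; as far as I understand, the copy-free part is on the read side, where `Arrow.Table` memory-maps the file. A Rust or Polars reader could in principle open the same file, but that half is not shown here and I have not verified it.

```julia
using Arrow, DataFrames

# Hypothetical toy table; any Tables.jl-compatible source should work.
df = DataFrame(id = 1:3, x = [1.0, 2.0, 3.0])

# Write the table in the Arrow IPC file format (this step serializes the data).
path = joinpath(mktempdir(), "example.arrow")
Arrow.write(path, df)

# Read it back: Arrow.Table memory-maps the file, so the columns are
# views into the mapped bytes rather than freshly allocated vectors.
tbl = Arrow.Table(path)

# Wrap the Arrow columns in a DataFrame without copying them.
df2 = DataFrame(tbl; copycols = false)
```

With `copycols = false` the resulting columns are read-only Arrow vectors, so anything that mutates them needs `copy(df2)` first.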
The point of interfacing to Polars is not to replace DataFrames.jl but rather to enhance it for cases involving large, complicated joins and summaries.