Bad performance of group_by of DataFrames - updated -

ufechner7 · October 19, 2019, 1:24pm

Well, I construct a DataFrame with columns of type String and type Float64. Then I do a group_by operation on the three String columns, aggregating the Float64 values with either sum or maximum.

That’s it.

The tricky part is that the columns selected for the group_by operation are chosen at random, and this sometimes triggers a recompilation.
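
For illustration, here is a minimal sketch of the kind of workload described above. It is not the original benchmark: the column names (`a`, `b`, `c`, `value`), the data sizes, and the choice of a random non-empty subset of the key columns are all assumptions, and it uses the current DataFrames.jl `groupby`/`combine` API rather than whatever was called in the original code.

```julia
using DataFrames, Random

# Hypothetical data: three String key columns and one Float64 value column.
n = 1_000
df = DataFrame(
    a = rand(["x", "y", "z"], n),
    b = rand(["p", "q"], n),
    c = rand(["u", "v", "w"], n),
    value = rand(n),
)

# Assumption: "random" means a random non-empty subset of the key columns,
# in random order, plus a randomly chosen reducer (sum or maximum).
keycols = shuffle([:a, :b, :c])[1:rand(1:3)]
reducer = rand([sum, maximum])

# Time the same aggregation twice. The first call may include compilation
# time for this particular combination; the second call should not.
t_first = @elapsed combine(groupby(df, keycols), :value => reducer => :agg)
t_second = @elapsed combine(groupby(df, keycols), :value => reducer => :agg)

result = combine(groupby(df, keycols), :value => reducer => :agg)
println("keys = ", keycols, ", reducer = ", reducer)
println("first call: ", t_first, " s, second call: ", t_second, " s")
```

Running this in a fresh session and comparing `t_first` with `t_second` is one way to check how much of the observed time is compilation rather than the aggregation itself.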
