Side effects of using keyword allowduplicates in the unstack function

rocco_sprmnt21 · October 19, 2021, 4:14pm

Suppose I have this df:

using DataFrames

df1 = DataFrame(
           field = repeat(["A", "B", "C", "D", "E"], 12),
           data = rand(60),
           group = repeat(1:4, outer = 15))

If I try the following, I get a suggestion to use allowduplicates:

julia> unstack(df1,:group,:data)
ERROR: ArgumentError: Duplicate entries in unstack at row 21 for key ("A",) and variable 1. Pass allowduplicates=true to allow them.

Following the suggestion (here with :field as the column key), I get this result:

julia> unstack(df1,:field,:data,allowduplicates=true)
4×6 DataFrame
 Row │ group  A         B         C         D         E        
     │ Int64  Float64?  Float64?  Float64?  Float64?  Float64? 
─────┼─────────────────────────────────────────────────────────
   1 │     1  0.375523  0.341876  0.360489  0.925469  0.593533
   2 │     2  0.310565  0.21974   0.121332  0.730402  0.787124
   3 │     3  0.37437   0.164914  0.350874  0.857969  0.291186
   4 │     4  0.856412  0.905269  0.132294  0.168782  0.232174

This is equivalent to the following code, which does the same thing explicitly by keeping the last value in each (group, field) cell:

cgl = combine(groupby(df1, [:group, :field]), :data => last)

ucgl = unstack(cgl, :field, :data_last)
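
To make the equivalence easier to check, here is a small deterministic sketch. The toy table and the names df, ukw are made up for illustration, and it assumes a DataFrames.jl release that still accepts the allowduplicates keyword.

```julia
using DataFrames

# Toy data: the (group, field) pair (1, "A") appears twice.
df = DataFrame(group = [1, 1, 2, 2],
               field = ["A", "A", "A", "B"],
               data  = [10.0, 20.0, 30.0, 40.0])

# Explicit version: keep the last value per (group, field), then reshape.
cgl = combine(groupby(df, [:group, :field]), :data => last)
ucgl = unstack(cgl, :field, :data_last)

# Keyword version (assumption: allowduplicates is still accepted).
ukw = unstack(df, :field, :data, allowduplicates = true)

# Both tables should hold the same values, with `missing` for group 1, field "B".
same = isequal(ucgl, ukw)
```

On this toy input the duplicated cell should end up holding 20.0, the last value seen, in both results.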

I am sure this has been discussed before, but I have not found where or how.
I wonder whether, in these cases, it would be more useful to accept a function that does the aggregation instead of the allowduplicates keyword.
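
As a rough sketch of what I mean, a wrapper could aggregate first and unstack afterwards. The name unstack_agg and its argument order are hypothetical and not part of DataFrames.jl; it only uses groupby, combine and unstack.

```julia
using DataFrames

# Hypothetical helper (not a DataFrames.jl function): aggregate duplicates
# with `agg`, then unstack. Every column other than `colkey` and `value`
# is treated as a row key.
function unstack_agg(df, colkey::Symbol, value::Symbol, agg)
    rowkeys = setdiff(propertynames(df), [colkey, value])
    reduced = combine(groupby(df, vcat(rowkeys, colkey)), value => agg => value)
    return unstack(reduced, rowkeys, colkey, value)
end

# Toy data: (1, "A") appears twice, so `sum` collapses 1.0 and 3.0 into one cell.
toy = DataFrame(group = [1, 1, 2],
                field = ["A", "A", "A"],
                data  = [1.0, 3.0, 5.0])

wide = unstack_agg(toy, :field, :data, sum)
```

Passing last as agg would give the same behavior as allowduplicates=true in my example above.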

bkamins · October 19, 2021, 4:52pm

This could be discussed. The issue with arbitrary aggregation is that its result might not have an eltype that matches the eltype of the original column.
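
A minimal sketch of the eltype point, using a made-up toy table and the illustrative output column name value: last returns one of the stored values, while an aggregation such as length returns something of a different type.

```julia
using DataFrames

df = DataFrame(group = [1, 1, 2],
               field = ["A", "A", "A"],
               data  = [1.5, 2.5, 3.5])

# `last` picks an existing value, so the element type stays Float64.
kept = combine(groupby(df, [:group, :field]), :data => last => :value)

# `length` returns a count, so the aggregated column holds Int values instead.
counted = combine(groupby(df, [:group, :field]), :data => length => :value)

# After unstacking, the value column carries the aggregated type (plus Missing).
wide_counted = unstack(counted, :field, :value)

types = (eltype(kept.value), eltype(counted.value), eltype(wide_counted.A))
```

So the unstacked column type depends on the aggregation function rather than on the source column.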
