Performing multiple melt/stack operations on a dataset simultaneously

Raf · February 9, 2018, 12:52pm

I’m importing some R data so decided to learn DataFrames manipulation in julia instead of doing it in R. I have a wide dataframe with some initial columns followed by four variables that span 12 columns, one for each month. I want them in long format arranged with month as a column. I used stack() to successfully transform for the first monthly variable from 12 columns to 12 rows with:

df = stack(df, 5:16)

Then I tried to do all of the 12 column blocks simultaneously with:

df = stack(df, [5:16, 17:28, 29:40, 41:52])

but there was no method defined for an array of ranges.

Any ideas about a similarly powerful way to do multiple stack operations at the same time?

nalimilan · February 9, 2018, 1:20pm

Use [5:16; 17:28; 29:40; 41:52] instead. With commas, you create an array of ranges instead of an array of integers.

Raf · February 10, 2018, 5:42am

Thanks! that does transform all columns ranges at once, but it appends each set of columns into the same two variable/value columns, not multiple sets of variable/value columns in the same rows. I guess I was expecting too much power built in, it’s probably not even possible in R! Maybe I’ll write a pull request that allows the comma syntax for multiple stacks in the same rows.

bkamins · February 10, 2018, 6:38am

Is this what you want (assuming that [:x1,:x2,:x3,:x4] are names of variables you do not want to stack)?

reduce((x,y) -> join(x,y,on=[:x1,:x2,:x3,:x4], makeunique=true), getindex.(stack.(x,  [5:16, 17:28, 29:40, 41:52]),[1:6]))

alternatively you can drop those not-stacked variables if you do not need them:

hcat(getindex.(stack.(x,  [5:16, 17:28, 29:40, 41:52]),[1:2])..., makeunique=true)

bramtayl · February 10, 2018, 6:56am

Unless your really pressed for performance, the easiest way is to stack the months then spread the 4 variables. Are the column names something like var_1_month_2 so that you can split out a separate column for month?

Raf · February 16, 2018, 9:18am

Thanks the reduce/join approach looks good.

Topic		Replies	Views
Memory efficient melting and stacking? Reshaping Wide to Long Performance	11	1504	November 7, 2020
Split-Apply-Combine on many columns at once? Looking for equivalent to Stata's collapse New to Julia question , dataframes	7	2013	November 9, 2021
Reshape long New to Julia question , dataframes , reshaping	4	938	November 30, 2021
Transform multiple columns of a DataFrame using the same function Data dataframes	12	4173	January 23, 2023
DataFrame unmelt-like operation Data question	1	602	December 20, 2017

Performing multiple melt/stack operations on a dataset simultaneously

Related topics