Hi all,
I want to turn this DataFrame (see first image)
using DataFrames
df = DataFrame(Year = [1,2,3,1,2,3], Name=["Jack","Jack","Jack","Jill","Jill","Jill"], Gender=["M","M","M","F","F","F"], Cash=[10.0, 14.0, 12.0, 20.5, 21.5, 22.5], Grade=[1, 2, 3, 2, 4, 3])
into this DataFrame (see second image)
intended_df = DataFrame(Name = ["Jack", "Jill"], Gender=["M", "F"], Cash=[[10.0, 14.0, 12.0], [20.5, 21.5, 22.5]], Grade=[[1,2,3],[2,4,3]])
Basically, for each combination of the input columns (“Name” and “Gender”), I want to collect the observed values of the output columns (“Cash” and “Grade”) across years 1-3 into a single array, ordered by year, so that I can then drop the “Year” column from the DataFrame. Does anybody have a suggested command for this? My actual DataFrame has more input columns and more output columns than the two of each shown here, but I imagine the principle will be the same.
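To make the target concrete, here is a rough sketch of the kind of thing I have in mind. It assumes DataFrames.jl 1.x and leans on `groupby` plus `combine`, with each group's values wrapped in `Ref` so that `combine` stores the whole vector in one cell instead of expanding it into rows. The names `inputs`, `outputs`, and `wrap` are just placeholders I made up for illustration, and I am not sure this is the idiomatic way to do it.

```julia
using DataFrames

df = DataFrame(Year = [1, 2, 3, 1, 2, 3],
               Name = ["Jack", "Jack", "Jack", "Jill", "Jill", "Jill"],
               Gender = ["M", "M", "M", "F", "F", "F"],
               Cash = [10.0, 14.0, 12.0, 20.5, 21.5, 22.5],
               Grade = [1, 2, 3, 2, 4, 3])

inputs  = [:Name, :Gender]   # grouping (input) columns; placeholder name
outputs = [:Cash, :Grade]    # columns to collapse into arrays; placeholder name

# Hypothetical helper: copy the group's values into a plain Vector and wrap
# it in Ref so combine treats it as a single cell value.
wrap(v) = Ref(collect(v))

# Sort by Year first so each collected array is in year order,
# then group on the input columns and collapse every output column.
gdf = groupby(sort(df, :Year), inputs)
result = combine(gdf, [c => wrap => c for c in outputs])
```

Since `inputs` and `outputs` are plain vectors of column names, I would expect the same call to carry over to a DataFrame with more columns of either kind.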
Thanks for your time. I hope I’ve explained the question clearly (it should hopefully be a straightforward scenario). If you need any more information from me, please feel free to comment and let me know.