Best practice to unstack a dataframe with a lot of columns?

So, unlike the normal ordering (like what Italy's CDC did), the JHU people decided to put the dates in the column names.

#first(df_us, 3)

	State	Country	Lat	Long	1/22/20	1/23/20	1/24/20	1/25/20	1/26/20	1/27/20	1/28/20	1/29/20	1/30/20	1/31/20	2/1/20	2/2/20	2/3/20	2/4/20	2/5/20	2/6/20	2/7/20	2/8/20	2/9/20	2/10/20	2/11/20	2/12/20	2/13/20	2/14/20	2/15/20	2/16/20	2/17/20	2/18/20	2/19/20	2/20/20	2/21/20	2/22/20	2/23/20	2/24/20	2/25/20	2/26/20	2/27/20	2/28/20	2/29/20	3/1/20	3/2/20	3/3/20	3/4/20	3/5/20	3/6/20	3/7/20	3/8/20	3/9/20	3/10/20
	String⍰	String	Float64	Float64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64	Int64
1	Washington	US	47.4009	-121.49	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	267
2	New York	US	42.1657	-74.9481	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	173
3	California	US	36.1162	-119.682	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	144

Now, the question is how to stack/unstack this so that, in the end, I can do some sane plotting such as

@df df plot(:Dates, :Number, group=:State)

Currently I have to loop over eachrow(df), prepare the dates ahead of time by collecting names(df), and access the data points with collect(values(row[5:end])).

P.S. For Italy's data:

df_italy = HTTP.get("https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-province.json").body |>
JSON3.read |> DataFrame
@df df_italy plot(:data, :totale_casi, group=:sigla_provincia, xrotation=40)

produces the plot I want, with one line per province.

Maybe use IndexedTables.stack()?

https://juliacomputing.github.io/JuliaDB.jl/latest/api/

Why not use the DataFrames functions?
Reshaping · DataFrames.jl.

I’ve used this approach: Covid-19 julia reshape · GitHub

Yea I think you should be able to just use stack from DataFrames for this. Something like (guessing)

stack(df, 5:ncol(df), :State; variable_name=:Date, value_name=:Number)
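
To make the stack suggestion concrete, here is a minimal sketch on a cut-down copy of the table from the question (two states, the last two date columns). It assumes a recent DataFrames.jl, where stack takes the measure columns, then the id columns, plus the variable_name and value_name keywords, and where the Not selector is available. The names id_cols and df_long are just illustrative. Since the stacked column holds strings like "3/10/20", the last step parses them into Date values; treat the date format as an assumption about how JHU writes its headers.

```julia
using DataFrames, Dates

# Toy stand-in for the JHU table: id columns plus one column per date.
df_us = DataFrame(
    "State"   => ["Washington", "New York"],
    "Country" => ["US", "US"],
    "Lat"     => [47.4009, 42.1657],
    "Long"    => [-121.49, -74.9481],
    "3/9/20"  => [0, 0],
    "3/10/20" => [267, 173],
)

# Wide -> long: every column that is not an id column becomes one
# (Date, Number) row per state.
id_cols = [:State, :Country, :Lat, :Long]
df_long = stack(df_us, Not(id_cols), id_cols;
                variable_name = :Date, value_name = :Number)

# The former column names are strings such as "3/10/20". With this format
# the two-digit year parses as year 20, so shift it into the 2000s.
df_long.Date = Date.(df_long.Date, dateformat"m/d/y") .+ Year(2000)
```

With the data in this long shape, the plotting call from the question should work as written, e.g. @df df_long plot(:Date, :Number, group=:State) with StatsPlots loaded.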

Yes, I was referring to the DataFrames functions. Your approach makes sense, thx.