using CSV, DataManipulation, StructArrays, Tables
julia> @p begin
           """
           Year,CountryName,Population!!Estimate,Population!!MarginOfError,GDP!!Estimate,GDP!!MarginOfError
           2025,China,1400000000,100000000,18000000000000,1000000000000
           2025,India,1400000000,100000000,3000000000000,1000000000000
           """
           StructArray(columntable(CSV.File(IOBuffer(__))))
           flatmap() do r
               [
                   (;r.Year, r.CountryName, k => getproperty(r, k))
                   for k in propertynames(r)[3:end]
               ]
           end
       end
8-element Vector{NamedTuple{names, Tuple{Int64, String7, Int64}} where names}:
(Year = 2025, CountryName = "China", Population!!Estimate = 1400000000)
(Year = 2025, CountryName = "China", Population!!MarginOfError = 100000000)
(Year = 2025, CountryName = "China", GDP!!Estimate = 18000000000000)
(Year = 2025, CountryName = "China", GDP!!MarginOfError = 1000000000000)
(Year = 2025, CountryName = "India", Population!!Estimate = 1400000000)
(Year = 2025, CountryName = "India", Population!!MarginOfError = 100000000)
(Year = 2025, CountryName = "India", GDP!!Estimate = 3000000000000)
(Year = 2025, CountryName = "India", GDP!!MarginOfError = 1000000000000)
This is `DataFrames.stack` implemented with `flatmap`. It works fine, but I don’t like the code style:
- It uses column names for the first two columns and indexes for the rest. I’d rather either (1) use the two names and refer to the rest by omission, or (2) say `1:2` and `3:end`.
- It requires `getproperty`, which is ugly.
- It requires a list comprehension, which is verbose.
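To make option (1) concrete, here is a rough sketch in plain Base Julia, with no claim that it is the idiomatic DataManipulation.jl answer. It uses a single hand-written row as a stand-in for one element of the `StructArray`, and the names `idnames`, `id`, and `rows` are made up for illustration. It avoids both the index range and `getproperty`, but it still needs a comprehension.

```julia
# One row, shortened to two value columns for brevity (stand-in for a StructArray element).
r = (Year = 2025, CountryName = "China",
     var"Population!!Estimate" = 1400000000,
     var"GDP!!Estimate" = 18000000000000)

# Hypothetical helper names: the identifier columns are named once...
idnames = (:Year, :CountryName)
id = NamedTuple{idnames}(r)  # select just the identifier fields

# ...and everything else is picked up by omission; pairs(r) yields name => value.
rows = [(; id..., k => v) for (k, v) in pairs(r) if k ∉ idnames]
```

Inside the pipeline, the last two statements would presumably go in the body of the `flatmap() do r ... end` block.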
Is there a nicer way to do this?
cc @aplavin