Hello,
I am still learning the language, but I am excited to do some financial data processing that I have not enjoyed doing in Excel.
I have a DataFrame of transaction data in this form:
38838×7 DataFrame
│ Row │ Name │ SKU │ L2 │ L3 │ Qty │ Rev │ Date │
│ │ String │ String │ String │ String │ Int64 │ Float64 │ Date │
I want to convert the transaction data into a time series, using the product name as a unique key. I have done that, but I am not pleased with my implementation, as I am certain there are better ways of doing it. This is probably a bit nitpicky, but I am trying to learn the language better.
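using DataFrames, Dates

# Creates a new one-row DataFrame for a product, with one zeroed column per month
function newRow(row)::DataFrame
    new_row = DataFrame(Name=row.Name, SKU=row.SKU, L2=row.L2, L3=row.L3)
    for date in Date(2019, 1):Dates.Month(1):Date(2020, 12)
        # `df[!, col] .= value` is the column-creation syntax in current DataFrames.jl
        new_row[!, string(date)] .= 0
    end
    return new_row
end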
# Converts the transaction data into time series data
function makeTimeSeries(data)::DataFrame
    # Initialize ts with a zeroed row for the first product
    ts = DataFrame()
    append!(ts, newRow(data[1, :]))
    # Build row by row (a plain loop, since nothing is collected from each iteration)
    for row in eachrow(data)
        # Check whether this product already has a row
        did_find = false
        for (i, name) in enumerate(ts.Name)
            if row.Name == name
                did_find = true
                ts[i, string(row.Date)] += row.Qty
                break
            end
        end
        # Add the product to the DataFrame if it is new
        if !did_find
            append!(ts, newRow(row))
            ts[end, string(row.Date)] += row.Qty
        end
    end
    return ts
end
I feel like there is probably a more Julian way of transforming this data, but I am unsure how to go about it. I’m not entirely sure why my allocations are so high, and I feel the transformation should be quicker given what I have seen from Julia so far. Nothing obvious jumped out at me in the performance documentation, though I am still fairly new.
@time ts = makeTimeSeries(data)
4.693930 seconds (94.55 M allocations: 2.512 GiB, 12.47% gc time)
856×28 DataFrame. Omitted printing of 21 columns
│ Row │ Name │ SKU │ L2 │ L3 │ 2019-01-01 │ 2019-02-01 │ 2019-03-01 │
│ │ String │ String │ String │ String │ Int64 │ Int64 │ Int64 │
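One likely contributor to the run time is that every transaction scans `ts.Name` from the top, so the work grows with rows times products. Below is a minimal sketch of one alternative, not a drop-in replacement: it uses only `Dates` and a `Dict`, and works on plain named tuples rather than the real DataFrame. The `transactions` sample, `month_index`, and `monthly_totals` are hypothetical names made up for illustration, and bucketing with `firstdayofmonth` assumes monthly columns are what is wanted. The `groupby`, `combine`, and `unstack` functions in DataFrames.jl are probably a more direct route for the real data, but that is not shown here.

```julia
using Dates

# Hypothetical stand-in for the transaction rows; the real data is a DataFrame.
transactions = [
    (Name = "Widget", Qty = 3, Date = Date(2019, 1, 1)),
    (Name = "Gadget", Qty = 5, Date = Date(2019, 1, 1)),
    (Name = "Widget", Qty = 2, Date = Date(2019, 1, 1)),
    (Name = "Widget", Qty = 4, Date = Date(2020, 12, 1)),
]

# Same 24 monthly buckets as the original code, mapped to vector positions.
months = collect(Date(2019, 1):Month(1):Date(2020, 12))
month_index = Dict(m => i for (i, m) in enumerate(months))

# One dictionary lookup per transaction instead of a scan over all known names.
function monthly_totals(rows, month_index)
    totals = Dict{String, Vector{Int}}()
    for row in rows
        # Create a zeroed vector the first time a product name is seen.
        counts = get!(() -> zeros(Int, length(month_index)), totals, row.Name)
        counts[month_index[firstdayofmonth(row.Date)]] += row.Qty
    end
    return totals
end

totals = monthly_totals(transactions, month_index)
```

Each product ends up with one vector of monthly quantities, which could then be assembled into a wide table in a single step instead of growing a DataFrame row by row.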
Anyway, any advice the community can give would be greatly appreciated.