I don’t generally work with much tabular data, so I could use some input on the best way to do a little CSV processing task that popped up today.
The data I’m working with is here; it contains evaluation metrics for a bunch of different algorithms. The columns I care about are:
metric (one of 4 metrics used to score the algorithms),
method (one of 24 algorithms), and
score (the metric score for that algorithm).
I’d like to average the score across all the rows with the same method. Bonus points if I can add columns for each of the 4 metrics, so I end up with a grid with the 4 metrics along the top and each of the methods as rows.
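To make the target concrete, here is a rough sketch of the computation in plain Python (standard library only). It is not the Julia solution I’m after, just an illustration of the result I want. The sample rows and the names algA, algB, m1, and m2 are made up; only the column names metric, method, and score match the real file.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up stand-in for the real CSV; only the column names come from the post.
SAMPLE_CSV = """metric,method,score
m1,algA,1.0
m1,algA,2.0
m2,algA,3.0
m1,algB,3.0
m2,algB,4.0
m2,algB,5.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def mean_by_method(rows):
    """Average score over all rows that share the same method."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["method"]].append(float(row["score"]))
    return {method: mean(vals) for method, vals in scores.items()}


def method_metric_grid(rows):
    """Build {method: {metric: mean score}}, i.e. methods as rows, metrics as columns."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row["method"], row["metric"])].append(float(row["score"]))
    grid = defaultdict(dict)
    for (method, metric), vals in cells.items():
        grid[method][metric] = mean(vals)
    return {method: dict(cols) for method, cols in grid.items()}


rows = load_rows(SAMPLE_CSV)
by_method = mean_by_method(rows)
grid = method_metric_grid(rows)
```

In data-frame terms, the first function is a group-by on method followed by a mean, and the second is a group-by on (method, metric) followed by a long-to-wide reshape.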
So far I’ve tried using Query.jl (loading the CSV as a DataTable), but I’m having trouble grokking it, and I’m not sure which of the issues I hit are failures of my understanding and which are bugs. I also tried using DataFramesMeta.jl, but it seems that CSV.jl gives Nullable columns and my queries weren’t working.
Sorry for the hold-my-hand question. Normally I’d try to work further on finding a solution myself, but given the state of flux of the data ecosystem, I figured someone with more expertise would be able to point me in the right direction more quickly.