Essentially, I have a very large DataFrame `df`. In that DataFrame there’s a column called “Total Population”. I want to put the data from `df` into a DataFrame `df_per_capita`, which divides the cells in every column by their corresponding row value in “Total Population”.
However, I have a list of column names stored as a vector of strings, `non_pop_cols`. These are the names of columns that would be completely meaningless if divided by population (e.g. values that are already given per capita, or GINI coefficient). I initially tried to do this:
df_per_capita = transform(df, [Not(non_pop_cols), :"Total Population"] => ByRow((x1,x2) -> x1 / x2))
But I got the error
`ERROR: ArgumentError: idxs[1] has type InvertedIndex{Vector{String}}; only Integer, Symbol, or string values allowed when indexing by vector.`
It sounds like it wants an iterator here, so I tried `foreach(Not(non_pop_cols)...)`, which doesn’t seem to work. I also tried flattening the vector with `vcat(Not(non_pop_cols)...)`, which generates the error “no method matching iterate(::InvertedIndex{Vector{String}})”.
I think (tell me if I’m wrong) the problem is that I’m trying to pass a string vector as a single `x1`, when in fact it’s an entire vector of strings, but I’m not 100% sure how to fix that. I’m relatively new to Julia (started learning ~3 weeks ago).
I’ve now spent an embarrassingly long time reading Discourse posts and tweaking this line to figure out why I can’t get it to work.
Could anyone help me figure out how to get this to work?
Bear in mind I’m using `transform` rather than `select` because, although I don’t want to divide the `non_pop_cols` in `df_per_capita` by "Total Population", I do still want them in the DataFrame.
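
To make the goal concrete, here is a toy sketch of the result I’m after, written as a plain loop rather than with `transform`. The column names "Births", "Deaths" and "GINI" and all the numbers are made up for illustration, and I’m assuming "Total Population" itself should be left undivided.

```julia
using DataFrames

# Hypothetical toy data; the real df is much larger
df = DataFrame(
    "Total Population" => [100, 200],
    "Births" => [10, 40],
    "Deaths" => [5, 10],
    "GINI" => [0.3, 0.4],
)

# Columns that must NOT be divided by population
non_pop_cols = ["GINI"]

# Start from a copy so every column (including non_pop_cols) is kept
df_per_capita = copy(df)

# names(df, Not(...)) resolves the inverted selection to a plain
# Vector{String} of the remaining column names.
# Assumption: "Total Population" itself is also excluded from the division.
cols_to_divide = names(df, Not(vcat(non_pop_cols, "Total Population")))

for col in cols_to_divide
    # Element-wise division of each column by the population column
    df_per_capita[!, col] = df[!, col] ./ df[!, "Total Population"]
end
```

With this toy data, "Births" and "Deaths" end up divided row by row by "Total Population", while "GINI" and "Total Population" are carried over unchanged. That is the behaviour I would like to reproduce with a single `transform` call.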