Hello,
Is it possible to join DataFrames on an argument and a column header? E.g.:
New_DF = innerjoin(DF1,DF2, on = ID => :Store_ID)
where “ID” is an argument of a function and “:Store_ID” is a specific column within DF2?
For more context, I have user input data that contains a DataFrame and I do not know the column name of the ID / Key. There is another DataFrame created in a separate function that I will eventually need to join with the user input data. Because the second DataFrame is created it will always have a consistent column header and thus I know what key to use.
I would expect this to work:
function join_the_df(;data_1::DataFrame,key::String)
DF2 = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])
New_DF = innerjoin(data_1,DF2, on = key => :ID)
return(New_DF)
end
join_the_df(;
data_1 = User_DF,
key = "Identification"
)
but it throws the error:
ERROR: TypeError: in keyword argument on, expected Union(AbstractVector, var"#s493"} where var"#s493"<:Union{AbstractString, Pair{Symbol, Symbol}, Symbol, Tuple{Symbol, Symbol}, Tuple{Symbol, Symbol}, Pair{<:AbstractString, <:AbstractString}}, got a value of type Pair{String, Symbol}
(btw how do I copy out of the Terminal in VSCode so I don’t have to type this error message? )
If I modify the code and hard-code the value in, it works… but this would defeat the purpose of the function. E.g.:
function join_the_df(;data_1::DataFrame)
DF2 = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])
New_DF = innerjoin(data_1,DF2, on = :Identification => :ID)
return(New_DF)
end
join_the_df(;
data_1 = User_DF,
)
Thank you for your help!