Joining on a DataFrame on an argument and column

Hello,
Is it possible to join DataFrames on an argument and a column header? E.g.:

New_DF = innerjoin(DF1,DF2, on = ID => :Store_ID)

where “ID” is an argument of a function and “:Store_ID” is a specific column within DF2?

For more context, I have user input data that contains a DataFrame and I do not know the column name of the ID / Key. There is another DataFrame created in a separate function that I will eventually need to join with the user input data. Because the second DataFrame is created it will always have a consistent column header and thus I know what key to use.

I would expect this to work:


function join_the_df(;data_1::DataFrame,key::String)
    DF2 = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])
    New_DF = innerjoin(data_1,DF2, on = key => :ID)
    return(New_DF)
end

join_the_df(;
    data_1 = User_DF,
    key = "Identification"
    )

but it throws the error:
ERROR: TypeError: in keyword argument on, expected Union(AbstractVector, var"#s493"} where var"#s493"<:Union{AbstractString, Pair{Symbol, Symbol}, Symbol, Tuple{Symbol, Symbol}, Tuple{Symbol, Symbol}, Pair{<:AbstractString, <:AbstractString}}, got a value of type Pair{String, Symbol}

(btw how do I copy out of the Terminal in VSCode so I don’t have to type this error message? :slight_smile: )

If I modify the code and hard-code the value in, it works… but this would defeat the purpose of the function. E.g.:


function join_the_df(;data_1::DataFrame)
    DF2 = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])
    New_DF = innerjoin(data_1,DF2, on = :Identification => :ID)
    return(New_DF)
end

join_the_df(;
    data_1 = User_DF,
    )

Thank you for your help!

Have you tried:

New_DF = innerjoin(DF1,DF2, on = Symbol(ID) => :Store_ID)

The :somename notation is a Julia literal of type Symbol which can also be written Symbol("somename").

1 Like