Transforming a DataFrame column by mapping with a Dict

Let’s say I have a DataFrame and a Dict:

df = DataFrame(
    a = 1:3,
    b = ["beetle", "fly", "spider"])

arthropod_dict = Dict(
    "beetle" => "insect",
    "fly" => "insect",
    "spider" => "arachnid")

And I’d like to apply the Dict like a map to the DataFrame, producing

df = DataFrame(
    a = 1:3,
    b = ["beetle", "fly", "spider"],
    c = ["insect", "insect", "arachnid"])

Is there a way to do that using transform or another DataFrame function so that I can apply the Dict within a @chain call?

transform(df, :b => ByRow(x -> arthropod_dict[x]) => :c)
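
Since the question asks about @chain, here is a minimal sketch of the same transform inside a chain. It assumes the @chain macro comes from Chain.jl (DataFramesMeta.jl re-exports it as well); the name `result` is only for illustration.

```julia
using DataFrames
using Chain  # provides @chain; assumed to be installed

df = DataFrame(
    a = 1:3,
    b = ["beetle", "fly", "spider"])

arthropod_dict = Dict(
    "beetle" => "insect",
    "fly" => "insect",
    "spider" => "arachnid")

# @chain passes the DataFrame as the first argument of each call,
# so `df` is not written out inside transform
result = @chain df begin
    transform(:b => ByRow(x -> arthropod_dict[x]) => :c)
end
```

Note that indexing with arthropod_dict[x] throws a KeyError if a value in :b is not a key of the Dict.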

Nice (own) answer.
I really like using the @eachrow macro in DataFramesMeta.jl too:

@eachrow df begin
    @newcol c::Vector{String}
    :c = arthropod_dict[:b]
end

This also works inside a @chain call:

@chain df begin
    filter(:a => >(2), _)  # for example
    @eachrow begin
        @newcol c::Vector{String}
        :c = arthropod_dict[:b]
    end
end

It’s a bit more than a one-liner but handy for workflows with more complicated transforms.

Note that there is no requirement to use transform for everything in DataFrames, and at times I find that it actively reduces code clarity. In my opinion (and this is nothing more!) this is one of those situations. I would write:

df[!, :c] = [arthropod_dict[x] for x ∈ df.b]
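
For comparison, the same lookup can be written with broadcasting instead of a comprehension. This is only a sketch of an alternative spelling, not something suggested in the thread; Ref is needed because Julia does not broadcast over a Dict directly.

```julia
using DataFrames

df = DataFrame(
    a = 1:3,
    b = ["beetle", "fly", "spider"])

arthropod_dict = Dict(
    "beetle" => "insect",
    "fly" => "insect",
    "spider" => "arachnid")

# Ref(...) makes the Dict act as a scalar, so getindex is applied
# once per element of df.b
df[!, :c] = getindex.(Ref(arthropod_dict), df.b)
```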

I would like to add a different solution (which, incidentally, also handles situations where the dictionary is not complete):

df = DataFrame(
    a = 1:4,
    b = ["beetle", "fly", "spider", "unicorno"])

arthropod_dict = Dict(
    "beetle" => "insect",
    "fly" => "insect",
    "spider" => "arachnid")

# turn the Dict into a two-column lookup table
dict = DataFrame(from = collect(keys(arthropod_dict)), to = collect(values(arthropod_dict)))

(In this case, some kind of transpose(dataframe) function would have been convenient.)

outerjoin(df, dict, on = :b => :from)
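
A self-contained sketch of the join approach follows. It swaps in leftjoin, which is an assumption about the intent rather than what the post uses: leftjoin keeps exactly the rows of df even if the Dict has keys that never occur in :b. The names `lookup` and `joined` are illustrative.

```julia
using DataFrames

df = DataFrame(
    a = 1:4,
    b = ["beetle", "fly", "spider", "unicorno"])

arthropod_dict = Dict(
    "beetle" => "insect",
    "fly" => "insect",
    "spider" => "arachnid")

# Two-column lookup table; keys and values of a Dict iterate in the same order
lookup = DataFrame(
    from = collect(keys(arthropod_dict)),
    to = collect(values(arthropod_dict)))

# Every row of df is kept; values of :b with no match get `missing` in :to
joined = leftjoin(df, lookup, on = :b => :from)

# Restore the original row order explicitly rather than relying on the join
sort!(joined, :a)
```

The unmatched "unicorno" row ends up with missing in the :to column, which can then be replaced with coalesce if a default label is preferred.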

A variant of @nilshg's solution, using get with a default value for keys that are missing from the Dict:

df[!, :c] = [get(arthropod_dict,x,"unknown") for x ∈ df.b]

Just a quick thank you for this. I’ve been wrestling for hours with trying to do just this and coming very close, but eating lots of errors.
