Make a Copy of a DataFrame Row

quietlight · February 2, 2023, 10:24pm

I am stuck on this for loop. I need a copy of a dataframe row so I can modify it and then push the modified row onto the dataframe.

My K-MF label means a male and a female kiwi are both calling at the same time. I want to change this so I have 2 labels (or rows): one for the female call, one for the male call.

This is all happening in a function that builds my training dataset which is quite large and growing.

for row in eachrow(data_frame)
    # correct to Male, Female, Close to match newer annotations
    if row.species == "K-M"
        row.species = "Male"

    elseif row.species == "K-F"
        row.species = "Female"

    # correct K-MF label to Male, plus another identical row with label Female
    elseif row.species == "K-MF"
        row.species = "Male"
        # I need to break off a copy of row into a new dataframe with no
        # connection to the original row
        new_row = row[:] # I thought it was a copy
        new_row.species = "Female"
        push!(data_frame, new_row)
        # I end up with 2 rows with species=Female, instead of 1 male,
        # 1 female as 'new_row.species = "Female"' modifies both the new
        # row and the original
    end
end

Kindness
David

mrufsvold · February 2, 2023, 10:30pm

Not at a computer to test, but can you try copy(row)?

quietlight · February 2, 2023, 10:41pm

Thanks. Yes, I had already tried that, but I tried it again anyway.

Here is the error: “ERROR: setfield!: immutable struct of type NamedTuple cannot be changed”

copy() does not return a dataframe.

pdeffebach · February 2, 2023, 10:45pm

Somewhat confusingly, copy(row) returns a NamedTuple, not a new DataFrameRow. This is unfortunately what’s causing your problem.

You can push! a dictionary to a data frame, so you might want to do

julia> Dict(k => v for (k, v) in enumerate(dfr))
Dict{Int64, Int64} with 2 entries:
  2 => 3
  1 => 1

instead of copy.
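A minimal sketch of the difference between the two operations discussed so far, assuming a toy data frame with hypothetical columns `species` and `start_time` (not the original poster's data). It suggests that `row[:]` is still a view into the parent data frame, while `copy(row)` is detached but immutable:

```julia
using DataFrames

# Toy data; the column names are assumptions for illustration only
df = DataFrame(species = ["K-MF"], start_time = [1.5])
row = df[1, :]                 # DataFrameRow: a view into df

view_row = row[:]              # selecting all columns still gives a view
view_row.species = "Female"    # writes through to the parent data frame
df.species[1]                  # the original row has changed as well

nt = copy(row)                 # NamedTuple: independent of df, but immutable
nt isa NamedTuple
# nt.species = "Male" would raise the setfield! error quoted above
```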

rocco_sprmnt21 · February 2, 2023, 11:07pm

try this

for row in eachrow(df)
    # correct to Male, Female, Close to match newer annotations
    if row.species == "K-M"
        row.species = "Male"

    elseif row.species == "K-F"
        row.species = "Female"

    # correct K-MF label to Male, plus another identical row with label Female
    elseif row.species == "K-MF"
       row.species = "Male"
       push!(df, merge(row, (species="Female",))) 
    end
end

rocco_sprmnt21 · February 2, 2023, 11:14pm

an alternative way

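# wrap every label in a vector so flatten does not iterate a bare string character by character
tdf = transform(df, :species => ByRow(x -> x == "K-M" ? ["M"] : (x == "K-F" ? ["F"] : ["M", "F"])) => :g)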

flatten(tdf,:g)
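A small sketch of what `flatten` does in this approach, on a toy data frame whose `id` column is a hypothetical stand-in for the remaining columns. Each row is repeated once per element of the vector in `g`, so a row holding two labels becomes two otherwise identical rows:

```julia
using DataFrames

# Toy data; `id` is an assumed column standing in for the rest of the row
df = DataFrame(id = [1, 2], g = [["Male"], ["Male", "Female"]])

# Row 1 holds one label and stays a single row; row 2 holds two and is duplicated
flat = flatten(df, :g)
```

Here `flat.id` should be `[1, 2, 2]` and `flat.g` should be `["Male", "Male", "Female"]`.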

quietlight · February 2, 2023, 11:27pm

The idea works, but since the keys of the dict are the column numbers, it is hard to merge back into the base dataframe. There are a lot of columns, or I would just bodge it manually. This works fine:

new_row = Dict(names(row) .=> values(row))       
new_row["species"] = "Female"
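For context, a sketch of how those two lines might sit next to the `push!`, assuming a toy data frame with hypothetical columns `species` and `start_time`. Because the dictionary is keyed by column name, `push!` can match each value to its column:

```julia
using DataFrames

# Toy data; the column names are assumptions for illustration only
df = DataFrame(species = ["K-M", "K-MF"], start_time = [0.0, 1.5])

row = df[2, :]
row.species = "Male"                          # relabel the original row in place

new_row = Dict(names(row) .=> values(row))    # detached copy keyed by column name
new_row["species"] = "Female"
push!(df, new_row)                            # append the copy as a new row
```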

Thanks
David

quietlight · February 2, 2023, 11:28pm

Thanks

I like this solution.

Regards
David

pdeffebach · February 2, 2023, 11:33pm

Oh sorry. Yeah, your version is correct.

mrufsvold · February 2, 2023, 11:34pm

Looks like you got a solution, but here is a different approach:

function replace_species(df)
    # create a copy of the K-MF rows, set them all to Female
    k_mf_rows = filter(AsTable(:) => r -> r.species == "K-MF", df)
    k_mf_rows.species .= "Female"

    # replace species in place (setting original K-MFs to Male)
    species_map = Dict(
        "K-M" => "Male",
        "K-F" => "Female",
        "K-MF" => "Male"
    )
    # get(..., v) leaves labels that are not in the map (e.g. "Close") unchanged
    replace!(v -> get(species_map, v, v), df.species)

    # Stack on the copies
    new_df = vcat(df, k_mf_rows)
    return new_df
end

Edit: FWIW, depending on the size of the DataFrame and the number of K-MF values, this solution might be faster because calling push! over and over is slow since it has to allocate each time.
