Delete rows in DataFrame Conditionally

I’m trying to delete specific rows in my DataFrame based on a condition involving the row above. In my data, a row with the same time point is repeated each time a medication is administered. My goal is to delete all of the repeated rows.

I’ve tried the following, and also looked for some kind of deletion tool in Queryverse, with no luck:

for i = 1:length(infusion_single.time)
    if infusion_single.time[i] == infusion_single.time[i+1]
        delete!(infusion_single.time[i+1])
    end
end

Thanks!

Have you looked at the unique! function?


If all the duplicates are the exact same row, unique(df) will work.
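For example, a quick sketch on a made-up table (the column names and values here are just for illustration, not from your data):

using DataFrames

df = DataFrame(time = [0.0, 0.0, 1.0, 2.0, 2.0], conc = [1.2, 1.2, 3.4, 5.6, 5.6])
unique(df)  # keeps one copy of each fully identical row, leaving 3 rows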

If you are just looking for duplicates where the rows are next to each other, here is a function that will work.

One thing that makes your loop tricky is that you are modifying the collection you are looping over, which makes the behavior hard to reason about. My function builds a new data frame and pushes the rows you want to keep into it instead.

function dropdupecols(df, cols)
    new_df = DataFrame()
    last_row = nothing  # sentinel so the first row is always kept
    for row in eachrow(df)
        # keep the row unless it matches the previous row on the given columns
        if last_row === nothing || row[cols] != last_row[cols]
            push!(new_df, row)
        end
        last_row = row
    end
    return new_df
end
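Rough usage sketch, with a made-up infusion-style table (the dose column and values are just illustrative):

using DataFrames

infusion = DataFrame(time = [0, 1, 1, 2, 3, 3], dose = [0, 0, 5, 0, 0, 5])
dropdupecols(infusion, [:time])  # keeps the first of each run of equal times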

unique! takes a column argument as well, yeah?

unique!(infusion_single, :time)
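On a small made-up table that would look something like this (values are just illustrative):

using DataFrames

df = DataFrame(time = [0, 1, 1, 2], amt = [0, 0, 100, 0])
unique!(df, :time)  # keeps the first row for each distinct time, modifying df in place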

This should work if I understand correctly that time is a numeric column:

infusion_single = infusion_single[[true; diff(infusion_single.time) .!= 0], :]

The true keeps the first row because diff returns an array one element shorter than the column.
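A small sketch of how that mask works, with toy values and assuming the data is sorted so duplicates are adjacent:

using DataFrames

infusion_single = DataFrame(time = [0.0, 1.0, 1.0, 2.0], conc = [0.0, 3.1, 3.1, 2.5])
mask = [true; diff(infusion_single.time) .!= 0]  # [true, true, false, true]
infusion_single = infusion_single[mask, :]       # drops the repeated 1.0 row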