delete!() on a DataFrame row incompatible with CSV.jl?

Hi!

Following some package updates, the functions I had used to delete rows in dataframes, such as deleterows!(), have been deprecated. I am trying to get my old code running again, but I have hit a difficult problem.

When a dataframe is read with the CSV package, I cannot find any way to remove rows (except, of course, copying the rows I want to keep one by one). If you have a simple solution for this, it would be helpful.
Thanks!



julia> d = DataFrame(A=["X", "Y", "Z"], a=1:3, b=4:6)
3×3 DataFrame
│ Row │ A      │ a     │ b     │
│     │ String │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ X      │ 1     │ 4     │
│ 2   │ Y      │ 2     │ 5     │
│ 3   │ Z      │ 3     │ 6     │

julia> delete!(d, 2) # it works !
2×3 DataFrame
│ Row │ A      │ a     │ b     │
│     │ String │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ X      │ 1     │ 4     │
│ 2   │ Z      │ 3     │ 6     │

julia> d = DataFrame(A=["X", "Y", "Z"], a=1:3, b=4:6)
3×3 DataFrame
│ Row │ A      │ a     │ b     │
│     │ String │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ X      │ 1     │ 4     │
│ 2   │ Y      │ 2     │ 5     │
│ 3   │ Z      │ 3     │ 6     │

julia> CSV.write("test.csv", d ; delim = "\t")
"test.csv"

julia> d = CSV.read("test.csv")
3×3 DataFrame
│ Row │ A      │ a     │ b     │
│     │ String │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ X      │ 1     │ 4     │
│ 2   │ Y      │ 2     │ 5     │
│ 3   │ Z      │ 3     │ 6     │

julia> delete!(d, 2) # same function, same dataframe, it does not work !
ERROR: MethodError: no method matching deleteat!(::CSV.Column{String,String}, ::Int64)
Closest candidates are:
  deleteat!(::Array{T,1} where T, ::Integer) at array.jl:1238
  deleteat!(::Array{T,1} where T, ::Any) at array.jl:1275
  deleteat!(::BitArray{1}, ::Integer) at bitarray.jl:931
  ...
Stacktrace:
 [1] (::DataFrames.var"#161#162"{Int64})(::CSV.Column{String,String}) at /home/fred/.julia/packages/DataFrames/3ZmR2/src/dataframe/dataframe.jl:873
 [2] foreach(::DataFrames.var"#161#162"{Int64}, ::Array{AbstractArray{T,1} where T,1}) at ./abstractarray.jl:1919
 [3] delete!(::DataFrame, ::Int64) at /home/fred/.julia/packages/DataFrames/3ZmR2/src/dataframe/dataframe.jl:873
 [4] top-level scope at REPL[25]:1

Please do not use CSV.read, as it will soon be deprecated. Instead use DataFrame(CSV.File(filename, ...)) and all will work as expected.

This problem is with CSV.jl and is not related to DataFrames.jl.

@bkamins thank you for your quick answer and for the solution, which indeed works! The bad news is that I have used CSV.read extensively in most of my Julia programs :(

julia> d = DataFrame(CSV.File("test.csv"; delim = "\t"))
3×3 DataFrame
│ Row │ A      │ a     │ b     │
│     │ String │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ X      │ 1     │ 4     │
│ 2   │ Y      │ 2     │ 5     │
│ 3   │ Z      │ 3     │ 6     │

julia> delete!(d, 2)
2×3 DataFrame
│ Row │ A      │ a     │ b     │
│     │ String │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ X      │ 1     │ 4     │
│ 2   │ Z      │ 3     │ 6     │

Deprecating CSV.read will allow CSV.jl not to have DataFrames.jl as a dependency.
Maybe @quinnj will want to comment on this more.
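
For code bases that call CSV.read in many places, one possible way to ease the migration is a small wrapper around the recommended DataFrame(CSV.File(...)) pattern. The sketch below is only an illustration: read_table is a hypothetical helper name, not part of CSV.jl or DataFrames.jl, and the temporary file path is made up for the example. It also shows a non-mutating way to drop a row with Not, which returns a new DataFrame instead of modifying the columns in place.

```julia
using CSV, DataFrames

# Hypothetical helper (an assumption for this example, not a library function):
# wraps the pattern recommended above so old call sites only need a rename.
read_table(path; kwargs...) = DataFrame(CSV.File(path; kwargs...))

# Write the thread's example table to a temporary, tab-delimited file.
path = joinpath(mktempdir(), "test.csv")
CSV.write(path, DataFrame(A=["X", "Y", "Z"], a=1:3, b=4:6); delim="\t")

# Read it back through the helper.
d = read_table(path; delim="\t")

# Non-mutating alternative to delete!: keep every row except row 2.
kept = d[Not(2), :]
```

Newer CSV.jl releases also accept a sink argument, as in CSV.read(path, DataFrame), so it is worth checking the documentation of the installed version before deciding on a wrapper.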