How do you edit a DataFrame after reading it from a CSV?

Hello! I’m having a bit of trouble editing a DataFrame after reading it in from a CSV. I’ve set it up as shown below.

julia> df = DataFrame(A = String[], B = Int[])
0×2 DataFrame

julia> CSV.write("test.csv", df)
"test.csv"

julia> df2 = DataFrame(CSV.File("test.csv"))
0×2 DataFrame

However, if I try to push! a row into this DataFrame, I get this.

julia> push!(df2, ("hello", 5))
┌ Error: Error adding value to column :A.
└ @ DataFrames C:\Users\xxxxx\.julia\packages\DataFrames\oQ5c7\src\dataframe\dataframe.jl:1532
ERROR: StackOverflowError:
Stacktrace:
 [1] push!(::SentinelArrays.MissingVector, ::String) at .\array.jl:982
 [2] append! at C:\Users\xxxxx\.julia\packages\SentinelArrays\Ubf17\src\missingvector.jl:109 [inlined]
 ... (the last 2 lines are repeated 52190 more times)
 [104383] push!(::SentinelArrays.MissingVector, ::String) at .\array.jl:982
 [104384] push!(::DataFrame, ::Tuple{String,Int64}; promote::Bool) at C:\Users\xxxxx\.julia\packages\DataFrames\oQ5c7\src\dataframe\dataframe.jl:1514
 [104385] push!(::DataFrame, ::Tuple{String,Int64}) at C:\Users\xxxxx\.julia\packages\DataFrames\oQ5c7\src\dataframe\dataframe.jl:1494

Is there a way to get around this error? I know that CSV sometimes returns an immutable DataFrame, but I’m unsure how to make it mutable. I’m using DataFrames v0.22.5.

Any help appreciated!


I get the error when the CSV is empty, but as long as there’s at least one datapoint, I’m able to add to the DataFrame without issues. This doesn’t really solve the problem but hopefully it’s a useful workaround.

This seems to do it, thanks! Not sure why it doesn’t work with no data either.

@Sanjan_Das, this document will give you a preview of the possibilities for editing DataFrames.

When there’s no data, CSV.jl has no way to determine the element type of each column, so it sets it to Missing. A column with that element type can only hold missing entries, which is why pushing a String fails.
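To see this for yourself, you can print the element type of each column after reading the empty file. This is only a minimal sketch that recreates the test.csv from the original post; if the explanation above holds for your CSV.jl version, both columns should report Missing rather than String and Int64.

```julia
using CSV, DataFrames

# Recreate the empty, header-only file from the original post.
CSV.write("test.csv", DataFrame(A = String[], B = Int[]))
df2 = DataFrame(CSV.File("test.csv"))

# Print each column's name together with its element type.
for (name, col) in pairs(eachcol(df2))
    println(name, " => ", eltype(col))
end
```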


Yeah, this is what’s really going on. Note that you can provide the column types manually, so you would end up with an empty String column instead of a Missing one, which would allow modification.
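A rough sketch of that suggestion, assuming the `types` keyword of CSV.File is given one type per column in file order. I have not checked this against every CSV.jl release, so treat the first half as something to try rather than a guarantee. The second half is a fallback that only relies on DataFrames: because the table has zero rows, each column can simply be replaced with an empty vector of the intended type. The names df3 and df4 are just for illustration.

```julia
using CSV, DataFrames

# Recreate the empty, header-only file from the original post.
CSV.write("test.csv", DataFrame(A = String[], B = Int[]))

# Option 1: tell CSV.jl the column types up front (assumed: one type per column, in order).
df3 = DataFrame(CSV.File("test.csv"; types = [String, Int]))
push!(df3, ("hello", 5))

# Option 2: read as before, then swap in empty, correctly typed columns.
# This is valid here because the DataFrame has no rows, so the lengths match.
df4 = DataFrame(CSV.File("test.csv"))
df4.A = String[]
df4.B = Int[]
push!(df4, ("hello", 5))
```

Option 2 only makes sense when you already know the file is empty; for files that contain rows, the types are inferred from the data and push! works as usual.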


I wonder if CSV should give a warning in this case, something like:
"No rows, could not determine types, defaulting to missing. Solve this via passing ...."
Or even an error.
It doesn’t seem like this is a very useful default behaviour.
