Does somebody have the same error?
How can I deal with it?
The DataFrame is quite big: 339000*18 (String type).
Is the problem connected to size?
Thank you in advance.
how much memory do you have and how large is the txt file?
I have no idea how to check the memory used by the DataFrame. It could be around 1 GB.
I mean your computer; oh wait, I see what you're saying, try this:
https://docs.julialang.org/en/v1/base/base/#Base.sizeof-Tuple{Type}
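A note on measuring this: `sizeof` on a container reports only its immediate size, not the strings it points to, so `Base.summarysize` is probably the more useful call here. Below is a minimal sketch using a plain NamedTuple of column vectors as a stand-in for the DataFrame (the data and the names `table`, `total_bytes`, `shallow_bytes` and `largest_cell` are made up for illustration). `Base.summarysize` accepts any object, so the same call should work on the real DataFrame.

```julia
# A small column table standing in for the DataFrame (hypothetical data).
table = (
    Id = ["a", "b", "c"],
    Body = ["short", repeat("x", 1_000_000), "medium text"],
    Urgency = [1, 2, 3],
)

# Total bytes reachable from the object, including the string contents.
total_bytes = Base.summarysize(table)

# sizeof only counts the vector's own storage (pointers), not the strings.
shallow_bytes = sizeof(table.Body)

# Largest single cell, in bytes, for each string column.
largest_cell = Dict(
    name => maximum(sizeof, col)
    for (name, col) in pairs(table) if eltype(col) <: AbstractString
)

println("total: ", total_bytes, " bytes")
println("shallow size of Body column: ", shallow_bytes, " bytes")
println("largest cell per string column: ", largest_cell)
```

Comparing the total against the largest cell per column is a quick way to see whether one oversized string dominates the table.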
My computer has 32 GB of memory. That's enough for sure.
The DataFrame size is 1.3 GB.
Is it too big?
Should be fine; pinging @quinnj since I don't see any similar issue on GitHub.
Can you share the full error/stacktrace you're seeing? Can you share the code here to see how you're calling CSV.write?
Yes, here it is.
Additionally, there is no error for a 0.8 GB DataFrame, but there is an error for the 1.3 GB DataFrame.
CSV.write("t2.csv", df2)
ERROR: ReadOnlyMemoryError()
Stacktrace:
[1] + at .\int.jl:53 [inlined]
[2] writecell(::Array{UInt8,1}, ::Int64, ::Int64, ::IOStream, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:305
[3] #64 at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:182 [inlined]
[4] macro expansion at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\utils.jl:54 [inlined]
[5] eachcolumn at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\utils.jl:49 [inlined]
[6] writerow(::Array{UInt8,1}, ::Base.RefValue{Int64}, ::Int64, ::IOStream, ::Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}}, ::Tables.ColumnsRow{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}}, ::Int64, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:180
[7] (::CSV.var"#62#63"{CSV.var"#55#56"{Bool,Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}},Tables.RowIterator{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}},CSV.Options{UInt8,UInt8,Nothing,Tuple{}},NTuple{18,Symbol},Int64,Int64,Array{UInt8,1}}})(::IOStream) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:80
[8] open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::CSV.var"#62#63"{CSV.var"#55#56"{Bool,Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}},Tables.RowIterator{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}},CSV.Options{UInt8,UInt8,Nothing,Tuple{}},NTuple{18,Symbol},Int64,Int64,Array{UInt8,1}}}, ::String, ::Vararg{String,N} where N) at .\io.jl:298
[9] open(::Function, ::String, ::String) at .\io.jl:296
[10] with at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:139 [inlined]
[11] write#54 at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:73 [inlined]
[12] write(::Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}}, ::Tables.RowIterator{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}}, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:68
[13] write#53(::Char, ::Char, ::Nothing, ::Nothing, ::Char, ::Char, ::Char, ::Nothing, ::Bool, ::String, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CSV.write), ::String, ::DataFrame) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:60
[14] write(::String, ::DataFrame) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:53
[15] top-level scope at none:0
Hmmm... nothing sticks out at first glance; is there any way you could share the dataset w/ me? Perhaps compressed and shared via something like http://ge.tt/? Even if just privately to me?
I have found the row that gives the error.
Something is strange: the memory size of the DataFrame (only 1 row) is 0.8 GB, but I checked every column and the largest is a string column of 4 MB.
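One possible explanation for the 0.8 GB reading, assuming the single row was taken by indexing (the error further down shows `df4` is a `DataFrameRow`) and assuming the size was measured with `Base.summarysize`: a `DataFrameRow` is a view into its parent table, so the measurement would count the whole parent, not just the one row. The sketch below shows the same effect with a plain vector and `view`; the data and variable names are hypothetical.

```julia
# Hypothetical column: one very large string followed by many small ones.
col = vcat([repeat("x", 4_000_000)], fill("small", 1_000))

# A view of one element keeps a reference to the whole parent vector.
one_cell_view = view(col, 2:2)

# A copy of the same element owns only its own data.
one_cell_copy = col[2:2]

# summarysize follows references, so the view is charged for its parent.
view_bytes = Base.summarysize(one_cell_view)
copy_bytes = Base.summarysize(one_cell_copy)

println("view: ", view_bytes, " bytes")
println("copy: ", copy_bytes, " bytes")
```

If that is what happened, the 4 MB per-column figure is the better estimate of the row's real size.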
CSV.write("t4.csv", df4)
ERROR: ArgumentError: 'DataFrameRow{DataFrame,DataFrames.Index}' iterates 'String' values, which don't satisfy the Tables.jl Row-iterator interface
Stacktrace:
[1] invalidtable(::DataFrameRow{DataFrame,DataFrames.Index}, ::String) at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\tofromdatavalues.jl:34
[2] iterate at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\tofromdatavalues.jl:40 [inlined]
[3] write#57(::Bool, ::Bool, ::Array{String,1}, ::typeof(CSV.write), ::Nothing, ::Tables.IteratorWrapper{DataFrameRow{DataFrame,DataFrames.Index}}, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:96
[4] write(::Nothing, ::Tables.IteratorWrapper{DataFrameRow{DataFrame,DataFrames.Index}}, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:93
[5] write#53(::Char, ::Char, ::Nothing, ::Nothing, ::Char, ::Char, ::Char, ::Nothing, ::Bool, ::String, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CSV.write), ::String, ::DataFrameRow{DataFrame,DataFrames.Index}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:60
[6] write(::String, ::DataFrameRow{DataFrame,DataFrames.Index}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:53
[7] top-level scope at none:0
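Note that this second error is a different one from the `ReadOnlyMemoryError` above: the trace shows `CSV.write` was handed a `DataFrameRow` rather than a `DataFrame`, and a single row is not a table. A minimal sketch of writing one row as a one-row table, assuming the CSV and DataFrames packages are installed; the data, the row index and the temporary path are made up for illustration.

```julia
using CSV, DataFrames

# Hypothetical stand-in for the real table.
df = DataFrame(Id = ["a", "b", "c"], Body = ["one", "two", "three"])

row = df[2, :]             # DataFrameRow: a view of one row, not a table
one_row = df[2:2, :]       # DataFrame with a single row (a copy)
same_row = DataFrame(row)  # converting the row also gives a one-row DataFrame

# Write to a temporary directory so the example leaves no files behind.
path = joinpath(mktempdir(), "t4.csv")
CSV.write(path, one_row)

println(read(path, String))
```

Indexing with a range (`2:2`) instead of a scalar (`2`) is what keeps the result a `DataFrame`, so it can be used to retry the write on just the suspect row.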
I checked. It seems CSV can't write a DataFrame with a string column larger than 4 MB.
Is it a bug?
There shouldn't be any limitation like this.
Where is the best place to report this bug?
You can also try CSVFiles.jl:
using CSVFiles
df |> save("foo.csv")
It has a known problem with a very large number of columns, but that threshold is at a couple of hundred columns, so the 18 you have should be no problem at all. Beyond that there shouldn't be any limits on size.