Has anybody run into the same error?
How should I deal with it?
The DataFrame is quite big: 339000*18 (String type).
Is the problem connected to its size?
Thank you in advance.
How much memory do you have, and how large is the txt file?
I have no idea how to check the memory used by the DataFrame. It could be around 1 GB.
I meant your computer. Oh wait, I see what you're saying; try this:
https://docs.julialang.org/en/v1/base/base/#Base.sizeof-Tuple{Type}
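One caveat: `sizeof` on a DataFrame only reports the size of the wrapper struct, not the data it points to. A minimal sketch of an alternative, assuming the DataFrames package is installed: `Base.summarysize` counts everything reachable from the object. The small `df` below is a made-up stand-in for the real table.

```julia
using DataFrames

# Hypothetical stand-in for the real table; column names borrowed from the thread
df = DataFrame(
    Id = ["a", "b", "c"],
    Body = ["short", "x"^1_000, "mid"^10],
    Urgency = [1, 2, 3],
)

# Total bytes reachable from the DataFrame (columns plus the strings they hold)
total_bytes = Base.summarysize(df)
println("DataFrame: ", round(total_bytes / 2^20, digits = 3), " MiB")

# Per-column breakdown, to see which column dominates
for name in names(df)
    col = df[!, name]
    println(name, ": ", Base.summarysize(col), " bytes")
end
```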
My computer has 32 GB of memory, which is surely enough.
The DataFrame size is 1.3 GB.
Is that too big?
That should be fine. Pinging @quinnj since I don't see any similar issue on GitHub.
Can you share the full error/stacktrace you're seeing? Can you also share the code here so we can see how you're calling CSV.write?
Yes, here it is.
Additionally, there is no error for a 0.8 GB DataFrame, but there is an error for the 1.3 GB DataFrame.
CSV.write("t2.csv", df2)
ERROR: ReadOnlyMemoryError()
Stacktrace:
[1] + at .\int.jl:53 [inlined]
[2] writecell(::Array{UInt8,1}, ::Int64, ::Int64, ::IOStream, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:305
[3] #64 at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:182 [inlined]
[4] macro expansion at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\utils.jl:54 [inlined]
[5] eachcolumn at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\utils.jl:49 [inlined]
[6] writerow(::Array{UInt8,1}, ::Base.RefValue{Int64}, ::Int64, ::IOStream, ::Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}}, ::Tables.ColumnsRow{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}}, ::Int64, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:180
[7] (::CSV.var"#62#63"{CSV.var"#55#56"{Bool,Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}},Tables.RowIterator{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}},CSV.Options{UInt8,UInt8,Nothing,Tuple{}},NTuple{18,Symbol},Int64,Int64,Array{UInt8,1}}})(::IOStream) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:80
[8] open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::CSV.var"#62#63"{CSV.var"#55#56"{Bool,Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}},Tables.RowIterator{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}},CSV.Options{UInt8,UInt8,Nothing,Tuple{}},NTuple{18,Symbol},Int64,Int64,Array{UInt8,1}}}, ::String, ::Vararg{String,N} where N) at .\io.jl:298
[9] open(::Function, ::String, ::String) at .\io.jl:296
[10] with at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:139 [inlined]
[11] write#54 at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:73 [inlined]
[12] write(::Tables.Schema{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{String,String,String,String,Date,Time,Date,Time,Int64,String,String,String,String,Int64,String,String,String,Int64}}, ::Tables.RowIterator{NamedTuple{(:Guid, :Id, :HeadLine, :Body, :VersionCreatedDate, :VersionCreatedTime, :FirstCreatedDate, :FirstCreatedTime, :TakeSequence, :MimeType, :PubStatus, :Language, :AltId, :MessageType, :Subjects, :Provider, :InstancesOf, :Urgency),Tuple{Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Date,1},Array{Time,1},Array{Date,1},Array{Time,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1},Array{String,1},Array{String,1},Array{String,1},Array{Int64,1}}}}, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:68
[13] write#53(::Char, ::Char, ::Nothing, ::Nothing, ::Char, ::Char, ::Char, ::Nothing, ::Bool, ::String, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CSV.write), ::String, ::DataFrame) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:60
[14] write(::String, ::DataFrame) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:53
[15] top-level scope at none:0
Hmmm... nothing sticks out at first glance; is there any way you could share the dataset w/ me? Perhaps compressed and shared via something like http://ge.tt/? Even if just privately to me?
I have found the row that causes the error.
Something is strange: the memory size of the DataFrame (only 1 row) is 0.8 GB, but I checked every column and the largest value is a String of 4 MB.
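For anyone wanting to repeat that per-column check, here is a rough sketch of one way to locate the largest string cell in each column. It assumes the DataFrames package; the toy `df` and the `largest` dictionary are made up for illustration.

```julia
using DataFrames

# Hypothetical stand-in for the real table
df = DataFrame(
    Id = ["a", "b", "c"],
    Body = ["short", "x"^5_000, "medium text"],
    Urgency = [1, 2, 3],
)

# For every string column, record (size in bytes, row index) of its largest cell
largest = Dict{String,Tuple{Int,Int}}()
for name in names(df)
    col = df[!, name]
    eltype(col) <: AbstractString || continue   # skip non-string columns
    nbytes, row = findmax(sizeof.(col))          # sizeof(::String) is its byte length
    largest[string(name)] = (nbytes, row)
    println(name, ": largest cell is ", nbytes, " bytes at row ", row)
end
```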
CSV.write("t4.csv", df4)
ERROR: ArgumentError: 'DataFrameRow{DataFrame,DataFrames.Index}' iterates 'String' values, which don't satisfy the Tables.jl Row-iterator interface
Stacktrace:
[1] invalidtable(::DataFrameRow{DataFrame,DataFrames.Index}, ::String) at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\tofromdatavalues.jl:34
[2] iterate at C:\Users\vbaidin.julia\packages\Tables\FXXeK\src\tofromdatavalues.jl:40 [inlined]
[3] write#57(::Bool, ::Bool, ::Array{String,1}, ::typeof(CSV.write), ::Nothing, ::Tables.IteratorWrapper{DataFrameRow{DataFrame,DataFrames.Index}}, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:96
[4] write(::Nothing, ::Tables.IteratorWrapper{DataFrameRow{DataFrame,DataFrames.Index}}, ::String, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:93
[5] write#53(::Char, ::Char, ::Nothing, ::Nothing, ::Char, ::Char, ::Char, ::Nothing, ::Bool, ::String, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CSV.write), ::String, ::DataFrameRow{DataFrame,DataFrames.Index}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:60
[6] write(::String, ::DataFrameRow{DataFrame,DataFrames.Index}) at C:\Users\vbaidin.julia\packages\CSV\ztQqu\src\write.jl:53
[7] top-level scope at none:0
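A side note on this second error: the stacktrace shows that `df4` is a `DataFrameRow`, not a `DataFrame`, so it is a different failure from the `ReadOnlyMemoryError` above. A `DataFrameRow` is a view that keeps a reference to its parent table, which would likely also explain why a "1 row" object reports a size close to that of the whole parent. A minimal sketch, assuming the DataFrames package and a made-up `df`:

```julia
using DataFrames

# Hypothetical table where row 1 holds one very large string
df = DataFrame(Id = ["a", "b", "c"], Body = ["x"^1_000_000, "small", "tiny"])

df4 = df[2, :]        # single integer row index: a DataFrameRow (view into df)
one_row = df[2:2, :]  # range row index: a 1-row DataFrame

println(df4 isa DataFrameRow)         # the view type from the error message
println(one_row isa DataFrame)        # a proper table
println(Base.summarysize(df4))        # also counts the parent that the view references
println(Base.summarysize(one_row))    # counts only what the 1-row table references
```

Passing something like `df[i:i, :]` to `CSV.write` instead of `df[i, :]` should at least hand it a proper table, so the single offending row can be tested on its own.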
I checked: it seems CSV can't write a DataFrame that has a String value larger than 4 MB in a column.
Is it a bug?
There shouldn't be any limitation like this.
Where is the best place to report this bug?
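The 4 MB threshold matches the default write buffer size (`bufsize = 2^22`) that later CSV.jl releases document for `CSV.write`. The sketch below assumes such a later release (not necessarily the version in the stacktrace above) and simply raises the buffer so that a row larger than 4 MB fits. The table contents and the temporary file path are made up for illustration.

```julia
using CSV, DataFrames

# Hypothetical table with a single cell of about 5 MiB, above the 4 MiB default buffer
big = "x"^(5 * 2^20)
df = DataFrame(Id = ["a"], Body = [big])

# Write to a temporary file with a 16 MiB row buffer.
# `bufsize` is a keyword of CSV.write in later CSV.jl releases; treat its
# availability in your installed version as an assumption to verify.
path = tempname() * ".csv"
CSV.write(path, df; bufsize = 2^24)

written = filesize(path)
println("wrote ", written, " bytes to ", path)
```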
You can also try CSVFiles.jl:
using CSVFiles
df |> save("foo.csv")
It has a known problem with a very large number of columns, but that threshold is at a couple of hundred columns, so the 18 you have should be no problem at all. Beyond that, there shouldn't be any limits on size.