I’m very new to Julia Programming.
I have a folder of CSV files (14) that I join to one big data frame and I’m trying to save the big CSV. (rows - 262673020 x columns - 77) when I use CSV.write - I get this ERROR: BoundsError: attempt to access 4194304-element Array{UInt8,1} at index [1:4194305].
So I tried to save it into a feather file but I getting this ERROR: InexactError: trunc(Int32, 2147483662) - This error looks to be reaching some 32 max. but not sure why
I’m not sure what is going on just need some help understand what to do.
MacBook Pro - Atom/Juno
Package - Julia Version 1.5.2 - CSV 0.7.7 - DataFrames v0.21.8 - Glob v1.3.0 - Tables v1.1.0 - Feather v0.5.6 -
using Glob, CSV, Tables, Feather
fileDirectory = "location/CSV"
files = glob("*.csv", fileDirectory)
list_df = [DataFrame(CSV.read(f)) for f in files]
Join_DF = join(list_df[3], list_df[4], list_df[5], list_df[6], list_df[7], list_df[8], list_df[9], list_df[10], list_df[11], list_df[12], list_df[13], list_df[14], on = :INC_KEY, kind = :outer)
Feather.write("location/join_files.feather", Join_DF)
# ERROR: InexactError: trunc(Int32, 2147483662)
CSV.write("location/join_files.csv", Join_DF)
# ERROR: BoundsError: attempt to access 4194304-element Array{UInt8,1} at index [1:4194305].
ERROR: BoundsError: attempt to access 4194304-element Array{UInt8,1} at index [1:4194305]
Stacktrace:
[1] throw_boundserror(::Array{UInt8,1}, ::Tuple{UnitRange{Int64}}) at ./abstractarray.jl:541
[2] checkbounds at ./abstractarray.jl:506 [inlined]
[3] view at ./subarray.jl:158 [inlined]
[4] writecell(::Array{UInt8,1}, ::Int64, ::Int64, ::IOStream, ::Int64, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at /Users/.julia/packages/CSV/4GOjG/src/write.jl:147
[5] #64 at /Users/.julia/packages/CSV/4GOjG/src/write.jl:182 [inlined]
[6] macro expansion at /Users/.julia/packages/Tables/FXXeK/src/utils.jl:54 [inlined]
[7] eachcolumn at /Users/.julia/packages/Tables/FXXeK/src/utils.jl:48 [inlined]
[8] writerow(::Array{UInt8,1}, ::Base.RefValue{Int64}, ::Int64, ::IOStream, ::Tables.Schema{(
[9] #55 at /Users/.julia/packages/CSV/4GOjG/src/write.jl:80 [inlined]
[10] (::CSV.var"#62#63"{CSV.var"#55#56"{Bool,Tables.Schema{(
[12] open(::Function, ::String, ::String) at ./io.jl:323
[13] with at /Users/.julia/packages/CSV/4GOjG/src/write.jl:139 [inlined]
[14] #write#54 at /Users/.julia/packages/CSV/4GOjG/src/write.jl:73 [inlined]
[15] write(::Tables.Schema{(
[16] write(::String, ::DataFrame; delim::Char, quotechar::Char, openquotechar::Nothing, closequotechar::Nothing, escapechar::Char, newline::Char, decimal::Char, dateformat::Nothing, quotestrings::Bool, missingstring::String, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /Users/.julia/packages/CSV/4GOjG/src/write.jl:60
[17] write(::String, ::DataFrame) at /Users/.julia/packages/CSV/4GOjG/src/write.jl:53
[18] top-level scope at none:1
Feather
Stacktrace:
[1] throw_inexacterror(::Symbol, ::Type{Int32}, ::Int64) at ./boot.jl:558
[2] checked_trunc_sint at ./boot.jl:580 [inlined]
[3] toInt32 at ./boot.jl:617 [inlined]
[4] Int32 at ./boot.jl:707 [inlined]
[5] convert at ./number.jl:7 [inlined]
[6] setindex! at ./array.jl:847 [inlined]
[7] offsets(::Type{Int32}, ::Type{UInt8}, ::PooledArrays.PooledArray{Union{Missing, String},UInt32,1,Array{UInt32,1}}) at /Users/.julia/packages/Arrow/q3tEJ/src/lists.jl:300
[8] Arrow.NullableList{String,Int32,P} where P<:Arrow.AbstractPrimitive(::Type{UInt8}, ::PooledArrays.PooledArray{Union{Missing, String},UInt32,1,Array{UInt32,1}}) at /Users/.julia/packages/Arrow/q3tEJ/src/lists.jl:243
[9] NullableList at /Users/.julia/packages/Arrow/q3tEJ/src/lists.jl:251 [inlined]
[10] arrowformat at /Users/.julia/packages/Arrow/q3tEJ/src/arrowvectors.jl:242 [inlined]
[11] getarrow(::PooledArrays.PooledArray{Union{Missing, String},UInt32,1,Array{UInt32,1}}) at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:40
[12] write(::IOStream, ::DataFrame; description::String, metadata::String) at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:18
[13] #20 at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:32 [inlined]
[14] open(::Feather.var"#20#21"{String,String,DataFrame}, ::String, ::Vararg{String,N} where N; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:325
[15] open at ./io.jl:323 [inlined]
[16] #write#19 at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:31 [inlined]
[17] write(::String, ::DataFrame) at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:31
[18] top-level scope at none:1