Zip the csv file and help save the planet.
The line containing 360G-ACW-20070816 is invalid CSV: the description starts with a " and doesn't close the quote:
1 31127
2 360G-ACW-20070816
3 Shedding Skins
4 "Shedding Skins project will generate a new body of work from the Selkie myths and stories. Maria Hayes will spend time on Bardsy Island and various other areas in the UK mainly the Celtic countries collating information about Selkies. There will also be some site-specific work on beaches as well as marketing and promotional material produced. Maria will also explore more into the digital realm of mixed media and film to see whether this could be a new avenue for her.
5 GBP
6 4916
7 2007-12-11
8 2022-04-29 11:42:21.612355+00
9 4
10 1
11 1
12 22244
13 13
14 62
15 5469
These are the fields; notice field 4.
Disabling the quoting option might be a path forward.
After disabling quotes, the whole file parses:
CSV.read("newex.csv", DataFrame; quoted=false)
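For anyone wondering how to locate a line like this without a CSV parser, here is a minimal sketch using only Base Julia. It assumes every record sits on a single physical line, so any line with an odd number of " characters is suspect. The function name unbalanced_quote_lines is made up for illustration and is not part of CSV.jl.

```julia
# Hypothetical helper (not part of CSV.jl): return the numbers of all lines
# that contain an odd count of double-quote characters.
# Assumption: no quoted field legitimately spans several lines.
function unbalanced_quote_lines(path::AbstractString)
    bad = Int[]
    for (lineno, line) in enumerate(eachline(path))
        # An odd count means a quote was opened (or closed) without a partner.
        if isodd(count(==('"'), line))
            push!(bad, lineno)
        end
    end
    return bad
end
```

Calling unbalanced_quote_lines("newex.csv") should then give candidate line numbers to inspect by hand. A valid CSV file with multi-line quoted fields would produce false positives, so treat the result as a hint rather than a verdict.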
Thank you, Dan.
I'm not sure how you figured this out. Using limit, even with nthreads=1, doesn't seem to get me anywhere close to the offending line.
My method, which read the file fine, retains the leading " in the resulting field. Doing it properly, your way, doesn't.
I have a different problem.
PS: obviously, I hadn't read all the messages.
This loop ends without raising any errors:
for i in 1:nrow(df)
    df[i,:]
    println(i)
end
This one raises the error "access to undefined reference":
for i in 1:nrow(df)
    df[i:i,:]
    println(i)
end
1
2
...
25890
25891
25892
25893
25894
25895
25896
25897
25898
25899
25900
25901
25902
25903
25904
25905
25906
25907
ERROR: UndefRefError: access to undefined reference
julia> df[25908,:]
DataFrameRow
   Row │ id     identifier         title              description  currency  amount_awarde ⋯
       │ Int64  String             String             String       String    String15      ⋯
───────┼────────────────────────────────────────────────────────────────────────────────────
 25908 │ 25872  360G-ACW-20021275  mosaic workshops.  #undef       GBP       4520          ⋯
11 columns omitted
julia> df[25908:25908,:]
ERROR: UndefRefError: access to undefined reference
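The #undef in the description column above suggests that some cells were never assigned. A minimal sketch of how one might list such cells with Base's isassigned follows. The helper undefined_slots is a made-up name, and the small vector is only a stand-in for a real column; applying it to each column of df (for example via eachcol(df)) is an assumption I have not tested on this file.

```julia
# Hypothetical helper: indices of a vector whose slots were never assigned.
undefined_slots(col::AbstractVector) =
    [i for i in eachindex(col) if !isassigned(col, i)]

# Stand-in for a String column with one hole, like the #undef cell above.
col = Vector{String}(undef, 3)
col[1] = "mosaic workshops."
col[3] = "GBP"

# Prints the positions that would raise UndefRefError when read.
println(undefined_slots(col))
```

Run over every column, this would give the (row, column) positions to check against the raw file.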
Another way to see where the problems lie:
using DelimitedFiles
julia> readdlm("newex.txt", '\t')
ERROR: unexpected character ' ' after quoted field at row 25909 column 4
Stacktrace: