Using CSV.read() to import data from a data input file into a DataFrame

Yes, the 5th column worked.

Just to show what Peter meant:

julia> using DataFrames, CSV

julia> data = """
       1511805.54010775,-3631401.61267572,134,14.0385102411102,1.58792636637882
       1511880.54010775,-3631401.61267572,135,14.0623467964249,1.63997535275912
       1511955.54010775,-3631401.61267572,136,13.8401810265915,1.48364309144577
       1511730.54010775,-3631476.61267572,248,14.3080113191324,1.8129535268525
       1511805.54010775,-3631476.61267572,249,14.2710283891669,1.78019664985144
       1511880.54010775,-3631476.61267572,250,14.2895778872987,1.79291534883922
       1511955.54010775,-3631476.61267572,251,13.9846463218381,1.55426381397442
       1512030.54010775,-3631476.61267572,252,13.1978960888864,1.12702376308933
       1512105.54010775,-3631476.61267572,253,12.0766192975667,0.867366055025797
       """
"1511805.54010775,-3631401.61267572,134,14.0385102411102,1.58792636637882\n1511880.54010775,-3631401.61267572,135,14.0623467964249,1.63997535275912\n1511955.54010775,-3631401.61267572,136,13.8401810265915,1.48364309144577\n1511730.54010775,-3631476.61267572,248,14.3080113191324,1.8129535268525\n1511805.54010775,-3631476.61267572,249,14.2710283891669,1.78019664985144\n1511880.54010775,-3631476.61267572,250,14.2895778872987,1.79291534883922\n1511955.54010775,-3631476.61267572,251,13.9846463218381,1.55426381397442\n1512030.54010775,-3631476.61267572,252,13.1978960888864,1.12702376308933\n1512105.54010775,-3631476.61267572,253,12.0766192975667,0.867366055025797\n"

julia> CSV.read(IOBuffer(data), DataFrame; delim = ",", header = false)
9×5 DataFrame
 Row │ Column1    Column2     Column3  Column4  Column5  
     │ Float64    Float64     Int64    Float64  Float64  
─────┼───────────────────────────────────────────────────
   1 │ 1.51181e6  -3.6314e6       134  14.0385  1.58793
   2 │ 1.51188e6  -3.6314e6       135  14.0623  1.63998
   3 │ 1.51196e6  -3.6314e6       136  13.8402  1.48364
   4 │ 1.51173e6  -3.63148e6      248  14.308   1.81295
   5 │ 1.51181e6  -3.63148e6      249  14.271   1.7802
   6 │ 1.51188e6  -3.63148e6      250  14.2896  1.79292
   7 │ 1.51196e6  -3.63148e6      251  13.9846  1.55426
   8 │ 1.51203e6  -3.63148e6      252  13.1979  1.12702
   9 │ 1.51211e6  -3.63148e6      253  12.0766  0.867366
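
If the generic `Column1` ... `Column5` names are a nuisance, one option is to pass the names yourself. This is only a sketch: the names `x`, `y`, `nr`, `z_pred` and `s2` are my assumption, borrowed from the variable names used later in this thread, and the two rows are copied from the sample data above. As far as I know, `header` accepts a vector of names, in which case the first line of the input is treated as data.

```julia
using CSV, DataFrames

# First two rows of the sample data above, with no header line.
data = """
1511805.54010775,-3631401.61267572,134,14.0385102411102,1.58792636637882
1511880.54010775,-3631401.61267572,135,14.0623467964249,1.63997535275912
"""

# Assumed column names; passing a vector tells CSV.jl the file has no header row.
cols = [:x, :y, :nr, :z_pred, :s2]
df = CSV.read(IOBuffer(data), DataFrame; delim = ',', header = cols)

col_names = names(df)   # column names as strings
n_rows = nrow(df)       # both input lines are kept as data rows
```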

Here I’m just reading the data from an IOBuffer, but equally I can write this out to a file and read back in from disk:

# df is the DataFrame I read in above
julia> CSV.write("test.csv", df)
"test.csv"

julia> CSV.read("test.csv", DataFrame; delim = ",")
# Same result as above (although the file written to disk has a header now, so I removed header=false here)
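
For completeness, here is a self-contained sketch of that round trip, with the read result actually assigned to `df` first. The two data rows are made up for illustration and the temporary path is my own choice, not something from the posts above.

```julia
using CSV, DataFrames

# Two made-up rows, no header line.
data = "1.5,2.5,10\n3.5,4.5,20\n"

# Assign the result so there is a `df` to write out afterwards.
df = CSV.read(IOBuffer(data), DataFrame; delim = ',', header = false)

# Write to a temporary directory; CSV.write emits a header row
# (Column1,Column2,Column3) before the data.
path = joinpath(mktempdir(), "test.csv")
CSV.write(path, df)

# The file on disk has a header now, so header = false is dropped here.
df_back = CSV.read(path, DataFrame)

roundtrip_ok = (df == df_back)
```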

In my function readdata() I have:
df = DataFrame(a= [1, 2], b = [3, 4])
CSV.write("file.csv", df)
df_new = CSV("file.csv", DataFrame)
Data = CSV.read(filename, DataFrame; delim=',', header=false)

When I run this I get a lot of text from ‘abstractdataframe.jl’ where one line is highlighted:
“syntax df[column] is not supported use df[!, column] instead.”

‘filename’ refers to the datafile I still want to import into a DataFrame called ‘Data’

Please make sure to enclose your code in triple backticks ``` to ensure it’s legible.

The error you are reporting cannot be caused by the lines of code you posted, as you are not indexing a DataFrame. This is how to get the error:

julia> df = DataFrame(rand(2, 2), :auto)
2×2 DataFrame
 Row │ x1        x2       
     │ Float64   Float64  
─────┼────────────────────
   1 │ 0.600275  0.241023
   2 │ 0.542208  0.758831

julia> df[:x1]
ERROR: ArgumentError: syntax df[column] is not supported use df[!, column] instead

which is not what you’re doing in your code. When I run your code I get:

julia> df_new = CSV("file.csv", DataFrame)
ERROR: MethodError: objects of type Module are not callable

from the third line of what you posted - you are doing CSV() which is trying to use CSV (the name of a module) as a function.
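
A minimal sketch of what that third line was presumably meant to be: call a function inside the module rather than the module itself. The temporary path is my own choice for the example.

```julia
using CSV, DataFrames

df = DataFrame(a = [1, 2], b = [3, 4])
path = joinpath(mktempdir(), "file.csv")
CSV.write(path, df)

# CSV is a module, so CSV(path, DataFrame) throws a MethodError.
# CSV.read is the function that parses a file into the given sink type.
df_new = CSV.read(path, DataFrame)

# An equivalent form that goes through CSV.File.
df_alt = DataFrame(CSV.File(path))
```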

I think you have much more fundamental problems than how many columns you are reading in. Your earlier posts in this thread about getting comments back from your code, and the error you just posted, which is unconnected to the code you showed, suggest that you’re not really doing what you think you’re doing.


Thank you Nils,
The first three lines of the code that I posted were suggested by somebody else on this forum. I do not understand what their purpose is. From what you say I understand it’s better to delete them.

When I do that I get the message again that “syntax df[column] is not supported use df[!, column] instead”.
Could that be caused by the next line in my code:

x = Data[:1]

I’ve been using this with julia-1.0

Yes. That is an expected error.

Someone above mentioned that this is an expected error already.
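
To spell out why `Data[:1]` fails and what the supported alternatives look like, here is a small sketch on a made-up two-column DataFrame. Note that `:1` is not a column name: Julia parses it as the integer 1, so `Data[:1]` is the same single-index call as `Data[1]`.

```julia
using DataFrames

# Made-up data, just to have something to index.
df = DataFrame(x1 = [0.1, 0.2], x2 = [0.3, 0.4])

# `:1` is the integer literal 1, not a Symbol.
colon_one_is_int = (:1 === 1)

by_position = df[!, 1]     # first column, no copy made
by_name     = df[!, :x1]   # same column, selected by name
by_property = df.x1        # same column, property syntax
copied      = df[:, 1]     # a fresh copy of the column

# Single-index access is what current DataFrames.jl rejects.
threw = try
    df[1]
    false
catch err
    err isa ArgumentError
end
```

The `!` row selector hands back the stored column itself, while `:` copies it, so use `df[:, 1]` if you plan to modify the vector without touching the DataFrame.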

My code is now:

Data = CSV.read(filename, DataFrame; delim=',', header =false)
df = Data
x = df[!, 1]
y = df[!, 2]
# nr = df[!, 3]        grid point identifier
z_pred = df[!, 4]
s2 = df[!, 5]
N = length(x)
println("Grid size : ", N)

And it WORKED. What a relief!
I thank you guys for all your patience and help.
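
In case it helps anyone finding this later, the working version could be wrapped up roughly like this. It is only a sketch: `readdata` taking the source as an argument and returning a named tuple is my assumption about how the function might be organised, and the two rows come from the sample data earlier in the thread.

```julia
using CSV, DataFrames

# Hypothetical helper: reads a headerless 5-column grid file (or IO source)
# and returns the columns under the names used in this thread.
function readdata(source)
    df = CSV.read(source, DataFrame; delim = ',', header = false)
    return (x = df[!, 1], y = df[!, 2], nr = df[!, 3],
            z_pred = df[!, 4], s2 = df[!, 5])
end

# Two rows from the sample data, read from memory instead of from disk.
sample = """
1511805.54010775,-3631401.61267572,134,14.0385102411102,1.58792636637882
1511880.54010775,-3631401.61267572,135,14.0623467964249,1.63997535275912
"""

grid = readdata(IOBuffer(sample))
N = length(grid.x)
println("Grid size : ", N)
```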


Well done :)