Saving outputs to a CSV file

This is my first time using CSV to save outputs, and I’m working on a small example to get used to it, but it looks like I’m still having trouble with it. Here is the working example:

using CSV 
using DataFrames
function mar(x,y)
    open("myfile.csv", "a")
    for i in 1:10
        global s=x*i+y
        
    end
    
end

Then when I input this:

x=2
y=4
mar(x,y)
CSV.write("myfile.csv",x=x,y=y,s=s)

(generic function with 1 method)
But when I try to read the file, it shows me that the file is empty:

CSV.read("myfile.csv")

0 rows × 0 columns

I don’t know what I’m doing wrong here.

I answered your question previously on StackOverflow here.

Your main issue here is that you missed out a pair of parentheses when copying my answer:

CSV.write("out.csv", (s = s, x = x, y = y)) 

is what I put on SO, while you have

CSV.write("out.csv", s = s, x = x, y = y) 

Note that in my answer, there are parentheses around s=s, x=x, y=y - this will construct a NamedTuple, which satisfies enough of the Tables.jl interface to be written out to csv by CSV.write.
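To see why the parentheses matter, here is a minimal sketch in plain Julia. `describe` is a hypothetical helper (it is not part of CSV.jl) that just reports what a call receives. Without the parentheses, `s = ...` and `x = ...` are passed as keyword arguments, so the call has no table argument at all, which is likely why `CSV.write` did not write anything in the question.

```julia
# `describe` is a hypothetical helper for illustration, not a CSV.jl function.
# It reports how many positional arguments and which keyword names it received.
describe(args...; kwargs...) =
    (npositional = length(args), keywords = collect(keys(kwargs)))

s = [6, 8, 10]
x = [2, 2, 2]

# With parentheses: the NamedTuple is a second positional argument (the table).
with_parens = describe("out.csv", (s = s, x = x))

# Without parentheses: only the file name is positional; s and x become keywords.
without_parens = describe("out.csv", s = s, x = x)
```

Here `with_parens.npositional` is 2 with no keywords, while `without_parens.npositional` is 1 and the keywords are `:s` and `:x`.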

There is an additional issue here compared to your SO post, which is that your x and y are scalars, while your s will be a vector. To construct the table you therefore need to decide what you want to put in the x and y columns. E.g. to repeat the x and y values in all rows you would do:

(s = s, x = fill(x, 10), y = fill(y, 10))

There are also a few unrelated issues in your code, which make me think that it’d be worthwhile for you to spend a day or two reading through the excellent Julia manual to get the hang of the language basics:

  • You open the csv file in your mar function, but this actually isn’t necessary if you want to use CSV.write in the way I suggested;
  • You have a global annotation in your loop, which in the context of your posted snippet doesn’t make sense, as s is not defined anywhere outside your function.
  • You are using DataFrames, although this isn’t actually necessary in your case - CSV.write can handle a simple NamedTuple for writing out data, which you can construct without relying on any external dependencies.
  • Your function doesn’t return anything - this might be related to the point above, as you might be attempting to structure your program by defining s outside the function and then using mar to mutate the global variable s. This is not a great way of going about things for many reasons, not least because non-constant globals are terrible for performance in Julia. Try writing your functions such that they only depend on their arguments and return whatever value you want to calculate, rather than mutating global state.

Example:

using CSV

mar(x, y) = x .* collect(1:10) .+ y

x = 2; y = 4

s = mar(x, y)

This example is just a working example before I use CSV in my actual code. I just want to know how to use CSV when I have a function, because that’s not clear to me yet. Maybe my example code wasn’t clear enough, but the main point is for me to learn how to use it inside a function.

Sorry but I still don’t understand your problem - what is keeping you from just using CSV.write("out.csv", x) where x is whatever NamedTuple/other table you want to write out to file in a function?

I put

CSV.write("out.csv", s = s, x =fill(x, 10), y = fill(y, 10)) 

right after the end of the loop and then close the function. After that I try to read the file with CSV.read("out.csv") and I get this:

ArgumentError: "out.csv" is not a valid file

Have another look at what I wrote in my original reply - you need parentheses around s=s, x=fill(x,10), y=fill(y,10).

s = s, x = fill(x, 10), y = fill(y, 10)

is not the same as

(s = s, x = fill(x, 10), y = fill(y, 10))

See for yourself:

julia> x = 2; y = 4;

julia> s = x .* (1:10) .+ y;

julia> s = s, x = fill(x, 10), y = fill(y, 10)
ERROR: syntax: "10" is not a valid function argument name
Stacktrace:
 [1] top-level scope at REPL[19]:1

julia> (s = s, x = fill(x, 10), y = fill(y, 10))
(s = 6:2:24, x = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2], y = [4, 4, 4, 4, 4, 4, 4, 4, 4, 4])

So the reason CSV.read errors is that you never actually write a csv file with what you’re doing, because you’re not constructing a NamedTuple when you leave out the parentheses.

Again I think it would be helpful to go through the Julia documentation to learn a bit more about the basics of the language.

Even if I put the parentheses, I still get the same error. Like I said, I know how to use it if we don’t have a function, but in this case where I have a function I really don’t get it.
You can see here:

julia> using CSV

julia> function mar(x,y)
           for i in 1:10
               s=x*i+y

           end
           CSV.write("out.csv",(s = s, x =fill(x, 10), y = fill(y, 10)))
       end
mar (generic function with 1 method)

julia> CSV.read("out.csv")
┌ Warning: `CSV.read(input; kw...)` is deprecated in favor of `CSV.read(input, DataFrame; kw...)
└ @ CSV C:\Users\Administrator\.julia\packages\CSV\vohbW\src\CSV.jl:41
ERROR: ArgumentError: "out.csv" is not a valid file
Stacktrace:
 [1] CSV.Header(::String, ::Int64, ::Bool, ::Int64, ::Nothing, ::Int64, ::Bool, ::Nothing, ::Nothing, ::Bool, ::Nothing, ::Nothing, ::Array{String,1}, ::String, ::Nothing, ::Bool, ::Char, ::Nothing, ::Nothing, ::Char, ::Nothing, ::Nothing, ::UInt8, ::Array{String,1}, ::Array{String,1}, ::Nothing, ::Nothing, ::Dict{Type,Type}, ::Nothing, ::Float64, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool) at C:\Users\Administrator\.julia\packages\CSV\vohbW\src\header.jl:92
 [2] #File#26(::Int64, ::Bool, ::Int64, ::Nothing, ::Int64, ::Bool, ::Nothing, ::Nothing, ::Bool, ::Nothing, ::Nothing, ::Array{String,1}, ::String, ::Nothing, ::Bool, ::Char, ::Nothing, ::Nothing, ::Char, ::Nothing, ::Nothing, ::UInt8, ::Array{String,1}, ::Array{String,1}, ::Nothing, ::Nothing, ::Dict{Type,Type}, ::Nothing, ::Float64, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Type{CSV.File}, ::String) at C:\Users\Administrator\.julia\packages\CSV\vohbW\src\file.jl:216
 [3] CSV.File(::String) at C:\Users\Administrator\.julia\packages\CSV\vohbW\src\file.jl:216
 [4] #read#75(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CSV.read), ::String, ::Nothing) at C:\Users\Administrator\.julia\packages\CSV\vohbW\src\CSV.jl:44
 [5] read at C:\Users\Administrator\.julia\packages\CSV\vohbW\src\CSV.jl:40 [inlined] (repeats 2 times)
 [6] top-level scope at REPL[45]:1

I don’t know what i’m still missing

In the code above, you are not executing the function mar, only defining it.

That’s my bad, I forgot to include it. But if I execute it and then try to read the file, it gives me:
0 rows × 0 columns, which means that nothing has been written to the file.

I have no PC at hand to check, but from the code I guess that s is a scalar, whereas the other elements in the NamedTuple are vectors.
Could this cause the issue?

s is not defined at the point where you call write. It is also a scalar, which doesn’t work with CSV.write. The following works fine. Hopefully you can adapt it to what you need.

function mar(x,y)
    s = 0  # define s before the loop so it is still visible after the loop
    for i in 1:10
        s=x*i+y
    end
    CSV.write("out.csv",(s = [s], x = [x], y = [y]))
end
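The scoping point (s not being defined at the point where write is called) can be checked in isolation. This is a minimal sketch in plain Julia; the function names and the variable `loop_value` are made up for illustration. A variable first assigned inside a `for` loop in a function is local to the loop body, so it does not exist after the loop unless it was defined in the function body beforehand.

```julia
# Hypothetical names, for illustration only.
function without_outer_definition(x, y)
    for i in 1:10
        loop_value = x * i + y   # new local, exists only inside the loop body
    end
    # After the loop the name refers to a global (if any), not the loop's local.
    return @isdefined(loop_value)
end

function with_outer_definition(x, y)
    loop_value = 0               # defined in the function body first
    for i in 1:10
        loop_value = x * i + y   # assigns to the function-level variable
    end
    return loop_value            # last value computed in the loop
end
```

In a fresh session with no global called `loop_value`, `without_outer_definition(10, 2)` returns `false`, while `with_outer_definition(10, 2)` returns the last loop value, `10*10 + 2`.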

Yeah, that works fine, except it only shows me the last output (when i=10). Even if I put CSV.write before the end of the loop, it still shows me only the last output:

julia> function mar(x,y)
           for i in 1:10
               s=x*i+y
           end
           CSV.write("out.csv",(s = [s], x = [x], y = [y]))
       end
mar (generic function with 1 method)

julia> mar(10,2)
"out.csv"

julia> CSV.read("out.csv")
1×3 DataFrame
│ Row │ s     │ x     │ y     │
│     │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┤
│ 1   │ 110   │ 10    │ 2     │

What would you like to do? Save all x, y, and s values? Right now you’re throwing all the intermediate values away. s at the end is just the last one.

Yes, I want to save all the x, y and s values. I know that s is the only one that changes with each iteration.

On mobile, so I can’t give you example code on this one. First, keep track of each value of s. Right now you do s = … each time, which throws away the last value of s. Then, copy x and y as many times as s is long. You can use fill for that, like in a previous example above.
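A minimal sketch of those two steps in plain Julia, assuming the same `x*i + y` formula as in the question. The function name `collect_outputs` is made up for illustration. It keeps every value of s by pushing it onto a vector, then repeats x and y to the same length, and returns the result as a NamedTuple of columns.

```julia
# Hypothetical function name, for illustration only.
function collect_outputs(x, y)
    s = typeof(x * 1 + y)[]          # empty vector with the element type of the results
    for i in 1:10
        push!(s, x * i + y)          # keep every value instead of overwriting s
    end
    # Repeat the scalars so that all columns have the same length as s.
    return (s = s, x = fill(x, length(s)), y = fill(y, length(s)))
end

table = collect_outputs(2, 4)
```

The returned `table` has three columns of length 10 and could then be passed as the second argument to `CSV.write`.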

@nilshg’s solution should work for you. Why doesn’t it?

You need the keyword argument append = true in CSV.write. The default is false, which overwrites the file each time.

I’m not sure, but it sounds like @marouane isn’t concerned with writing multiple times to the same file, but rather that the call to write is only giving them a single row, when they expect it to be 10 rows because of the for-loop.

Well, unless CSV.write is happening inside the loop, they shouldn’t expect multiple rows.

@marouane put the CSV.write call inside the loop with append = true.

Realistically, note that a Vector of NamedTuples can be saved to a file using CSV.jl. So you can also collect all the named tuples from your for loop into a vector and save that with a single CSV.write call.
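A minimal sketch of that row-oriented approach, assuming CSV.jl is installed. The function name `mar_rows` and the temporary output path are made up for illustration, and a comprehension stands in for the for loop. Each element of the vector is one row, given as a NamedTuple of scalars.

```julia
using CSV

# Hypothetical function name, for illustration only.
# Builds one NamedTuple per iteration; each one becomes a row of the table.
mar_rows(x, y) = [(s = x * i + y, x = x, y = y) for i in 1:10]

rows = mar_rows(2, 4)

# A single write call for all rows; the path is a temporary file for this sketch.
path = CSV.write(joinpath(mktempdir(), "out.csv"), rows)
```

Because everything is written in one call, there is no need for `append`, and the column names s, x, y are written once at the top of the file.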

Is this an MWE and you want to do something more complicated in reality? If not, there’s no reason to call CSV.write multiple times with append in the loop. Rather, just collect all the stuff you want to write out in vectors and put them together into a NamedTuple, as suggested originally.

E.g.

function mar(x,y)
    s = zeros(10)
    for i in 1:10
        s[i] = x*i + y
    end
    CSV.write("out.csv",(s = s, x = fill(x, 10), y = fill(y, 10)))
end

But even when using append=true, how come I’m still missing the first value, where i=1?

using CSV
function mar(x,y)
    for i in 1:10
        s=x*i+y
        CSV.write("save.csv",(s = [s], x=[x], y=[y]),append=true) 
    end
end

x=2
y=4
mar(x,y)

CSV.read("save.csv")

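9 rows × 3 columns

│ Row │ 6     │ 2     │ 4     │
│     │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┤
│ 1   │ 8     │ 2     │ 4     │
│ 2   │ 10    │ 2     │ 4     │
│ 3   │ 12    │ 2     │ 4     │
│ 4   │ 14    │ 2     │ 4     │
│ 5   │ 16    │ 2     │ 4     │
│ 6   │ 18    │ 2     │ 4     │
│ 7   │ 20    │ 2     │ 4     │
│ 8   │ 22    │ 2     │ 4     │
│ 9   │ 24    │ 2     │ 4     │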
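The output above suggests what is going on: the first row (6, 2, 4) has ended up as the column names. As far as I know, CSV.write does not write a header row when append=true, so the file contains ten data rows and no names, and the reader treats the first data row as the header. One possible fix, sketched here assuming CSV.jl is installed, is to append only from the second iteration on. The function name `mar_append` and the temporary path are made up for illustration.

```julia
using CSV

# Hypothetical function name, for illustration only.
function mar_append(path, x, y)
    for i in 1:10
        row = (s = [x * i + y], x = [x], y = [y])
        # The first call (append = false) creates or overwrites the file and
        # writes the column names; later calls only append a data row.
        CSV.write(path, row; append = i > 1)
    end
    return path
end

path = mar_append(joinpath(mktempdir(), "save.csv"), 2, 4)
rows = CSV.File(path)   # read the file back without needing DataFrames
```

With this change the file starts with the header s,x,y and all ten rows are read back. Note that the first call overwrites any existing file at that path, which also avoids piling up rows from earlier runs.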