DelimitedFiles:writedlm usage

With using DelimitedFiles, what would be the equivalent of the MATLAB command dlmwrite('filename.csv', arrayname, 'delimiter', ',', '-append')?
Basically, I need to write to the same CSV file and append to the existing data.

writedlm("filename.csv", arrayname, ',') writes to a CSV file, but it does not append. The help mentions some "opts", presumably options, but I could not find a single example. I tried to look at the CSV.jl package, but it seems equally lacking in documentation.

Many thanks

You should look into the CSV.jl package. Example usage would be CSV.write("filename.csv", arrayname, delim=",")

But will it APPEND? I looked at it - yes it says it appends, but it requires data to be in some special form.

All I want is to write an array of floats, say randn(3,5), into a CSV file several times.

I would have thought it a reasonable request that a language should allow this without any bells and whistles, data conversions, etc. Is there any package which can write such an array into a file and append?

Yes. There is also an append keyword argument. See documentation here: https://juliadata.github.io/CSV.jl/stable/#CSV.write

CSV.write("filename.csv", arrayname, delim=",", append=true)

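Edit: I see you are trying to write a plain array to the file. I am pretty sure CSV.jl expects the data in a table format. The easiest way would be to simply convert it to a DataFrame.

using CSV, DataFrames
a = randn(5, 3)
# :auto generates column names (x1, x2, x3); recent DataFrames versions require it for a matrix
CSV.write("filename.csv", DataFrame(a, :auto), delim=",", append=true)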

Data conversion? Say I have array = randn(5,3).

It refuses to do anything with this array.

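Convert to a DataFrame. There are other IO packages that may work directly with an array, but maybe someone else can chime in on that.

using CSV, DataFrames
a = randn(5, 3)
# :auto generates column names (x1, x2, x3); recent DataFrames versions require it for a matrix
CSV.write("filename.csv", DataFrame(a, :auto), delim=",", append=true)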

Look at https://docs.julialang.org/en/v1/stdlib/DelimitedFiles/

I'm guessing that you should open the file yourself and replace the "w" (write) mode with "a" (append). (Not tested)

Thanks for this - done conversion and it works!

I did not mean to be grumpy, and I am most grateful. It is just that the whole thing does not cease to surprise: key functions are delegated to obscure packages, main packages lack obvious things, many things are counterintuitive… Who would have thought that you need to install something like three packages and write a lot of idiotic text to print a float variable to a file…

You can also rethink what you are trying to do. Why would you choose a CSV format for this task with small arrays? You are coming from another language with a limited view of the alternatives. Try to ask the more general question: I am trying to do X, and so far I considered implementing it with Y. People here will likely say: X is better implemented with method Z instead.

In your use case, my guess is that there are many different formats implemented in packages that are more appropriate than CSV for appending small arrays to a file. Try to unlearn bad programming habits you learned in other programming environments, and keep in mind that many people are here to help with the transition.

Writing delimited files is hardly a core feature (compared to, say, +), and DelimitedFiles is not obscure at all — it is part of the standard libraries.

You only need DelimitedFiles, which you do not need to install (not that there would be a problem with that, since it is really easy to install packages), and the solution is a one-liner.

Hi Juliohm

I actually have files growing to about 5 GB in CSV form, and this is why I append relatively small chunks (1000 lines) to them at every step of a loop. Maybe other formats work better for this, but I already have a collection of files like this.

I agree that in the best possible world one should learn a new language as a new language to appreciate its features, and I wish I could do this. But many of us need something done quickly and unfortunately do not have time for any reading. We go for Julia because it speeds things up noticeably (in my experience), so the approach is: let me rewrite the existing code somehow so that it works and works faster, and I will sort out the nice things later on…

Hence the absence of documentation when you search for a very particular thing is frustrating, and the forum is the only option.

Hi Tamas_Papp

If I may ask, which file format is the core feature in Julia, so that it will help me to write an appendable file and then read from it starting at a particular line? If not CSV, then what? There must be something like this…

There is no core feature file format in Julia, even though I don’t know what that’s supposed to mean. The Julia devs are no overlords who decide what is and is not supported. The Julia code that you write works exactly like the Julia code in Base or the standard library.

If you want to append something to a CSV file, you can use either DelimitedFiles or CSV.jl.
For DelimitedFiles, you could use "a" instead of "w" in open:

using DelimitedFiles

open("delimited_file.csv", "a") do io
    for i in 1:4
        writedlm(io, [i i i], ',')
    end
end
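
For the use case described earlier (appending a chunk of rows at every step of a loop), the same idea could be wrapped in a small function. This is only a sketch: append_chunk is a hypothetical helper name, the chunk sizes are made up, and the file is a temporary one created just for the illustration.

```julia
using DelimitedFiles

# Hypothetical helper: append one chunk (a matrix) to a CSV file.
function append_chunk(path::AbstractString, chunk::AbstractMatrix)
    # Mode "a" creates the file if it is missing and appends otherwise.
    open(path, "a") do io
        writedlm(io, chunk, ',')
    end
end

path = tempname() * ".csv"   # temporary file, for illustration only

for step in 1:3
    append_chunk(path, randn(5, 3))   # stand-in for the real per-step data
end

# Read everything back to check that all three chunks were appended.
data = readdlm(path, ',', Float64)
println(size(data))
```

Opening the file once per chunk keeps each write self-contained, so an interrupted run still leaves the earlier chunks on disk. If the loop is very tight, keeping one handle open for the whole loop, as in the example above, avoids the repeated open/close.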

Daniel
Thanks a lot, this should work: the append mode is given at the level of "open", not "write".
Just for the record, it seems to me that CSV.write with this conversion to a DataFrame works quite slowly. But this is a naked-eye observation.

To second @daniel, no specific file format is treated specially in Julia: support is mostly in packages.

Unless you have an overwhelming need for something like HDF5 (which also has a package), I would go with CSV.

Also, while parsing CSV can be tricky, writing it should be fast, especially with DelimitedFiles. Here is an example:

using DelimitedFiles
A = reshape(1:4, 2, :) # we write this twice
open(io -> writedlm(io, A, ','), "/tmp/test.csv", "w") # write
open(io -> writedlm(io, A, ','), "/tmp/test.csv", "a") # append
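
As for reading back from a particular line, readdlm takes a skipstart keyword that skips a number of leading lines. A minimal sketch, assuming a small integer matrix and a temporary file made up for the illustration:

```julia
using DelimitedFiles

path = tempname() * ".csv"   # temporary file, for illustration only

# Write a 4x3 matrix; the rows are 1,5,9 / 2,6,10 / 3,7,11 / 4,8,12.
open(io -> writedlm(io, reshape(1:12, 4, 3), ','), path, "w")

# Skip the first 2 lines and parse the remaining rows as integers.
tail = readdlm(path, ',', Int; skipstart=2)
println(tail)
```

Note that readdlm still loads everything after the skipped lines into memory, so for files of several GB it may be worth iterating with eachline instead.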

This is probably compilation time you’re noticing. CSV.jl speed is pretty competitive, see https://www.queryverse.org/benchmarks/csvreaders/ (the link says please don’t cite yet but I’m a rebel). It may be the case though that you run into some weird corner case with your data.
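
One rough way to check whether compilation is what you are seeing is to time the same call twice in one session. The sketch below uses DelimitedFiles rather than CSV.jl so that it runs without extra packages; write_csv is a hypothetical helper and the array size is arbitrary.

```julia
using DelimitedFiles

A = randn(1000, 3)
path = tempname() * ".csv"   # temporary file, for illustration only

# Hypothetical helper that overwrites the file with matrix M.
write_csv(p, M) = open(io -> writedlm(io, M, ','), p, "w")

t_first  = @elapsed write_csv(path, A)   # includes JIT compilation
t_second = @elapsed write_csv(path, A)   # mostly the actual writing

println("first call: ", t_first, " s, second call: ", t_second, " s")
```

If the second call is much faster than the first, the slowness was mostly one-time compilation and will not matter in a long-running loop.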

Maybe :) I would not insist, but adding one package instead of two (CSV + DataFrames) is always better!

Anyway, many thanks, I got it working, and a direct "translation" of the MATLAB code into Julia gave me a 2.5x speedup! So instead of 5 weeks it will only run 2!

I suspect "writing in Julia" rather than "translating" would do miracles.

If you do some profiling to find the part that's taking all the time and post it, someone would probably take a look and teach you a bit about writing fast Julia in the process.
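
A minimal sketch of what that could look like with the Profile standard library. The work function here is a made-up stand-in for the expensive part of the real program:

```julia
using Profile

# Hypothetical stand-in for the slow part of a program.
function work(n)
    s = 0.0
    for i in 1:n
        s += sin(i)^2
    end
    return s
end

work(10)            # run once first so compilation is not profiled
Profile.clear()     # discard any samples collected earlier
@profile work(10^7) # sample the call stack while this runs
Profile.print()     # print a tree of where the samples landed
```

The lines with the largest sample counts in the printed tree are the ones worth posting.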
