Closing files when using CSV.Rows

I’m using CSV.Rows to iterate through a very large CSV file. In base Julia, files that are opened with open should be closed with close (or you can use the do syntax instead). However, after perusing the CSV.jl docs, there doesn’t appear to be a close method for file iterators created with CSV.Rows. How does CSV.jl ensure that files are closed when they’re done being used? Does it just rely on the files being closed when the CSV.Rows object gets garbage collected?


Cc: @quinnj

Sorry for the slow response here, and thanks @nalimilan for the ping. There are a few different paths here depending on how you use CSV.Rows:

  • If you pass the file as a string, like CSV.Rows("filename.csv"), then CSV.jl will mmap the file by default, and it does indeed rely on the CSV.Rows object being garbage collected to unmap the file buffer.
  • You can also pass in your own opened file, like io = open("filename.csv"); CSV.Rows(io), which currently will “slurp” the entire file into a mmapped buffer to be used while reading rows. In that case you need to close the io yourself afterwards.
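To make the two paths concrete, here is a minimal sketch rather than a definitive recipe. The sample file and the count_rows helper are made up for illustration; only CSV.Rows, open, and the do syntax come from the discussion above.

```julia
using CSV

# Hypothetical sample file so the sketch runs on its own.
path = joinpath(mktempdir(), "filename.csv")
write(path, "a,b\n1,2\n3,4\n")

# Hypothetical helper: iterate a CSV.Rows source and count its rows.
function count_rows(source)
    n = 0
    for _ in CSV.Rows(source)
        n += 1
    end
    return n
end

# Path 1: pass the filename. CSV.jl manages the buffer itself, and
# cleanup is left to the garbage collector, as described above.
n_path = count_rows(path)

# Path 2: open the file ourselves. The do-block closes `io` on exit,
# even if an error is thrown while iterating.
n_io = open(path) do io
    count_rows(io)
end
```

With the do form, the handle you opened is closed at a predictable point, regardless of when the garbage collector runs.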

Hopefully that helps!


This applies also to CSV.write, correct?

I thought the output file would be closed immediately (since CSV.write writes a full table, not a single row), but judging from its source code that does not appear to be the case.

Reason for asking: I’m debugging a problem, and it would help tremendously to know whether the file is closed immediately after the function call or only later, whenever the garbage collector happens to run.

Thanks!
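
While that question is open, one way to avoid depending on the answer is to hand CSV.write an IO that you opened yourself, so the close happens at a known point. This is only a sketch under assumptions: the table and the file name are made up, and it says nothing about what CSV.write does internally when given a filename.

```julia
using CSV

# Hypothetical table; a NamedTuple of vectors is a valid Tables.jl source.
tbl = (a = [1, 2], b = [3, 4])

# Hypothetical output location.
out = joinpath(mktempdir(), "out.csv")

# Open the file ourselves; the do-block closes `io` as soon as it exits.
open(out, "w") do io
    CSV.write(io, tbl)
end

# The handle we opened is closed by now, so the file can be read back.
contents = read(out, String)
```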