I’m having trouble with some code that used to work, and I’d appreciate any help solving this. Despite some comments from @quinnj on Slack, I still haven’t been able to figure it out.
I’m downloading some zip archives from an FTP server using FTPClient.jl:
using FTPClient
ftp_init()
options = RequestOptions(hostname="transfer.cdpr.ca.gov/pub/outgoing/pur_archives/")
# the file name is a placeholder and must be passed as a string
Zipped_file = ftp_get(options, "name_of_file.zip")
ftp_cleanup()
Each zip archive contains about 76 compressed files, most of them .txt, with information I need about chemical use in agriculture in California. For context, this is the only way to obtain the data, as they don’t have a public API. Most of the files are not needed, so I prefer not to save them to disk. Instead, I have a function that unzips the archive in memory, reads the file names, and opens only the file that was requested.
using ZipFile, CSV, DataFrames
# Zipped_file.files is an IO buffer
unzipped_file = ZipFile.Reader(Zipped_file.files)
# the unzipped archive opens as expected, and a list of all 70-plus files can be seen
file_of_interest = CSV.read(unzipped_file.files[index_file_of_interest])
This used to work, but I believe something broke after I upgraded CSV.jl, and I now get the following error:
Error: type ReadableFile has no field mark
As mentioned before, @quinnj told me on Slack that the problem appears to be that “ZipFile.Reader does not fully implement the normal IO methods”, causing CSV to fail. He also suggested reading the file into a byte array. I’ve spent some time trying to get that suggestion to work without success, so any help is welcome. Also, if there’s any other (better?) way to deal with zip files, I’d like to know about that too.
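For reference, the byte-array suggestion might look something like the sketch below. It has not been run against the actual archives. It assumes the zip data is already available as a seekable IO (such as an IOBuffer) and that the requested entry is delimited text CSV.jl can parse with default settings. `read_zip_entry` is a hypothetical helper name, not part of ZipFile.jl or CSV.jl.

```julia
using ZipFile, CSV, DataFrames

# Hypothetical helper: pull one named entry out of an in-memory zip archive
# and parse it as delimited text.
function read_zip_entry(zip_io::IO, entry_name::AbstractString)
    archive = ZipFile.Reader(zip_io)
    # Look the entry up by name instead of relying on its position in the archive.
    idx = findfirst(f -> f.name == entry_name, archive.files)
    idx === nothing && error("no entry named $entry_name in the archive")
    # Decompress the whole entry into a Vector{UInt8} first, so CSV.jl is handed
    # an ordinary IOBuffer rather than the ZipFile.ReadableFile itself.
    bytes = read(archive.files[idx])
    return DataFrame(CSV.File(IOBuffer(bytes)))
end
```

With the names from the code above, the call would presumably be something like `read_zip_entry(Zipped_file.files, "some_file.txt")`, where the entry name is a placeholder.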