Problem with unzipping file and reading it with CSV

alejandromerchan · September 28, 2018, 9:21pm

I’m having issues with code that was working before and would appreciate any help solving this. Despite some comments from @quinnj on Slack, I still haven’t been able to figure it out.

I’m downloading some Zip packages from an FTP server using FTPClient.jl:

using FTPClient

ftp_init()
options = RequestOptions(hostname="transfer.cdpr.ca.gov/pub/outgoing/pur_archives/")
Zipped_file = ftp_get(options, "name_of_file.zip")
ftp_cleanup()

These Zip files each contain about 76 compressed files, most of them .txt, with information I need regarding chemical use in agriculture in California. For context, this is the only way to obtain the data, as they don’t have a public API. Most of the files are not needed, so I prefer not to save them to disk. Instead, I have a function that unzips the archive in memory, reads the file names, and opens only the file that was requested.

using ZipFile, CSV, DataFrames

# Zipped_file.files is an IO buffer
unzipped_file = ZipFile.Reader(Zipped_file.files)
# The unzipped archive opens as expected and lists all 70-plus files
file_of_interest = CSV.read(unzipped_file.files[index_file__of_interest])

This used to work, but I believe something broke it after the CSV upgrade, and I now get the following error:

Error: type ReadableFile has no field mark

As mentioned before, @quinnj told me on Slack that the problem appears to be that “ZipFile.Reader does not fully implement the normal IO methods”, causing CSV to fail. He also suggested reading the file into a byte array. I’ve spent some time trying to get that suggestion to work and haven’t been able to figure it out, so any help is welcome. Also, if there’s any other (better?) way to deal with zip files, I’d like to know.

y4lu · September 29, 2018, 4:43am

Hopefully there’s a readstring() function handy

filebuffer = IOBuffer(readstring(unzipped_file.files[index_file__of_interest]));
file_of = CSV.read(filebuffer);

alejandromerchan · September 29, 2018, 5:35am

Thank you very much, that worked. I had to change readstring to read, because readstring was deprecated. Otherwise it seems to be exactly what I wanted.
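
For anyone landing here later, the approach above can be sketched roughly as follows. This is only a minimal sketch: read_csv_from_zip and its arguments are hypothetical names, not something from ZipFile or CSV, and the two-argument CSV.read(source, DataFrame) form assumes a recent CSV.jl (the version used in this thread accepted CSV.read(io) on its own).

```julia
using ZipFile, CSV, DataFrames

# Hypothetical helper (not part of any package): read one CSV-like member of
# a zip archive held in `io` without writing anything to disk.
function read_csv_from_zip(io::IO, member_name::AbstractString)
    archive = ZipFile.Reader(io)
    # Find the entry whose stored name matches the requested one.
    idx = findfirst(f -> f.name == member_name, archive.files)
    idx === nothing && error("no file named $member_name in the archive")
    # Copy the decompressed bytes into an IOBuffer, which implements the
    # full IO interface that CSV relies on (mark, reset, seek, ...).
    bytes = read(archive.files[idx])
    # Assumes a CSV.jl version that takes the sink type as second argument.
    return CSV.read(IOBuffer(bytes), DataFrame)
end
```

With the variables from the first post, the call would look something like read_csv_from_zip(Zipped_file.files, "name_of_member.txt"), where the member name is a placeholder. Looking a file up by name rather than by index avoids depending on the order of entries in the archive.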
