Importing ZipFile.ReadableFile into DataFrame

I have used ZipFile to open a downloaded compound zipfile. I want to import several of the contained files into DataFrames.

ZipFile.Reader for IOStream(<file ./GBIF-Datasets/>) containing 16 files:

uncompressedsize method  mtime            name
        39310607 Deflate 2021-07-29 01-56 occurrence.txt
        28756841 Deflate 2021-07-29 01-56 verbatim.txt
        16349369 Deflate 2021-07-29 01-56 multimedia.txt
            1641 Deflate 2021-07-29 01-56 citations.txt
            2947 Deflate 2021-07-29 01-56 dataset/1bc719fd-c4e1-410f-b8c1-518cc1addcb5.xml
            1044 Deflate 2021-07-29 01-56 rights.txt
            3430 Deflate 2021-07-29 01-56 metadata.xml
           36912 Deflate 2021-07-29 01-56 meta.xml
ZipFile.ReadableFile(name=occurrence.txt, method=Deflate, uncompresssedsize=39310607, compressedsize=4638199, mtime=1.627487774e9)

I can find no examples or documentation on how to do this.

Two questions:

  1. Is ZipFile the right library to use? There seems to be zero documentation.
  2. How do I access the contained files and load them into a DataFrame? I tried IOBuffer but couldn’t find any way to open the ZipFile.ReadableFile.


I found an example, and my working version follows. It is the kind of example I had imagined finding in the documentation.
It seems like a good API.

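using CSV, DataFrames, ZipFile

uri = "<>.zip"
f = download(uri)          # fetch the archive to a temporary file
z = ZipFile.Reader(f)      # open the archive for reading
# index the contained files by name
z_by_filename = Dict(f.name => f for f in z.files)
# each entry is an IO, so CSV.read can parse it directly
df = CSV.read(z_by_filename["occurrence.txt"], DataFrame)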
250-element Vector{String}:
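
For anyone who wants to try the same pattern without downloading anything, here is a minimal sketch that writes a tiny archive to a temporary file and reads one entry back into a DataFrame. The path `zip_path` and the two-row tab-separated table are made up for illustration and are not the GBIF data; the explicit `delim='\t'` is an assumption about tab-delimited text such as `occurrence.txt`, and CSV.jl can often detect the delimiter itself.

```julia
using CSV, DataFrames, ZipFile

# Build a small archive in a temporary location (hypothetical toy data).
zip_path = tempname() * ".zip"
w = ZipFile.Writer(zip_path)
entry = ZipFile.addfile(w, "occurrence.txt"; method=ZipFile.Deflate)
write(entry, "gbifID\tspecies\n1\tPuma concolor\n2\tLynx rufus\n")
close(w)

# Open the archive, index its entries by name, and parse one entry.
z = ZipFile.Reader(zip_path)
z_by_filename = Dict(f.name => f for f in z.files)
df = CSV.read(z_by_filename["occurrence.txt"], DataFrame; delim='\t')
close(z)  # release the underlying file handle once the data is loaded
```

Looking entries up by name through a Dict avoids relying on the order of `z.files`, which is handy when an archive holds many files and only a few are needed.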

In general, this tutorial (Julia-DataFrames-Tutorial/04_loadsave.ipynb in bkamins/Julia-DataFrames-Tutorial on GitHub) tries to cover most of the standard cases of reading and writing data for DataFrames.jl.


Thanks, I am working my way through your tutorials, and enjoying them very much!

I was jumping ahead, as I am moving my software development from Python to Julia, and had some existing Python DataFrame scenarios I was keen to try.

This is week 2 of Julia for me, so I expect to be lost quite a bit. One thing that I have to learn is how to navigate the libraries.

Could I have seen that the sub-files within the ZipFile would plug into CSV.read() if I were competent with the type system?
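
One way to check this kind of thing from the REPL, as a rough sketch: ask whether the entry type is a subtype of `IO`, since as far as I know CSV.jl accepts an `IO` as input alongside file names and byte vectors. The archive below is a throwaway built only for the demonstration; `zip_path` and the file contents are hypothetical.

```julia
using ZipFile

# Create a throwaway archive with a single entry (hypothetical contents).
zip_path = tempname() * ".zip"
w = ZipFile.Writer(zip_path)
write(ZipFile.addfile(w, "citations.txt"), "example citation text\n")
close(w)

z = ZipFile.Reader(zip_path)
entry = first(z.files)

# The entry is an IO, so generic IO functions apply to it.
is_io = entry isa IO
text = read(entry, String)
close(z)

println(supertype(typeof(entry)))
```

In general, `supertype(T)` and `T <: IO` are quick ways to see where a type sits in the hierarchy, and `methods(f)` lists the argument types a function accepts.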