Download decompressed .zip-archive from url

maximikos · December 11, 2023, 3:42pm

Hi,

I would like to extract and download files from a .zip-archive stored online (here) without having to download intermediate files. The .zip-archive consists of .txt and .json files and is nested.

I have found two references that work for .tar files (on Discourse and reddit), but I cannot figure out how to do the same for .zip archives. The reddit post suggested the package UrlDownload.jl; it does not work in this case and returns multiple warnings: “Data format unknown is not supported.”

So far, I have been trying variations along the lines of:

using HTTP, ZipFile

unzip_from_url(link, dir) = HTTP.open("GET", link) do io
    zarchive = ZipFile.Reader(io)
    for f in zarchive.files
        FileName = split(f.name, "/")
        # Directory part of the entry (`Not(end)` would need InvertedIndices.jl)
        FilePath = joinpath(dir, FileName[1:end-1]...)
        # mkpath creates missing parents and does not error on existing directories
        mkpath(FilePath)
        if FileName[end] != ""
            # Regular file entry: write its contents to disk
            p = joinpath(dir, FileName...)
            write(p, read(f))
        end
    end
    close(zarchive)
end

The reason I don’t want to download the .zip-archive first and then extract it locally is that there are multiple such archives at the above url, each around 700 MB and covering a single year, and I don’t want to have to store both the zipped folders and the extracted ones.

mrufsvold · December 11, 2023, 6:11pm

You might benefit from ZipStreams.jl!
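
For comparison, here is a minimal sketch of a different approach than streaming: it buffers the whole archive in memory and hands it to ZipFile.jl, so no .zip file is ever written to disk. This is not ZipStreams.jl, and it assumes one archive (around 700 MB here) fits in RAM. The helper name extract_zip_bytes is made up for illustration, unzip_from_url reuses the name from the question, and Downloads is the Julia standard library.

```julia
using Downloads   # standard library
using ZipFile

# Hypothetical helper: extract every entry of an in-memory zip archive into `dir`.
# Entry names are trusted as-is; no check against ".." path components is made.
function extract_zip_bytes(bytes::Vector{UInt8}, dir::AbstractString)
    # ZipFile.Reader needs a seekable stream; an IOBuffer provides one.
    zarchive = ZipFile.Reader(IOBuffer(bytes))
    try
        for f in zarchive.files
            # Zip entry names use "/" as separator on every platform.
            target = joinpath(dir, split(f.name, '/')...)
            if endswith(f.name, "/")
                mkpath(target)            # directory entry
            else
                mkpath(dirname(target))   # make sure parent folders exist
                write(target, read(f))    # decompress and write the file
            end
        end
    finally
        close(zarchive)
    end
    return dir
end

# Download the archive into memory (assumption: it fits in RAM), then extract it.
function unzip_from_url(url::AbstractString, dir::AbstractString)
    buf = Downloads.download(url, IOBuffer())
    return extract_zip_bytes(take!(buf), dir)
end
```

Calling unzip_from_url(link, dir) once per archive keeps only the extracted files on disk. If memory is too tight for that, a streaming reader such as the one suggested above is probably the better fit.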
