Hello! I am wondering if I can use Artifacts.toml to automatically download static resources I’m using in my Julia package. The catch is that these are (uncompressed) binary files. Here is a minimal example:
[openflights]
lazy = true
git-tree-sha1 = "d5b01b5c2ff323f8cefb6922a80c8dc9bd2414bd"
[[openflights.download]]
url = "https://raw.githubusercontent.com/jpatokal/openflights/master/data/routes.dat"
sha256 = "bd373706238134f619c624c606dccc74c05c2582a977c489c81de501735f2390"
If I try to access the artifact, I get an error:
julia> rootpath = artifact"openflights"
Downloaded artifact: openflights
Downloading artifact: openflights
ERROR: /var/folders/7b/bcssbzdd57s5l6vb4jq6yshw0000gn/T/jl_mFK9QV-download.gz
/var/folders/7b/bcssbzdd57s5l6vb4jq6yshw0000gn/T/jl_mFK9QV-download.gz
Open ERROR: Can not open the file as [gzip] archive
ERRORS:
Downloaded artifact: openflights
ERROR: Unable to automatically install 'openflights' from '/Users/moe/Projects/Research/Lifted Inference/QuasiStableColors/Artifacts.toml'
All the examples I could find in the documentation refer to .tar.gz archives of folders. Is what I’m trying to do possible?
Yes, an artifact is always a tarball, though I believe compression other than gzip is supported, including no compression.
It is mainly intended for downloading tarballs that you produced specifically to be an artifact, as opposed to downloading arbitrary files from the internet.
And even if it worked now, your artifact would likely break in the future, because the URL points at the master branch of a git repo, whose contents are likely to change. Artifacts are static. If you need a new version of the data, you need a new artifact.
Some options:
- Use Base.download to fetch the data somewhere, for example into a scratch space as provided by Scratch.jl.
- DataDeps.jl (though I’ve never used it personally).
- Ask the repo owner nicely to make a GitHub release, which will automatically produce a .tar.gz of the whole repo, and use that as an artifact. But you’ll need to do this whenever you want to update the data.
- Copy the data into a .tar.gz yourself and put it somewhere static on the internet.
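The last option can be sketched in Julia using only the Tar and SHA standard libraries. This is a minimal sketch, not an official artifact workflow: the helper name `pack_artifact`, the file names and the placeholder contents are made up for illustration, and it writes an uncompressed tarball, relying on the assumption above that uncompressed tarballs are accepted.

```julia
using Tar, SHA

# Pack a directory into an uncompressed tarball and return the two hashes
# that an Artifacts.toml entry needs. `pack_artifact` is a hypothetical helper.
function pack_artifact(src_dir::AbstractString, tarball::AbstractString)
    Tar.create(src_dir, tarball)                   # write the tarball
    tree_hash = Tar.tree_hash(tarball)             # value for `git-tree-sha1`
    file_hash = bytes2hex(open(sha256, tarball))   # value for `sha256`
    return (git_tree_sha1 = tree_hash, sha256 = file_hash)
end

# Demo with placeholder data; in practice `src` would hold the real routes.dat.
src = mktempdir()
write(joinpath(src, "routes.dat"), "placeholder contents\n")
tarball = joinpath(mktempdir(), "openflights.tar")

hashes = pack_artifact(src, tarball)
println("git-tree-sha1 = \"", hashes.git_tree_sha1, "\"")
println("sha256 = \"", hashes.sha256, "\"")
```

The two printed values would go into the `git-tree-sha1` and `sha256` fields of the Artifacts.toml entry, with `url` pointing at wherever the tarball ends up being hosted. `artifact"openflights"` would then return the directory that contains routes.dat, not the file itself.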
Thanks for the detailed reply @cjdoris.
For my use case, accessing data from academic archives, it seems DataDeps.jl is exactly what I needed. The data being static is a requirement for reproducibility.
(Thanks for pointing out the data is mutable in the example given, I gotta fix that.)
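For reference, a DataDeps.jl registration for a single file might look roughly like the sketch below. The name "OpenFlightsRoutes" and the message text are made up for illustration. The URL and SHA-256 are the ones from the example above, so the checksum may no longer match, because that URL points at a mutable branch.

```julia
using DataDeps

# In a package, this registration would normally live in the module's __init__().
dep = DataDep(
    "OpenFlightsRoutes",                          # hypothetical name for this data dependency
    "Route data from the OpenFlights project.",   # message shown to the user before downloading
    "https://raw.githubusercontent.com/jpatokal/openflights/master/data/routes.dat",
    "bd373706238134f619c624c606dccc74c05c2582a977c489c81de501735f2390",  # expected SHA-256
)
register(dep)

# Nothing is downloaded at registration time. The file is fetched and
# checksum-verified on first access, for example:
# routes_file = datadep"OpenFlightsRoutes/routes.dat"
```

Unlike an artifact, this works directly on a plain file, with no tarball involved.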
Would there be interest in a contribution to add general file support to Artifacts.toml? I’d have to add this code to base Julia, correct?