Best practice for storing data in Packages

An alternative to Artifacts is DataDeps.jl.
You can use either a normal DataDep, with the data stored externally (e.g. on FigShare or Zenodo),
or a ManualDataDep, which works for data stored in <project>/deps/data
(or you can put in instructions on how to manually fetch the data).
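
As a rough sketch of the two options, registration could look something like the following. The names, message text, and URL are placeholders I made up for illustration, and in a real package the `register` calls would normally go inside the module's `__init__()` function.

```julia
using DataDeps

# Normal DataDep: the data lives on an external host and is fetched on first use.
# "GoodData" and the example.org URL are hypothetical placeholders.
register(DataDep(
    "GoodData",
    "Hypothetical dataset for illustration; describe the source and license here.",
    "https://example.org/GoodData.csv",
    # A checksum would normally be passed as a fourth argument.
    # Without one, DataDeps warns on first download.
))

# ManualDataDep: nothing is downloaded. The message tells the user where the
# data has to be placed (or how to obtain it by hand).
register(ManualDataDep(
    "GoodDataManual",
    "Put the files in deps/data/GoodDataManual inside the package directory.",
))
```

Registering does not download anything by itself; the download only happens the first time the datadep is resolved.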

DataDeps lets you avoid worrying about where the data is stored,
because instead of writing things like ./../data/GoodData
you write datadep"GoodData" and it resolves to a string that is the file path
(you can also do datadep"GoodData/myfile.csv" etc.).
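
For example, assuming a datadep called "GoodData" has already been registered and contains a file named myfile.csv (both hypothetical), using it might look roughly like this:

```julia
using DataDeps

# Assumes "GoodData" was registered elsewhere, e.g. in the package's __init__().
function read_good_data()
    dir  = datadep"GoodData"              # path to the datadep's directory
    file = datadep"GoodData/myfile.csv"   # path to one file inside it
    return dir, read(file, String)
end
```

The string macro is resolved when the function is called, so the first call is what triggers the download (or the manual-install message).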

IIRC you can do similar with artifacts using artifact"GoodData" but I am not 100% sure

There are a number of pros and cons between Artifacts and DataDeps.
One pro of DataDeps is that it works with Julia 1.0.x (the LTS), not just with 1.3+.
The others are around being more flexible for transport (it can use a secure download, e.g. via AWSS3.jl or GoogleDrive.jl) and having post-fetch methods for unpacking arbitrary archives (not just tarballs).
A downside of DataDeps is that it lacks the content addressing that Artifacts use, which is really clever and means it is basically impossible to run into a name collision.
Artifacts also know how to clean themselves up when they are not needed anymore.
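
To illustrate the transport and post-fetch flexibility mentioned above, here is a minimal sketch using the `fetch_method` and `post_fetch_method` keywords. The datadep name and URL are made up, and the custom fetch function just wraps Base's `download` as a stand-in for something like an authenticated S3 or Google Drive fetch.

```julia
using DataDeps

register(DataDep(
    "GoodDataArchive",                      # hypothetical name
    "Hypothetical zipped dataset for illustration.",
    "https://example.org/GoodData.zip";     # placeholder URL
    # fetch_method receives (remote_path, local_directory) and must return
    # the path of the file it downloaded.
    fetch_method = (remote, localdir) -> begin
        localfile = joinpath(localdir, basename(remote))
        download(remote, localfile)
        localfile
    end,
    # post_fetch_method is called with the fetched file's path;
    # `unpack` (exported by DataDeps) extracts common archive formats.
    post_fetch_method = unpack,
))
```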
