Hi, I have some code whose output I don’t control, and it generates a whole bunch of files per unique index. We have a restriction on the number of files (inodes) on our filesystem, so it would be nice to make a tarball per index which can later be unpacked to analyze the data for an index of interest. In addition, some of the files per index are quite large, so compression with gzip is also attractive. While I could do this with a bash script, is this a good way to do it in Julia? Here’s an MWE:
using Tar, GZip, DelimitedFiles
## Dummy section to generate data similar to mine
# set of unique data indices
lines = [1, 2]
# write <index>_a.txt through <index>_j.txt for each index (e.g. 01_a.txt)
for l in lines
foreach(c -> writedlm(string(l, pad = 2) * "_" * c * ".txt", rand(10)), string.('a':'j'))
end
# make a directory to store tarballs per index
mkpath("tarfiles")  # like mkdir, but no error if it already exists
## Actual section:
# How I would actually read, tar and gzip files similar to mine
files = readdir()
for l in lines
idx = startswith.(files, string(l, pad=2))  # safer than occursin, which matches the index anywhere in the name
tarname = joinpath(pwd(), "tarfiles", string(l, pad=2) * ".tar")
Tar.create(pwd(), tarname) do path
basename(path) in files[idx]
end
io = open(tarname)
fh = GZip.open(tarname * ".gz", "w")
write(fh, io)   # stream the tar contents through the gzip writer
close(fh)
close(io)       # close the tar handle before deleting it (required on Windows)
rm(tarname)     # keep only the compressed .tar.gz
end
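For what it’s worth, Tar.create also accepts an IO handle as the tarball argument, so each index could be tarred straight into a gzip stream — the uncompressed .tar then never exists on disk, which saves an inode and a copy step. A sketch along the lines of the MWE above; `tar_index`/`untar_index` and the two-digit `"NN_"` filename convention are my own naming, not part of your code:

```julia
using Tar, GZip

# Tar all files for one index directly into a gzip stream,
# e.g. tar_index(1) produces tarfiles/01.tar.gz.
function tar_index(l; outdir = "tarfiles")
    prefix = string(l, pad = 2)
    fh = GZip.open(joinpath(outdir, prefix * ".tar.gz"), "w")
    try
        # Tar.create takes an optional predicate; the IO target means
        # the tar bytes go straight through the gzip writer.
        Tar.create(p -> startswith(basename(p), prefix * "_"), pwd(), fh)
    finally
        close(fh)
    end
end

# Later, decompress and unpack one index in a single pass,
# e.g. untar_index(1) unpacks into tarfiles/01/.
function untar_index(l; outdir = "tarfiles")
    prefix = string(l, pad = 2)
    fh = GZip.open(joinpath(outdir, prefix * ".tar.gz"))
    try
        Tar.extract(fh, joinpath(outdir, prefix))
    finally
        close(fh)
    end
end
```

Note that Tar.extract requires the destination directory to be empty or nonexistent, so extracting the same index twice needs a fresh target.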