I have a goal similar to the thread "Retrieve data from Amazon S3?", but my data is gzipped text, not TIFF.
I want to read many 130M gzipped text files from S3 one by one, decompress each file in memory, extract a regex match (to be stored later), and then discard the S3 object without ever writing to disk.
So I have this attempt:
for line in ZipFile.Reader(load(Stream(format"GZ", IOBuffer(s3obj))))
but I’m getting ERROR: LoadError: No applicable_loaders found for GZ
I also tried this variant:
for fname in ZipFile.Reader(FileIO.load(IOBuffer(obj)))
with this result: ERROR: LoadError: ArgumentError: Unrecognized RDA format, followed by a dump of raw binary data from the gzipped file (the name conll.paths.csv is visible in it).
Is there a way to do what I want?
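One direction that might work, sketched under assumptions: gzip is a stream compression format rather than a zip archive, so neither ZipFile.Reader nor FileIO.load is a natural fit. The CodecZlib package provides GzipDecompressorStream, which wraps any IO and decompresses as it is read. The sketch below assumes the S3 object body is already in memory as a Vector{UInt8}; the helper name gzip_matches, the sample data, and the regex are hypothetical, and the sample bytes are compressed locally as a stand-in for the real download.

```julia
using CodecZlib  # third-party package: GzipDecompressorStream, GzipCompressor

# Hypothetical helper: collect the first regex match on each line of a
# gzip-compressed byte vector, decompressing in memory as a stream.
function gzip_matches(bytes::Vector{UInt8}, pattern::Regex)
    found = String[]
    stream = GzipDecompressorStream(IOBuffer(bytes))
    try
        for line in eachline(stream)
            m = match(pattern, line)
            # Keep only lines where the pattern matched
            m === nothing || push!(found, m.match)
        end
    finally
        close(stream)  # releases the decompressor state
    end
    return found
end

# Stand-in for the S3 object body (assumption: the real bytes come from S3)
sample = transcode(GzipCompressor, Vector{UInt8}("id=1 foo\nno match here\nid=22 bar\n"))
println(gzip_matches(sample, r"id=\d+"))
```

Because the stream is read line by line, only the compressed bytes and the current line need to be held in memory, and nothing is written to disk.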