Hi all,
I’m working on bringing faster / more flexible compression to JLD2.
Currently you can do @save "test.jld2" {compress=true} a b c
and JLD2 will then compress sufficiently large arrays using CodecZlib.
This works reasonably well, but it gets slow for large datasets, and there are
faster compression algorithms available, such as Blosc.jl and CodecLz4.jl.
As an improvement, I think it would be neat to change the default compression algorithm
and also let the user choose which algorithm should be used.
My current best idea is to allow passing Symbols as arguments to the calls, e.g.
@save "test.jld2" {compress=:lz4}
jldopen("test.jld2", "w"; compress=:blosc) do .... end
but this does not grant full access to all features of the compression libraries.
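One way to keep the Symbol shorthand and still reach every option of a library might be to also accept an already configured compressor object. Below is a minimal sketch of that dispatch. All names in it (`AbstractCompressor`, `ZlibStandIn`, `Lz4StandIn`, `COMPRESSOR_DEFAULTS`, `resolve_compressor`) are hypothetical stand-ins made up for illustration, not JLD2 or codec API, and the default levels are arbitrary.

```julia
# Hypothetical stand-ins for real compressor types (not part of any package).
abstract type AbstractCompressor end

struct ZlibStandIn <: AbstractCompressor
    level::Int
end

struct Lz4StandIn <: AbstractCompressor
    level::Int
end

# Registry mapping user-facing symbols to zero-argument constructors
# that build a compressor with (arbitrarily chosen) default settings.
const COMPRESSOR_DEFAULTS = Dict{Symbol,Function}(
    :zlib => () -> ZlibStandIn(6),
    :lz4  => () -> Lz4StandIn(1),
)

# `compress=true` keeps working and selects the default algorithm;
# `compress=false` means no compression.
resolve_compressor(flag::Bool) = flag ? COMPRESSOR_DEFAULTS[:zlib]() : nothing

# `compress=:lz4` selects an algorithm with its default settings.
function resolve_compressor(name::Symbol)
    haskey(COMPRESSOR_DEFAULTS, name) ||
        throw(ArgumentError("unknown compression algorithm: $name"))
    return COMPRESSOR_DEFAULTS[name]()
end

# A compressor object configured by the user is passed through unchanged,
# which gives access to every option its constructor supports.
resolve_compressor(c::AbstractCompressor) = c
```

With something like this, `compress=:lz4` and `compress=Lz4StandIn(9)` would both be valid, and the Symbol form stays a convenience on top of the object form.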
Another question concerns the libraries themselves:
Do I add all options as dependencies?
Currently, I’m trying to load them dynamically, similar to what FileIO does,
but I’m running into world-age problems:
ERROR: MethodError: no method matching CodecLz4.LZ4FrameCompressor()
The applicable method may be too new: running in world age 27829, while current world is 27830.
Closest candidates are:
CodecLz4.LZ4FrameCompressor(; kwargs...) at /home/jonas/.julia/packages/CodecLz4/2JFgC/src/frame_compression.jl:30 (method too new to be called from this world context.)
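The same kind of error can be reproduced without any compression package. Here is a minimal sketch, assuming the dynamic loading happens inside a function that is already running. The names `late_a`, `late_b`, `call_directly` and `call_with_invokelatest` are made up for illustration, and `@eval` stands in for loading a package at runtime. `Base.invokelatest` is the Base function that performs a call in the newest world.

```julia
# Declare the generic functions up front, without any methods.
function late_a end
function late_b end

# Adds a method while running, then calls it directly. The caller is
# still executing in the world it was entered in, so the new method is
# "too new" and the call throws a MethodError.
function call_directly()
    @eval late_a() = :ok
    return late_a()
end

# Same setup, but the call goes through Base.invokelatest, which looks
# up methods in the latest world and therefore sees the new method.
function call_with_invokelatest()
    @eval late_b() = :ok
    return Base.invokelatest(late_b)
end
```

Note that `Base.invokelatest` is a dynamic call whose return type cannot be inferred, so it is probably best used once at a coarse boundary (for example when constructing the compressor) rather than inside a hot loop.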
The code is in JuliaIO/JLD2.jl pull request #264, "Compression with TranscodingStreams API".
What are your thoughts?
Better suggestions for the API?
Any experience with world-age problems?
Opinions on dependencies vs. dynamic loading?
Best,
Jonas
PS:
None of this will break reading old files!