EDITED: Thanks to Elrod for politely pointing out a very obvious mistake in my code. I’m leaving the question open because, even with the fix, JLD2 is still significantly slower than HDF5.
Hi all,
I asked a question on StackOverflow here about the fastest way to save matrices of floats. I was directed to JLD2, and the docs page suggests that performance might be comparable to serialize. I just put together a quick test for writing matrices and found JLD2 to be about 4 times slower than serialize and HDF5. Are these timings expected, or is there a way to speed this up? I’m on Julia v0.6 on Ubuntu 16.04, and just ran Pkg.update() before posting. My (very simple) test code follows:
using JLD2, FileIO, HDF5
# Write N random 1000x1000 matrices with JLD2, deleting each file after writing
function f_create_jld(N::Int)
    dp = "/home/colin/Temp/"
    for n = 1:N
        fp = "$(dp)$(n).jld2"
        x = randn(1000,1000)
        @save fp x
        rm(fp)
    end
end
# Same test, writing comma-delimited text with writedlm
function f_create_dlm(N::Int)
    dp = "/home/colin/Temp/"
    for n = 1:N
        fp = "$(dp)$(n).csv"
        writedlm(fp, randn(1000,1000), ',')
        rm(fp)
    end
end
# Same test, writing to dataset "G/D" with HDF5's h5write
function f_create_h5(N::Int)
    dp = "/home/colin/Temp/"
    for n = 1:N
        fp = "$(dp)$(n).h5"
        h5write(fp, "G/D", randn(1000, 1000))
        rm(fp)
    end
end
# Same test, using the built-in serialize
function f_create_slz(N::Int)
    dp = "/home/colin/Temp/"
    for n = 1:N
        fp = "$(dp)$(n)"
        fid1 = open(fp, "w")
        serialize(fid1, randn(1000, 1000))
        close(fid1)
        rm(fp)
    end
end
# Warm-up run with N = 1 so that compilation time is excluded from the timings
N = 1
f_create_jld(N)
f_create_dlm(N)
f_create_h5(N)
f_create_slz(N)
Then, setting N = 10, I get:
julia> @time f_create_jld(N)
0.452258 seconds (924 allocations: 76.376 MiB, 1.94% gc time)
julia> @time f_create_dlm(N)
2.784344 seconds (10.02 M allocations: 418.429 MiB, 0.57% gc time)
julia> @time f_create_h5(N)
0.106710 seconds (214 allocations: 76.303 MiB, 1.11% gc time)
julia> @time f_create_slz(N)
0.105692 seconds (224 allocations: 76.313 MiB, 4.09% gc time)
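
To make the "about 4 times slower" figure concrete, here is a small sketch that turns the timings above into ratios relative to the fastest method. The names timings, baseline, and ratios are only illustrative, and the numbers are copied by hand from the @time output above, not measured again:

```julia
# Wall-clock seconds copied from the @time output above (N = 10).
timings = [("jld2", 0.452258), ("dlm", 2.784344), ("h5", 0.106710), ("slz", 0.105692)]

# Use the fastest method as the baseline.
baseline = minimum(map(last, timings))

# Ratio of each method's time to the baseline.
ratios = [(name, t / baseline) for (name, t) in timings]

for (name, r) in ratios
    println(name, ": ", r, "x the fastest time")
end
```

On these numbers JLD2 comes out a bit over 4 times slower than both serialize and HDF5, which are within about 1% of each other.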