const count = 10000
function create()
    v = rand(Int64, count)
    open("data.bin", "w") do ofile
        write(ofile, v)
    end
end
function test()
    open("data.bin") do ifile
        v = Vector{Int64}(undef, len)
        v = read!(ifile, v)
        return v
    end
end
This is a very simple, arguably trivial, thing to want to do. Other languages cope with it perfectly well. For example, the same code written in C++ runs over 10,000 times faster.
C++ runtime: few nanoseconds
Julia runtime: nearly a whole millisecond
I can't believe how bad it is. It's a joke. It's a totally trivial operation.
Read block of data using OS call
Perform O(1) reinterpret of returned data so that the runtime treats it as a Vector{Int64} instead of Vector{UInt8}. There is no need for this to do any type-safety or runtime type-checking nonsense.
Since the OS is doing most of the work here, there is no excuse for this.
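For reference, the two steps listed above can be sketched in Julia roughly as follows. This is only an illustration under stated assumptions: the scratch file path and the names `n`, `original`, `bytes`, `ints` and `buf` are hypothetical, and the snippet says nothing about how fast either variant is.

```julia
# Sketch: read raw bytes, then view them as Int64 without copying.
n = 10_000                                 # element count, matching `count` in the post
path = joinpath(mktempdir(), "data.bin")   # hypothetical scratch file
original = rand(Int64, n)
write(path, original)                      # writes n * sizeof(Int64) raw bytes

bytes = read(path)                         # Vector{UInt8} holding the whole file
ints = reinterpret(Int64, bytes)           # lazy view over the same memory, no copy

# Alternative: let read! fill a preallocated Vector{Int64} directly.
buf = Vector{Int64}(undef, n)
open(path) do io
    read!(io, buf)
end
```

Both `ints` and `buf` should compare equal to `original`. `reinterpret` returns a wrapper array rather than a plain `Vector{Int64}`, which may or may not matter for the code that consumes it.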
This code doesn't run as written. Assuming you meant to write,
function test()
    open("data.bin") do ifile
        v = Vector{Int64}(undef, div(count,8))
        v = read!(ifile, v)
        return v
    end
end
I'm seeing a time of 7.2 microseconds. Do you have the C++ code that shows the few-nanosecond timing? I'm surprised by this, given that NVMe drives have a few microseconds of latency.
You're the one asking for assistance in the first place; not being paid for that is expected everywhere. We all participate for free and try to get along here. As for the work itself, you're not doing enough for reproducibility or profiling. Provide all the code, including the benchmarking.
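As a rough idea of what a complete, reproducible benchmark could look like, here is a minimal sketch using only Base. The names `read_ints` and `min_time_ns`, the scratch path and the run count are all hypothetical choices for illustration; the `@btime` macro from the BenchmarkTools.jl package is the more usual tool if installing a package is acceptable.

```julia
# Write the test file once (hypothetical scratch location).
n = 10_000
path = joinpath(mktempdir(), "data.bin")
write(path, rand(Int64, n))

# Function under test: read n Int64 values into a freshly allocated vector.
function read_ints(path, n)
    open(path) do io
        read!(io, Vector{Int64}(undef, n))
    end
end

# Time many runs and keep the minimum, which is least affected by noise.
function min_time_ns(f, runs)
    f()                                # warm-up call so compilation is not timed
    best = typemax(UInt64)
    for _ in 1:runs
        t0 = time_ns()
        f()
        best = min(best, time_ns() - t0)
    end
    return best
end

t = min_time_ns(() -> read_ints(path, n), 1_000)
println("minimum over 1000 runs: ", t / 1_000, " µs")
```

Because the file was just written, these reads are almost certainly served from the OS page cache rather than from disk, so the result says more about call and copy overhead than about drive speed.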
function readtest!(v, filename)
    open(filename, "r") do fid
        read!(fid, v)
    end
    return v
end
With a pre-allocated v, I get a runtime of 43 µs, which translates to 1.7 GB/s, reasonably close to what I believe to be my disk read speed.
Note that caching is likely to play some role here. I'm not sure how to benchmark this in a bulletproof way, but I would guess that something similar is affecting your C++ timing as well.
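The throughput figure can be checked with a couple of lines of arithmetic. The sketch below assumes the file holds 10,000 `Int64` values, as in the original post; the variable names are illustrative.

```julia
nbytes = 10_000 * sizeof(Int64)   # 80_000 bytes in the file
t = 43e-6                         # measured runtime in seconds (43 µs, from the post)
rate = nbytes / t                 # bytes per second

println(round(rate / 1e9, digits=2), " GB/s")     # decimal gigabytes per second
println(round(rate / 2^30, digits=2), " GiB/s")   # binary gibibytes per second
```

This works out to roughly 1.86 GB/s in decimal units, or about 1.73 GiB/s in binary units, so the quoted 1.7 appears to be in binary units.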
Specifically, dual-channel DDR5 RAM only manages ~100 GB/s, so it's somewhat unbelievable to me that this could be finishing in less than ~800 nanoseconds.
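Working the numbers through, a lower bound from memory bandwidth alone can be estimated as below. The ~100 GB/s figure is the approximate value quoted above, not a measurement, and the file size assumes 10,000 `Int64` values as in the original post.

```julia
nbytes = 10_000 * sizeof(Int64)   # 80_000 bytes to move
bandwidth = 100e9                 # assumed memory bandwidth in bytes per second
t_min = nbytes / bandwidth        # time in seconds just to move the data

println(t_min * 1e9, " ns")
```

On these assumptions, moving 80 kB takes on the order of 0.8 microseconds, which is already far more than a few nanoseconds.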