I would like to do a piece-wise reduction (on sub-arrays) of a large array that is too big to fit into memory. Is there a ‘storage-based array’ that implements @views?
Are you looking for [MappedArrays.jl](https://github.com/JuliaArrays/MappedArrays.jl) (lazy in-place transformations of arrays)?
Thanks, @Oscar_Smith. I'm not really looking for implicit mapping, but rather a data structure whose allocation size is limited by storage (not memory) that I can take sub-arrays of. Basically this would allow me to port a lot of existing code to handle some new data.
The Mmap package provides memory-mapped file-based arrays that sound like they should do what you want. Only the portion you are accessing is paged into memory.
(Every `AbstractArray` subtype supports `@views` and subarrays.)
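As a sketch of how these pieces fit together (the file name, element type, and dimensions below are placeholders, not from the thread), one could memory-map a file-backed matrix and reduce it block by block via views, so only the pages each view touches get faulted into memory:

```julia
using Mmap

# Map an existing binary file as a 10 × 10^6 Float64 matrix
# (illustrative name and sizes; adjust to your data).
io = open("/tmp/mmap.bin", "r")
A = Mmap.mmap(io, Matrix{Float64}, (10, 10^6))

# Piece-wise reduction: sum each block of 10^5 columns.
# @view avoids copying the sub-array into memory.
blocksums = [sum(@view A[:, j:j+10^5-1]) for j in 1:10^5:10^6]

close(io)
```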
Thanks, @stevengj. Even better that it’s in the standard library.
Hi, @stevengj. For benchmarking the Mmap-based implementation, what is the best way to create some files with large random arrays?
The following (adapted from the docs) obviously doesn’t work because it has to create and write the whole array at once:
```julia
n = 10
s = open("/tmp/mmap.bin", "w+")
write(s, 10)
write(s, 10^n)
write(s, rand(10, 10^n))
```
You could just call `write` in a loop, e.g.

```julia
open("mmap.bin", "w") do io
    write(io, 10^10)
    for i = 1:1000
        write(io, rand(10^7))
    end
end
```
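Once written this way, the file can be mapped back without loading it all at once. The following is a sketch (mirroring the illustrative sizes in the write loop above; adjust to your file layout): `Mmap.mmap` starts mapping at the stream's current position, so reading the leading length value first skips the 8-byte header:

```julia
using Mmap

open("mmap.bin", "r") do io
    n = read(io, Int)                      # the length value written first
    A = Mmap.mmap(io, Vector{Float64}, n)  # maps from the current position
    # ...process A piece-wise via views without loading it all...
end
```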
Is the `do` construct required here? I used almost the same for-loop strategy in this post, but it fails.
No, but it’s good style when writing a file, since it automatically closes the file at the end of the `do...end` block even if an exception occurs. It’s equivalent to

```julia
io = open(...)
try
    ...write stuff...
finally
    close(io)
end
```

but is less verbose.
Also, I realized that the `w+` was what was wrong with my loop; apparently allowing reading as well as creation on `open()` also prevents `write()` from appending. When I changed it to `w` like yours, it worked…