This must have been asked before, but I couldn’t find it.
How do I use readbytes! to read into a preallocated matrix? Or is there a better function?
Currently I’m using this:
data = Array{Complex{Int16}}(undef, 10, 1000, 3, 1) # Allocate correct size, done once
dataVec = vec(data) # flatten to a Vector{Complex{Int16}} without copying
dataBytes = reinterpret(UInt8, dataVec) # dataBytes is now an alias of data reinterpreted as a byte array
# do this in a processing loop many times
bytesRead = readbytes!(fid, dataBytes, sizeof(dataBytes))
but isn’t there a more direct way to use data, instead of creating dataBytes as an alias to the data?
Not that I know of (short of re-implementing readbytes! via low-level ccalls). What’s wrong with using reinterpret?
(Note that reading bytes directly into Int16 values in this way is endian-dependent, i.e. the binary files may not be portable to other architectures, although in practice right now almost everyone uses little-endian machines.)
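If portability ever matters, one way to handle it is to declare the file format little-endian and byte-swap in place after reading, using Base’s `ltoh` (a no-op on little-endian hardware). A minimal sketch; `to_host_order!` is an illustrative name, not from the original code:

```julia
# Sketch: normalize a freshly read buffer from little-endian file order
# to host byte order. ltoh is the identity on little-endian machines
# and a bswap on big-endian ones, so this costs nothing where it matters.
function to_host_order!(a::AbstractArray{Complex{Int16}})
    @inbounds for i in eachindex(a)
        z = a[i]
        a[i] = Complex(ltoh(real(z)), ltoh(imag(z)))
    end
    return a
end

a = Complex{Int16}[1 + 2im, 3 + 4im]
to_host_order!(a)  # on little-endian hardware this leaves `a` unchanged
```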
In principle nothing is wrong with using reinterpret, it just is annoying that I need to reshape and reinterpret as separate steps.
On the problems I have mostly been working with so far, trying to get high performance out of Julia in that particular domain, I have developed the hypothesis that the key to performance is being very careful about creating temporary variables. By now I’m a bit obsessive-compulsive about any temporary variable, no matter how big. At times my Julia coding starts to feel like a fight to keep the GC away: create temporary variables once and reuse them the whole time (except on the infrequent occasions when they change size). Here I’m torn between wrapping functions in a let block to make the temporaries static, or passing them in and cluttering the calling interface.
Originally the function performing the readbytes! took in the Complex{Int16} array (the user-facing interface) and created the Vector{UInt8} internally, which meant such a temporary variable was created every time the function was called. I have now settled on letting the user do the conversion outside, once, and having my function take a Vector{UInt8}.
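The settled-on pattern might look like this (a sketch; `read_block!` and the `IOBuffer` standing in for the real file handle are my own illustrative names, not from the original code):

```julia
# Sketch: the reinterpret happens once at the call site; the reading
# function only ever sees a byte vector, so no per-call temporaries.
read_block!(io, bytes::AbstractVector{UInt8}) = readbytes!(io, bytes, length(bytes))

data  = Array{Complex{Int16}}(undef, 10, 1000, 3, 1)
bytes = reinterpret(UInt8, vec(data))     # done once, outside the loop
io = IOBuffer(rand(UInt8, sizeof(data)))  # stand-in for the real file
n = read_block!(io, bytes)                # inside the loop: fills `data` directly
```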
I keep feeling I want to be able to manually allocate a block of memory and then, at various times, instantiate arrays of different types that use that block as their data storage. I’m not sure how one would do it, but I guess something like that must be possible, since it is needed to interface with C function calls.
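Something like that can be sketched with `unsafe_wrap`, though it comes with the usual caveats: you are responsible for keeping the underlying buffer alive (e.g. with `GC.@preserve`) and for the element type’s alignment. A sketch under those assumptions:

```julia
# Sketch: one raw byte buffer, viewed as arrays of different element types.
buf = Vector{UInt8}(undef, 64)   # the manually "allocated" block
fill!(buf, 0x00)
GC.@preserve buf begin
    as_i16 = unsafe_wrap(Array, Ptr{Int16}(pointer(buf)), 32)            # 32 Int16s
    as_c16 = unsafe_wrap(Array, Ptr{Complex{Int16}}(pointer(buf)), (4, 4)) # 4×4 matrix
    as_i16[1] = Int16(258)       # visible through buf and as_c16 as well
end
```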
Among other things. There is no need to form hypotheses: see the performance tips in the manual. Preallocating outputs is covered there, but other things are equally important. Profiling and benchmarking will give you specific information for particular cases.
In general, this is a mistake. For operations on a large (length n >> 1) array, any O(1) costs (e.g. creating a few small heap-allocated temporaries like reinterpret wrappers) will often be negligible. This is especially true if you are doing O(n) I/O as with readbytes!.
Of course, ultimately you have to do profiling and benchmarking to be certain of where your performance is going, but as a general rule I wouldn’t worry about small allocations outside of innermost loops.
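For instance, `@allocated` gives a quick sanity check that the wrapper creation really is O(1) (a sketch; the array size is just the example from above):

```julia
# Sketch: measure bytes allocated by the vec+reinterpret step alone.
make_view(v) = reinterpret(UInt8, vec(v))

data = Array{Complex{Int16}}(undef, 10, 1000, 3, 1)
make_view(data)                    # warm-up call, so compilation is not counted
small = @allocated make_view(data)
# `small` is a constant few hundred bytes, independent of sizeof(data),
# versus the 120000 data bytes moved by each readbytes! call.
```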
Hoare’s famous quote about premature optimization and “small efficiencies” comes to mind.