I am transforming a dataset, and the process of transforming it both takes a long time and takes up a large amount of memory. I really only need a single row of the matrix in memory at any given time, so is there some efficient way I can write each row to file?
Ideally, I would like to be able to do so using @parallel.
By “matrix”, do you mean an array of fixed-size elements? If so, then have a look at mmap.
For writing one row at a time, it’s best if the array is stored in row-major order. If you need it to be column-major, then it’s probably more efficient to first store, say, 1024 rows in memory and then write that entire block to disk at once. (Of course mmap might already take care of this.) Rows within a block could be computed in parallel.