Hi, I have a basic question about memory management. I am currently working on a high-dimensional problem in which I need to use a handful of matrices of size 500,000 × 500,000. I was wondering how much RAM is needed to initialise matrices like these as floats.
I have tried calling varinfo() after initialising a matrix A = zeros(100000, 100000) to get a rough idea, but I am not sure I understand the output. It says that A requires about 74.506 GiB, which sounds odd to me, considering that I am testing this on a MacBook Air with substantially less RAM.
How should I read the output of varinfo()? Also, is it enough to multiply it by 5 to get a rough idea of the RAM required for some matrix B=zeros(500000, 500000)?
In double precision, an n × n matrix requires 8n² bytes (8 bytes per Float64 entry).
In short, you won’t be able to handle n = 500,000 as a dense matrix (without a huge supercomputer), since that corresponds to 2 TB of memory. People working with such large matrices almost invariably exploit some special structure, e.g. sparsity (if your matrix is mostly zeros). We could give you more specific advice, but we’d need to know more about your problem.
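You can sanity-check these numbers in the REPL with plain arithmetic (no assumptions beyond 8 bytes per Float64):

```julia
# Memory footprint of a dense n×n Float64 matrix: 8n² bytes.
bytes(n) = 8 * n^2
gib(b) = b / 2^30   # bytes → GiB

gib(bytes(100_000))   # ≈ 74.506 GiB, matching varinfo()'s report
gib(bytes(500_000))   # ≈ 1862.6 GiB, i.e. roughly 2 TB
```

Note that memory scales with n², so going from n = 100,000 to n = 500,000 multiplies the footprint by 25, not 5.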
That’s exactly what I am trying to do for the last huge matrix in my problem - I did manage to break the other matrices into smaller blocks. The last one is a selection matrix (ones and zeros, mostly zeros) of size 500,000 × 500,000.
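A selection matrix is an ideal case for SparseArrays, since only the nonzeros are stored. A sketch, assuming one 1 per row at hypothetical column positions `cols`:

```julia
using SparseArrays

n = 500_000
rows = collect(1:n)
cols = rand(1:n, n)          # hypothetical column picked by each row
S = sparse(rows, cols, ones(n), n, n)

nnz(S)                       # 500_000 stored entries
# Storage ≈ nnz·(8 B value + 8 B row index) + (n+1)·8 B column pointers,
# i.e. on the order of 10 MB instead of 2 TB.
```

For a selection matrix you may not need a matrix at all: if row i selects entry cols[i], then S * x is just x[cols], so storing the index vector alone is enough.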
It should be evident from the history of my account that I am using this Discourse properly and certainly not to troll other people - as I just noticed was suggested above. Oddly enough, my MacBook does not crash when initialising that matrix - and I am still wondering why, by the way.
If you have relevant suggestions, I could really use some help; I do not think that polluting this thread with accusations like that is of any use.
What’s happening here is that you are requesting the memory, but the OS only hands over physical memory when you write to it. zeros obtains its buffer as zero-filled pages straight from the OS, so it never has to write to the memory itself, and the OS doesn’t yet have to back the allocation with physical RAM.
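A small way to see this (the exact behaviour is OS-dependent, but on macOS and Linux the untouched zero pages cost essentially nothing):

```julia
# zeros reserves zero-filled virtual pages; the OS commits physical
# RAM only when a page is first written.
A = zeros(10_000, 10_000)   # ~0.75 GiB of *virtual* memory
sizeof(A)                   # 800_000_000 bytes reported either way

# Writing touches every page, forcing the OS to back it with real RAM
# (this is the step that would exhaust a laptop for a huge n):
fill!(A, 1.0)
```

If you watch the process in Activity Monitor, its resident memory should jump only around the fill!, not at the zeros call.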
In modern computer systems, a process’s memory address space is mapped to physical RAM through page tables. Blocks of memory (pages) can be offloaded to disk in the swap or paging file and reloaded later as needed.
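For reference, you can compare a matrix’s footprint against the machine’s physical RAM from within Julia itself:

```julia
# Physical RAM on this machine, in GiB:
Sys.total_memory() / 2^30

# What a dense 500_000 × 500_000 Float64 matrix would need, in TiB:
8 * 500_000^2 / 2^40   # ≈ 1.8 TiB — far beyond any laptop's RAM,
                       # so writing to all of it would thrash the swap file
```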