RAM needed to initialise large matrices

Hi, I have a basic question about memory management. I am currently working on a high-dimensional problem in which I would need to use a handful of matrices of size 500,000 x 500,000. I was wondering how much RAM is needed to initialise matrices of floats of that size.

I have tried using varinfo() after having initialised a matrix A=zeros(100000, 100000) to get a rough idea, but I am not sure I understand the output. In fact, it says that A requires about 74.506 GiB (which sounds odd to me, considering that I am testing it on a MacBook Air with substantially less RAM).

How should I read the output of varinfo()? Also, is it enough to multiply it by 5 to get a rough idea of the RAM required for some matrix B=zeros(500000, 500000)?

You won’t get me to try this on Windows. A back-of-the-envelope calculation gives me a memory requirement of 80 GB. So for 500,000 x 500,000 you should multiply by 25, not 5, since the number of elements grows with the square of the dimension.

Edit: checked it: clearly a troll.

Edit: @fipelle : you can’t put the system integrity of unsuspecting readers at risk.

julia> A=zeros(100, 100);

julia> varinfo()
  name                    size summary                               
  –––––––––––––––– ––––––––––– ––––––––––––––––––––––––––––––––––––––
  A                 78.164 KiB 100×100 Matrix{Float64} 

julia> 100*100*64/8/1024
78.125

In double precision, an $n \times n$ matrix requires $8n^2$ bytes.

In short, you won’t be able to handle n = 500,000 (without a huge supercomputer), since that corresponds to 2 TB of memory per matrix. People working with such large matrices almost invariably exploit some special structure, e.g. sparsity (if your matrix is mostly zero). We could give you more specific advice, but we’d need to know more about your problem.
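To make the arithmetic concrete, here is a minimal sketch of the 8n^2 rule. The helper names `dense_bytes`, `to_gib` and `to_tb` are made up for illustration; only `sizeof` and `round` come from Julia itself.

```julia
# Bytes needed for the data of a dense n x n matrix with element type T
# (Float64 takes 8 bytes per element).
dense_bytes(n; T=Float64) = n^2 * sizeof(T)

to_gib(bytes) = bytes / 2^30    # binary units, as reported by varinfo()
to_tb(bytes)  = bytes / 10^12   # decimal terabytes

for n in (100, 100_000, 500_000)
    b = dense_bytes(n)
    println("n = $n: $b bytes = $(round(to_gib(b), digits=3)) GiB")
end
```

For n = 100,000 this gives 80 GB, i.e. the 74.506 GiB that varinfo() reports, and for n = 500,000 it gives 2 TB, which is 25 times more because the element count scales with n^2. The small gap between 78.125 KiB and the 78.164 KiB shown by varinfo() for the 100 x 100 case is the array's own bookkeeping overhead.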


Thank you!

That’s exactly what I am trying to do for handling the last huge matrix in my problem - I did manage to reduce the other matrices into smaller blocks. The last one is a selection matrix (ones and zeros, mostly zeros) of size 500,000 x 500,000.

Any suggestions? I am trying to use Sparse Arrays · The Julia Language, but a specialised structure for selection matrices would certainly be better.
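As a rough sketch of what a sparse selection matrix could look like with the SparseArrays standard library: the index vector `idx` below is a made-up example, and the assumption is that each row of the selection matrix picks exactly one element of a vector.

```julia
using SparseArrays

n = 500_000

# Hypothetical selection: row k of S picks element idx[k] of x.
idx = [3, 10, 499_999]
m = length(idx)

# One `true` per row; only the m nonzeros (plus column pointers) are stored.
S = sparse(1:m, idx, trues(m), m, n)

x = collect(1.0:n)

# Multiplying by S gives the same result as plain indexing.
@assert S * x == x[idx]

# Approximate storage of S in bytes, far below the 2 TB of the dense version.
println(Base.summarysize(S))
```

If the matrix is only ever used to pick entries, keeping the index vector and writing `x[idx]` avoids building a matrix at all, which may be the simplest "specialised structure" for this case.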

That is not what you are trying to do with zeros.


No, I was trying to get a rough idea of the memory needed for a matrix with size 500,000 x 500,000 - as indicated in the OP.

But others might try that on their system?

Forcing system instability due to OOM. In my eyes that is not OK.

It should be evident from the history of my account that I am using this discourse properly and certainly not to troll other people - as I have just noticed you mentioned on top. While quite odd, my MacBook does not crash when initialising that matrix - and I am still wondering why btw.

If you have relevant suggestions, I could really use some help. However, I do not think that polluting this post with similar conspiracies is of any use.


This doesn’t apply to Windows as far as I can tell (I checked). I accept this as kind of an apology.

Edit: but this could be quite a serious difference of memory management between the supported platforms.

Another edit: I didn’t really wait for my system to crash, only observed system monitor showing more and more memory being allocated.

Virtual memory, plus a swap file sufficiently larger than physical RAM.


Thank you Jeff! Would you please expand on that?

What’s happening here is that you are requesting the memory, but the OS only actually gives you physical memory when you write to it. Calling zeros doesn’t write to memory, so the OS doesn’t actually have to give you any memory.


In modern computer systems, a process’s memory address space is mapped to physical RAM through page tables. Blocks of memory (aka pages) can be offloaded to disk in the swap or paging file, and reloaded later as needed.


So @fipelle should see indications of this going on in his system monitor, too? (i.e. not that big a difference in memory management between Windows and macOS, then)

I see. I am still going to use a sparse representation, but just for the sake of clarity: as long as there’s enough space on my hard disk it should work, right?

It depends on the swap file configuration. The OS usually puts a limit on the swap file size. Paging will lead to very poor performance.
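Rather than relying on swap, one option is to compare the requested size with physical memory before allocating. The sketch below uses `Sys.total_memory()` from Julia's standard `Sys` module; the helpers `required_bytes` and `fits_in_ram` are hypothetical names introduced only for this example.

```julia
# Bytes needed for the data of a dense n x n Float64 matrix.
required_bytes(n) = 8 * n^2

# Hypothetical guard: true if the matrix data is no larger than physical RAM.
# It ignores memory already in use, so treat it as an optimistic upper bound.
fits_in_ram(n) = required_bytes(n) <= Sys.total_memory()

for n in (1_000, 100_000, 500_000)
    println("n = $n: fits in physical RAM? ", fits_in_ram(n))
end
```

A stricter check could compare against `Sys.free_memory()` instead, since the total figure includes memory that other processes are already using.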


Do we have a difference in demand paging here?

From the sources:

https://answers.microsoft.com/en-us/windows/forum/all/physical-and-virtual-memory-in-windows-10/e36fb5bc-9ac8-49af-951c-e7d39b979938
