Hello everyone. I updated Julia from version 1.7.2 to 1.8.0 and got a surprise: a program that used 2 GB of RAM now uses 10 GB, and the number of subprocesses shown in htop also increased from 8 to 64! Is this normal? I was a bit worried.
That’s really concerning. Any chance you can give an example that reproduces the behavior? Is there a corresponding speed increase? Also, how many cores does your system have?
Whenever I can, I make sample code, but in this case there is no way, because the code is large. I still don’t know if it is more efficient, because the execution time varies a lot, between 12 and 24 hours. The system is a 64-core Ubuntu 20.04 server. On my notebook, with 4 cores and 8 GB of RAM, this increase did not occur.
Are you using a lot of matrix multiplies? I believe in 1.8 we increased the maximum number of BLAS threads.
Yes, I use the LinearAlgebra package a lot and make many products of ComplexF64 Hermitian matrices of dimensions 8x8x2x1000 and other smaller dimensions.
Try lowering the number of BLAS threads (or, if you want your program to be faster, use MKL).
Interesting: BLAS threads 1.7.2 = 8; 1.8.0 = 32. I’ll test it tomorrow and get back to you.
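For anyone wanting to repeat this comparison, here is a minimal sketch of reading and changing the BLAS thread count from within a session. `BLAS.get_num_threads` and `BLAS.set_num_threads` come from the LinearAlgebra standard library; the helper name `with_blas_threads` is hypothetical and only there for illustration.

```julia
using LinearAlgebra

# Report what this session is currently using.
println("Logical CPU threads: ", Sys.CPU_THREADS)
println("BLAS threads:        ", BLAS.get_num_threads())

# Hypothetical helper: run `f` with `n` BLAS threads, then restore the old value.
function with_blas_threads(f, n::Integer)
    old = BLAS.get_num_threads()
    BLAS.set_num_threads(n)
    try
        return f()
    finally
        BLAS.set_num_threads(old)
    end
end

# Check that the setting is actually applied inside the call.
seen = with_blas_threads(() -> BLAS.get_num_threads(), 1)
println("BLAS threads inside helper: ", seen)
```

Reading the value back right after setting it is a quick way to confirm that the call took effect in the session that runs the computation.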
Instead of just lowering the number of BLAS threads I recommend that you benchmark your code properly to see if using more BLAS threads gives you better performance (it probably should) as @Oscar_Smith said before. After all, if you have a 64 core system you typically don’t want to only use a small fraction of the cores.
Also, guessing from the number of cores (64), does this happen to be an AMD system? If yes, you might want to try BLISBLAS.jl or MKL.jl, see GitHub - carstenbauer/julia-dgemm-noctua: DGEMM Benchmarks on Noctua 1/2 at PC2.
I used the command BLAS.set_num_threads(8) right after using LinearAlgebra, and it had no effect.
Each run lasts an average of 18 hours, so it is hard to benchmark, but I will try anyway. I think it is great when I can use many cores, but I can’t always, because of other users.
Julia Version 1.8.0
Commit 5544a0fab76 (2022-08-17 13:38 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 64 × AMD EPYC 7452 32-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, znver2)
Threads: 1 on 64 virtual cores
For initial testing, you should probably start by just testing the speed of complex hermitian matrix multiplication.
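A rough sketch of such a test, assuming 8x8 ComplexF64 Hermitian matrices as mentioned earlier in the thread. The function names, the matrix count, the repetition count, and the thread counts tried are all assumptions chosen for illustration; only standard-library calls are used, so the timings are coarse.

```julia
using LinearAlgebra

# Build `count` random n×n ComplexF64 Hermitian matrices (B + B' is Hermitian).
function random_hermitians(n::Integer, count::Integer)
    return [(B = rand(ComplexF64, n, n); B + B') for _ in 1:count]
end

# Time `reps` passes of multiplying consecutive pairs into a preallocated buffer.
function time_products(mats, reps::Integer)
    C = similar(mats[1])
    mul!(C, mats[1], mats[2])            # warm-up so compilation is not timed
    return @elapsed for _ in 1:reps, i in 1:length(mats)-1
        mul!(C, mats[i], mats[i+1])
    end
end

mats = random_hermitians(8, 1000)
original = BLAS.get_num_threads()
timings = Dict{Int,Float64}()
for nthreads in (1, 2, 4)                # thread counts are an arbitrary choice
    BLAS.set_num_threads(nthreads)
    timings[nthreads] = time_products(mats, 20)
    println(nthreads, " BLAS threads: ", round(timings[nthreads]; digits=4), " s")
end
BLAS.set_num_threads(original)           # restore the previous setting
```

A test like this runs in seconds rather than hours, so it can be repeated on both Julia versions. For matrices this small, threading overhead may matter more than the arithmetic, so the larger arrays from the real program are worth trying too.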
Could you try in Julia 1.9-DEV nightly? E.g. with:
julia --heap-size-hint=2GB
I would at least be intrigued to know if this new option, from 1.9 NEWS, helps:
New option --heap-size-hint=<size> gives a memory hint for triggering greedy garbage collection. The size might be specified in bytes, kilobytes (1000k), megabytes (300M), gigabytes (1.5G)
The input and output are the same size; only some temporary intermediate computation is larger, so this should help? Eventually, with a better GC, this wouldn’t be needed, and we will all have 10 GB+ in the future anyway, so it will be peanuts…
From the text you posted from NEWS.md, it seems like it should be
julia --heap-size-hint=2G
(i.e. without the B)
And where do I download the Julia 1.9-DEV nightly? I downloaded it from Nightly builds, but it didn’t work. I got the error:
/usr/local/bin/julia301b62ae73: cannot execute binary file: Exec format error
Maybe missing chmod, or see here:
Did you for sure download the right binary?
My somewhat older 1.9-DEV build didn’t complain with:
julia --heap-size-hint=2GB
But it’s a good catch; it might not work, if I’m reading the code correctly:
Does it actually parse only the last letter, B, as the unit, hit the default/break case, and then treat “2GB” as “2” (bytes)?
I assumed it was intentional and equivalent (though I didn’t check whether it actually does the same thing; GXXX and XXX also run). But 1.9 isn’t released, and if G is in the spec and GB isn’t, do not rely on GB ever working. It could be changed to work before release, or not (or even changed back to the spec by accident, though that is unlikely).
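To make the question concrete, here is a small model of the behavior being hypothesized: read the leading number, then treat only the final character as the unit. This is not Julia's actual option parser. The function name `hint_bytes_lastchar`, the binary multipliers, and the case-insensitive unit handling are all assumptions made for illustration.

```julia
# Hypothetical model of "only the last character is the unit".
# Not Julia's real parser; multipliers (powers of 1024) are an assumption.
function hint_bytes_lastchar(arg::AbstractString)
    m = match(r"^\d*\.?\d+", arg)        # leading number, e.g. "2" or "1.5"
    m === nothing && return nothing
    value = parse(Float64, m.match)
    unit = uppercase(last(arg))
    multiplier = unit == 'K' ? 2.0^10 :
                 unit == 'M' ? 2.0^20 :
                 unit == 'G' ? 2.0^30 :
                 1.0                      # default case: plain bytes
    return floor(Int, value * multiplier)
end

println(hint_bytes_lastchar("2G"))       # 2147483648
println(hint_bytes_lastchar("2GB"))      # 2
```

Under this model, "2GB" would silently become a 2-byte hint rather than an error. That would explain why the option is accepted, but it remains a guess until checked against the real source.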
I had installed the wrong architecture. Now it’s right:
Julia Version 1.9.0-DEV.1172
Commit 18fa3835a78 (2022-08-23 13:44 UTC)
With either the --heap-size-hint=2GB or the --heap-size-hint=2G option, htop still shows VIRT = 9.6G instead of 2.0G as before. I also tested with and without BLAS.set_num_threads(8).
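One caveat about the htop number: the VIRT column counts reserved address space, including memory that was mapped but never touched, whereas RES counts what is actually resident in RAM. A sketch of how the process could report its own figures is below. `Sys.maxrss` and `Base.gc_live_bytes` exist in Julia; the helper `report_memory` is hypothetical, and whether the VIRT increase here reflects real memory use is not established by this thread.

```julia
# Print memory figures that Julia itself can report for the current process.
function report_memory(label::AbstractString)
    gib(x) = round(x / 2^30; digits=3)
    println(label)
    println("  peak resident set (Sys.maxrss): ", gib(Sys.maxrss()), " GiB")
    println("  live GC-managed heap:           ", gib(Base.gc_live_bytes()), " GiB")
    return nothing
end

report_memory("before allocation")
buffer = zeros(ComplexF64, 8, 8, 2, 1000)   # array shape taken from the thread
report_memory("after allocation")
```

Calling such a helper at a few points in the long run, on both Julia versions, would show whether resident memory really grew by the same factor as VIRT.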
Julia also runs
julia --project=.juliadev --heap-size-hint=2any_string
You may need, or want, to file an issue on this.
Yes, I want to, but I don’t know how to do it.
Providing a minimal example of bloating memory usage would be great.