DNF
January 21, 2024, 6:58pm
21
It’s difficult to keep track of what your current code looks like, but if you still have this part there
you should fix it. This allocates and copies an entire vector, just to read its length. That’s very wasteful. Instead, just use `size`, which has virtually zero cost:
`N = size(coupling_matrix, 2)`
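To illustrate the difference, here is a minimal sketch. The original snippet is not shown in this post, so the slicing line is only an assumed example of the wasteful pattern, and the matrix contents are made up; only the name `coupling_matrix` comes from the thread.

```julia
# Hypothetical data: a 4x6 matrix standing in for the real coupling_matrix.
coupling_matrix = rand(4, 6)

# Assumed wasteful pattern: slicing a row allocates a new vector (a copy),
# which is then only used to read its length.
N_slow = length(coupling_matrix[1, :])

# size just reads the stored dimensions of the array; nothing is copied.
N = size(coupling_matrix, 2)
```

Both lines give the same number of columns; the second one simply skips the temporary vector.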
Elrod
January 22, 2024, 4:57am
22
Something is seriously wrong there. Are you on Julia 1.10? It added some type instabilities.
opened 05:21AM - 18 Jan 24 UTC
Apparently, multiplying a vector of floats with a vector of ints causes problems with the Julia compiler. The type of `res` cannot be inferred anymore (`res::Any`), and this causes performance losses.
I have tested this on Julia 1.9.3, where the example below works fine, but it does not on Julia 1.10.
Of course, one can manually promote integers to floats before multiplying, at least as a workaround.
Minimum working example:
```
using LoopVectorization

function LVTest(a1, a2)
    res = zero(eltype(a1))
    @turbo for i in eachindex(a1, a2)
        res += a1[i] * a2[i]
    end
    return res
end

aFloat = zeros(10)
aInt = zeros(Int, 10)
@code_warntype LVTest(aFloat, aInt) # prints type Any for res on Julia 1.10 and does not show any type instabilities for 1.9

function checkAllocs()
    aFloat = zeros(10)
    aInt = zeros(Int, 10)
    LVTest(aFloat, aInt) # compile
    println("Allocations: ", @allocated LVTest(aFloat, aInt))
end

checkAllocs() # prints 0 on Julia 1.9 but 2304 on Julia 1.10
```
The output of `versioninfo()`:
```
Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, haswell)
Threads: 1 on 8 virtual cores
Environment:
JULIA_PKG_USE_CLI_GIT = true
JULIA_DEPOT_PATH = /storage/niggeni/.julia_hexagon
JULIA_IMAGE_THREADS = 1
```
```
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, haswell)
Threads: 1 on 8 virtual cores
Environment:
JULIA_PKG_USE_CLI_GIT = true
JULIA_DEPOT_PATH = /storage/niggeni/.julia_hexagon
```
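The workaround mentioned in the issue (promoting integers to floats before multiplying) might look roughly like the sketch below. The function name `dot_promoted` is hypothetical, and a plain `@simd` loop stands in for `@turbo` so the example runs without LoopVectorization installed; whether the same change removes the instability under `@turbo` on Julia 1.10 is an assumption based on the issue text, not something verified here.

```julia
# Hypothetical helper: a dot product of a float vector and an integer vector,
# with the integers promoted to the float element type before the loop.
function dot_promoted(a1::AbstractVector{<:AbstractFloat}, a2::AbstractVector{<:Integer})
    # Promote once, outside the loop, so the loop only sees one element type.
    a2f = convert(Vector{eltype(a1)}, a2)
    res = zero(eltype(a1))
    # Plain SIMD loop as a stand-in; in the issue's example this is `@turbo for ...`.
    @inbounds @simd for i in eachindex(a1, a2f)
        res += a1[i] * a2f[i]
    end
    return res
end

result = dot_promoted([1.0, 2.0, 3.0], [1, 2, 3])
```

The conversion allocates one temporary vector per call, so in a hot loop it is probably better to store the data as floats from the start.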
You’re right. Declaring the elements of `coupling_matrix` as floats and introducing loop vectorization helped a ton. I don’t have access to the same PC as before, but the code just sped up about 4x. Thank you!