julia> *(2)
2
julia> @descend *(2)
[ Info: tracking Base
*(x::Number) @ Base ~/packages/julias/julia-1.9/share/julia/base/operators.jl:516
[ Info: This method only fills in default arguments; descend into the body method to see the full source.
516 *(x::Int64::Number)::Int64 = x
[...]
You are right. The problem is reduce
that apply the operation one element at the time:
julia> x = rand(5); y = rand(5);
julia> temp1 = x .* y
5-element Vector{Float64}:
0.021011105122337788
0.18748492243282094
0.29715362865317285
0.02001732616845591
0.5826489054162168
julia> z1 = norm(temp1)
0.6810086504806679
julia> z2 = norm(xi*yi for (xi,yi) in zip(x,y))
0.6810086504806679
julia> z1 == z2
true
julia> temp3 = map(*,x,y)
5-element Vector{Float64}:
0.021011105122337788
0.18748492243282094
0.29715362865317285
0.02001732616845591
0.5826489054162168
julia> temp3 == temp1
true
julia> z3 = reduce(norm,temp3)
0.021011105122337788
julia> z4 = norm(norm(norm(norm(temp3[1],temp3[2]),temp3[3]),temp3[4]),temp3[5])
0.021011105122337788
julia> z3 == z4
true
The correct version in this case is:
function f2(x,y)
z = 0.
for i in 1:10
z += sum( xi^2 * yi^2 for (xi,yi) in zip(x,y)) |> sqrt
end
z
end
This doesn’t allocate but of course it is specific to the toy example.
Hehe, why is that defined at all?
I was also confused until trying:
julia> +(1)
1
julia> *(1)
1
julia> ^(1)
ERROR: MethodError: no method matching ^(::Int64)
Closest candidates are:
^(::Number, ::Missing) at missing.jl:124
^(::Number, ::Rational) at rational.jl:481
^(::T, ::Complex{T}) where T<:Real at complex.jl:852
...
Stacktrace:
[1] top-level scope
@ REPL[3]:1
julia> +(1, 2, 3)
6
julia> *(1, 2, 3)
6
This is described in this footnote here: Mathematical Operations and Elementary Functions · The Julia Language
The operators +, ++ and * are non-associative. a + b + c is parsed as +(a, b, c) not +(+(a, b), c). However, the fallback methods for +(a, b, c, d…) and *(a, b, c, d…) both default to left-associative evaluation.
In other words, +
and *
are not truly binary operators like ^
.
For those who have extensive experience with StaticArrays
: as you increase the number of elements, at what point do you see a decrease in performance? The documentation says 100, but I’m curious what people’s real life experience suggests.
You can try to run this code https://github.com/JuliaArrays/StaticArrays.jl/blob/master/perf/README_benchmarks.jl
For different sizes ‘N’. I expect the result is highly machine specific but im interested in what you find.
So this is what I get for 3 and 10:
julia> simple_bench(3)
============================================
Benchmarks for 3×3 Float64 matrices
============================================
Matrix multiplication -> 9.3x speedup
Matrix multiplication (mutating) -> 1.4x speedup
Matrix addition -> 22.2x speedup
Matrix addition (mutating) -> 1.7x speedup
Matrix determinant -> 87.9x speedup
Matrix inverse -> 82.3x speedup
Matrix symmetric eigendecomposition -> 4.4x speedup
Matrix Cholesky decomposition -> 12.3x speedup
Matrix LU decomposition -> 7.1x speedup
Matrix QR decomposition -> 61.8x speedup
julia> simple_bench(10)
============================================
Benchmarks for 10×10 Float64 matrices
============================================
Matrix multiplication -> 3.8x speedup
Matrix multiplication (mutating) -> 2.4x speedup
Matrix addition -> 6.1x speedup
Matrix addition (mutating) -> 2.0x speedup
Matrix determinant -> 1.5x speedup
Matrix inverse -> 1.8x speedup
Matrix symmetric eigendecomposition -> 1.0x speedup
Matrix Cholesky decomposition -> 3.4x speedup
Matrix LU decomposition -> 1.5x speedup
Matrix QR decomposition -> 17.7x speedup
julia> versioninfo()
Julia Version 1.10.0-beta2
Commit a468aa198d0 (2023-08-17 06:27 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 16 × AMD Ryzen 7 4800U with Radeon Graphics
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, znver2)
Threads: 23 on 16 virtual cores
Environment:
LD_LIBRARY_PATH = /home/***/lib/
I started N=100 3 hours ago and it’s not done yet. So that answers that.
Yes, for N=100, you are asking the compiler to basically write out matrix methods for 10000-element matrices