Performance of norm function

foobar_lv2 · September 8, 2018, 8:52pm

Hah, thanks!

So there's no reason to open a PR: this only affects tiny vectors, and people who care about speed on tiny vectors presumably use SVector already.

That being said, the obvious fast variant still outperforms BLAS on my system for a 10,000-element vector (which is presumably a misconfiguration or build issue on my end).

julia> using LinearAlgebra, BenchmarkTools

julia> A = rand(10_000);

julia> function foo2(A)
           x = zero(eltype(A))
           @inbounds @simd for v in A
               @fastmath x += v * v
           end
           @fastmath sqrt(x)
       end

julia> @btime norm($A)
  4.643 μs (0 allocations: 0 bytes)
57.51021904090062

julia> @btime foo2($A)
  1.376 μs (0 allocations: 0 bytes)
57.51021904090062
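
One caveat worth keeping in mind before swapping a loop like foo2 in for norm: the plain sum of squares does no rescaling, so it can overflow where norm stays finite. Below is a minimal sketch of that difference. The name naive_norm is made up for this illustration; it is the same loop as foo2 but with @fastmath dropped, so that the overflow follows ordinary IEEE rules. Treat it as an illustration under those assumptions, not as a statement about where the timing gap above comes from.

```julia
using LinearAlgebra  # stdlib; provides norm

# Hypothetical helper for illustration: the same loop as foo2,
# minus @fastmath, so Inf is handled per ordinary IEEE arithmetic.
function naive_norm(A)
    x = zero(eltype(A))
    @inbounds @simd for v in A
        x += v * v        # plain sum of squares, no rescaling
    end
    return sqrt(x)
end

# On well-scaled data the two agree up to rounding.
A = rand(10_000)
agree = naive_norm(A) ≈ norm(A)

# On huge entries the squares overflow in the naive loop,
# while norm rescales internally and returns a finite value.
B = [1e200, 1e200]
naive_big = naive_norm(B)
lib_big = norm(B)

println((agree, naive_big, lib_big))
```

For data known to be in a moderate range the naive loop is fine; the point is only that the two functions are not interchangeable for every input.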
