Trigonometric functions do not use SIMD - Solved: use LoopVectorization.jl

Elrod · October 21, 2020, 8:59pm

That’s a problem with the documentation. LoopVectorization uses check_args to check whether the arguments are supported, and for your type it currently returns false:

julia> v1 = IntParam{Float64}(rand(80_000));

julia> v2 = IntParam{Float64}(rand(80_000));

julia> out = similar(v1);

julia> LoopVectorization.check_args(v1,v2,out)
false
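To make the mechanism concrete, here is a toy model of an opt-in support check in plain Julia. It is only a sketch: Wrapped, toy_check_type, toy_check_args, and choose_path are hypothetical names made up for illustration, not LoopVectorization's API or its actual implementation. The assumption it encodes is that an argument type which has not opted in gets routed away from the fast path until a method is added for it.

```julia
# Hypothetical wrapper type: an AbstractVector that stores its data in a field `v`.
struct Wrapped{T} <: AbstractVector{T}
    v::Vector{T}
end
Base.size(w::Wrapped) = size(w.v)
Base.getindex(w::Wrapped, i::Int) = w.v[i]
Base.setindex!(w::Wrapped, x, i::Int) = (w.v[i] = x)

# Only element types on an allow-list pass the type check.
toy_check_type(::Type{T}) where {T} = false
toy_check_type(::Type{T}) where {T<:Union{Float32,Float64}} = true

# Arguments are unsupported unless some method opts their type in.
toy_check_args(::Any) = false
toy_check_args(::Array{T}) where {T} = toy_check_type(T)
# With several arguments, every one of them has to pass.
toy_check_args(a, b, rest...) = toy_check_args(a) && toy_check_args(b, rest...)

# Report which path a kernel would take for the given arguments.
choose_path(args...) = toy_check_args(args...) ? :fast : :fallback

a = Wrapped(rand(4))
choose_path(a, rand(4))   # :fallback, because Wrapped has not opted in

# Opting in is a one-line method on the wrapper type.
toy_check_args(::Wrapped{T}) where {T} = toy_check_type(T)
choose_path(a, rand(4))   # :fast
```

The point of the design is that the check is an ordinary generic function, so a user-defined array type can opt in from outside the package by adding a method.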

The next major release of LoopVectorization (which I hope to have out by the end of the year) will use ArrayInterface.jl for this, which will change the recommended way of adding support.

But for now, defining these two methods should work:

LoopVectorization.check_args(::VComponent{T}) where {T} = LoopVectorization.check_type(T)
Base.pointer(v::VComponent) = pointer(v.v)

For example, benchmarking before and after defining them:

julia> v1 = IntParam{Float64}(rand(80_000));

julia> v2 = IntParam{Float64}(rand(80_000));

julia> out = similar(v1);

julia> @benchmark two_mul_sin_avx!($out, $v1, $v2)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     477.467 μs (0.00% GC)
  median time:      480.595 μs (0.00% GC)
  mean time:        480.921 μs (0.00% GC)
  maximum time:     519.928 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> LoopVectorization.check_args(::VComponent{T}) where {T} = LoopVectorization.check_type(T)

julia> Base.pointer(v::VComponent) = pointer(v.v)

julia> @benchmark two_mul_sin_avx!($out, $v1, $v2)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     81.493 μs (0.00% GC)
  median time:      82.477 μs (0.00% GC)
  mean time:        82.725 μs (0.00% GC)
  maximum time:     152.587 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
