| Topic | Replies | Views | Date |
|---|---|---|---|
| Performance challenge: can you write a faster sum? | 31 | 1270 | July 9, 2025 |
| Huge performance variance with `if` options in loops | 4 | 154 | July 4, 2025 |
| How to make the most of SIMD.jl when number of data elements is not divisible by SIMD width | 5 | 179 | May 22, 2025 |
| Datatypes SIMD by default? (as in Mojo) | 14 | 1103 | May 15, 2025 |
| Is it ok to iterate inside a `@simd` loop? | 8 | 282 | April 25, 2025 |
| Min/max swap using SIMD | 18 | 233 | March 12, 2025 |
| Autovectorization in Julia 101 | 2 | 341 | December 5, 2024 |
| Experiments with LoopVectorization and convolutions | 24 | 822 | December 3, 2024 |
| Optimizing sums of products (dot products) | 17 | 595 | September 24, 2024 |
| Vectorization of multivariate normal PDF | 1 | 115 | June 26, 2024 |
| Failure to vectorize 8 Int64 multiplies when 8 Float64 multiplies vectorize | 8 | 431 | May 31, 2024 |
| SIMD.jl/shufflevector without support for SIMD vector as mask? | 2 | 298 | February 11, 2024 |
| Supporting SIMD-enabled objective functions in optimization APIs | 3 | 364 | February 5, 2024 |
| Different `@code_llvm` output on macos and x86 | 4 | 315 | December 8, 2023 |
| Optimizing Direct 2D Convolution Code | 14 | 717 | November 23, 2023 |
| Understanding the performance and overhead of a vector of SOA vs a vector of AOS for SIMD and the effect of push! | 1 | 444 | June 23, 2023 |
| Is this a valid use of simd? | 2 | 311 | June 16, 2023 |
| Optimizing Direct 1D Convolution Code | 21 | 1581 | April 28, 2023 |
| `@turbo` producing different (and wrong) results compared to `@inbounds @simd` | 3 | 409 | March 30, 2023 |
| Question on multithreading/vectorizing loops | 9 | 681 | March 22, 2023 |
| Major performance boost when precaching random inputs to `exp`? | 8 | 770 | September 25, 2022 |
| Vectorize but break early? | 3 | 427 | September 20, 2022 |
| LoopVectorization: @turbo performs worse than @inbounds on trivial loop | 9 | 2095 | August 28, 2021 |
| PaddedViews very slow | 7 | 605 | August 25, 2021 |
| Why is this @simd loop faster than a while loop even if it has longer assembly? | 6 | 1612 | August 1, 2021 |
| SIMD Complex Numbers | 19 | 2672 | July 22, 2021 |
| LoopVectorization.jl vmap gives an error ::VectorizationBase.Vec{4, Int64} | 17 | 976 | July 22, 2021 |
| A simple SIMD.jl loop that is slower than a vanilla `@inbounds @simd` | 8 | 1863 | June 27, 2021 |
| Why is this small `@inline` function much slower than an equivalent macro? | 2 | 896 | June 26, 2021 |
| How to do SIMD code with wide-register accumulators (@simd vs LoopVectorization.jl vs SIMD.jl) | 11 | 2600 | June 22, 2021 |