Why is LoopVectorization deprecated?

lmiq · April 2, 2024, 12:06am

I’m not sure how general this is, but here is one example of “manual” loop vectorization:

Probably, for the time being, people wanting to recover the performance of LV should study how to do something of the sort in their own problems.

Tarny_GG_Channie · April 2, 2024, 9:16am

There are some discussions on GPU in this thread. I would say that there are some tasks that require such low latency that transferring the data to the GPU is not worth the extra compute, but that are still branchless and suitable for SIMD. That’s where LoopVectorization shines.
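
As a rough illustration of the kind of loop meant here, the following is a minimal sketch using only Base Julia, not LoopVectorization. The function name `relu_sum` is made up for this example. Replacing an `if` with `ifelse` keeps the loop body free of branches, which gives the compiler a chance to emit SIMD instructions; whether it actually does depends on the Julia/LLVM version and the CPU.

```julia
# Hypothetical example: sum of the positive entries of a vector.
# `ifelse` evaluates both arguments and selects one, so the loop body
# contains no branch.
function relu_sum(x::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(x)
        v = x[i]
        s += ifelse(v > 0.0, v, 0.0)
    end
    return s
end
```

The generated code can be inspected with `@code_llvm relu_sum(rand(1000))` to check whether vector instructions were produced.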

TheLateKronos · April 4, 2024, 6:54am

^ This. I am not sure how feasible it would actually be, but given how many packages rely on LV for performance gains, and how important performance is to the Julia community, it seems like a replacement for LV would be a splendid candidate for a new (upgradable) Julia standard library!

nsajko · April 4, 2024, 8:25am

On the contrary, I suppose a proper solution would be integrated into the Julia compiler and/or LLVM? Which is exactly what LoopModels is supposed to facilitate, I think?

martin.d.maas · April 10, 2024, 7:23am

Too sad to learn about this. LoopVectorization.jl is THE package that blew my mind when I re-discovered Julia about 3 years ago. With LV, I could write super-readable code and get the same performance as manual SIMD-optimized code, which is crazy-good.

That’s why I’m generally skeptical of bleeding-edge Julia packages that make use of compiler internals nowadays. They are not really getting supported and maintaining them becomes an uphill battle.

martin.d.maas · April 10, 2024, 5:27pm

As a solution, perhaps we should all learn more about SIMD and create some simple packages, tailored to some specific use cases of SIMD, that are also easier to maintain. In my case, LV was doing the magic for my work involving complex functions… for most other cases I encountered, @fastmath @inbounds @simd does the trick of getting nearly-optimal performance.
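
For readers who have not used those annotations, here is a minimal sketch of the `@fastmath @inbounds @simd` combination on a plain reduction, using only Base Julia. The function name `sumsq` is invented for this example, and it is not a drop-in replacement for `@turbo`: `@inbounds` removes bounds checks, so the loop must stay within the array, and `@fastmath` allows floating-point reassociation, so results can differ slightly from a strictly sequential sum.

```julia
# Hypothetical example: sum of squares with Base-only annotations.
function sumsq(x::Vector{Float64})
    s = 0.0
    # @simd permits reordering the reduction, @inbounds drops bounds checks,
    # @fastmath relaxes IEEE semantics for the arithmetic in the loop body.
    @fastmath @inbounds @simd for i in eachindex(x)
        s += x[i] * x[i]
    end
    return s
end
```

Comparing `sumsq(x)` against `sum(abs2, x)` with `≈` rather than `==` is the safer check, since the two may accumulate in a different order.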

wujinq · April 22, 2024, 10:55pm

The README file of LoopVectorization now reads “Looking for new maintainers, otherwise deprecated in Julia 1.11.” How hard is it to find new maintainers? What does it take for an ordinary user to understand LoopVectorization and maintain it?

ctkelley · May 1, 2024, 4:32pm

Just tried again with

Version 1.12.0-DEV.446 (2024-05-01)
LLVM: libLLVM-17.0.6 (ORCJIT, apple-m2)
 JULIA_LLVM_ARGS = -enable-vplan-native-path

and the results were no better than they were with LLVM 16

ufechner7 · May 1, 2024, 4:43pm

Just try to read the code and try to understand it. If you understand 30% of it you can become the new maintainer…

ufechner@framework:~/repos/LoopVectorization.jl$ scc -x toml .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Julia                       95     33398     1231      2113    30054       4045
Markdown                    22      2195      523         0     1672          0
YAML                         7       283        8        10      265          0
C                            2       348       11         0      337         83
FORTRAN Modern               2       457       18        14      425         10
C++                          1       116       14         5       97          0
License                      1        19        3         0       16          0
SVG                          1        12        0         0       12          0
gitignore                    1        21        2         0       19          0
───────────────────────────────────────────────────────────────────────────────
Total                      132     36849     1810      2142    32897       4138
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $1,058,301
Estimated Schedule Effort (organic) 14.05 months
Estimated People Required (organic) 6.69
───────────────────────────────────────────────────────────────────────────────
Processed 1099814 bytes, 1.100 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

If we assume that you need to invest 20% of the time that was needed to write this code for understanding it and becoming a maintainer, just raise 200,000$ and you can pay someone to pick up this role…

Tarny_GG_Channie · May 2, 2024, 2:12am

I don’t know if the estimate would be accurate. This library is far from being typical code.

ericphanson · May 2, 2024, 10:35am

I think SCC tends to overestimate the complexity of Julia code, but I think LoopVectorization is a lot more complicated than typical Julia code, so it may actually be more accurate there. As a comparison, I checked out v1.0.0 of my ExplicitImports.jl, which I know took about 15-25 hours over 3-4 days, and got

───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Julia                       15      1895      353       213     1329        218
Markdown                     4       220       72         0      148          0
YAML                         3       115        0         2      113          0
License                      1        21        4         0       17          0
gitignore                    1         6        0         0        6          0
───────────────────────────────────────────────────────────────────────────────
Total                       24      2257      429       215     1613        218
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $44,628
Estimated Schedule Effort (organic) 4.22 months
Estimated People Required (organic) 0.94
───────────────────────────────────────────────────────────────────────────────
Processed 89070 bytes, 0.089 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

which seems very inflated.

Elrod · May 2, 2024, 11:04am

It is inflated. It was only ever a hobby project done in my spare time, certainly not 7 years of my salary.
Many bugs aren’t that hard to fix, and may not require knowing much about the code base at all.

tecosaur · May 2, 2024, 11:47am

This is a tangent, but I’m actually finding SCC (which I first heard of here) rather funny. The cost estimation is a fun idea, but I just tried it in my Emacs config and:

Estimated Cost to Develop (organic) $2,599,402
Estimated Schedule Effort (organic) 19.77 months
Estimated People Required (organic) 11.68

Eben60 · May 2, 2024, 1:22pm

It sets corporate overhead to a factor of 2.4 by default.
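
The SCC figures quoted in this thread look consistent with the basic COCOMO "organic" model. The sketch below is an assumption about how the estimate is formed, not SCC's actual source: it uses the textbook organic coefficients (2.4, 1.05, 2.5, 0.38), an assumed average annual wage of $56,286, and the 2.4 overhead factor mentioned above. The function name `cocomo_organic` is made up for this example.

```julia
# Hypothetical re-derivation of the "organic" estimate from lines of code.
# The default wage is an assumption; the overhead factor is the one named above.
function cocomo_organic(sloc::Integer; avg_wage = 56_286, overhead = 2.4)
    kloc     = sloc / 1000
    effort   = 2.4 * kloc^1.05           # person-months
    schedule = 2.5 * effort^0.38         # calendar months
    people   = effort / schedule         # average team size
    cost     = effort * (avg_wage / 12) * overhead
    return (effort = effort, schedule = schedule, people = people, cost = cost)
end

# LoopVectorization.jl: 32897 lines counted as code in the report above.
lv = cocomo_organic(32897)
println(round(lv.schedule, digits = 2), " months, ",
        round(lv.people, digits = 2), " people, \$",
        round(Int, lv.cost))
```

Under these assumptions the overhead factor multiplies the cost directly, so dropping it to 1.0 would cut the dollar figure by the same factor of 2.4 while leaving the schedule and headcount unchanged.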

tbeason · May 2, 2024, 1:44pm

Don’t sell yourself short! If someone is willing to front that cost, let them!

mbauman · May 2, 2024, 3:07pm

Regardless of the exact monetary value, 36k loc in 100+ files is definitely not nothing.

maxfreu · July 25, 2024, 10:50am

I’ll add another 200$ for making it fit for 1.11. Is there some coordinated way to make such donations to julia projects?

mkitti · July 25, 2024, 1:20pm

See SciML Small Grants Program Current Project List

DNF · July 25, 2024, 6:44pm

Having just watched the “State of Julia” talk from JuliaCon, I’m pretty sure it was said during the Q&A that LV has been updated to work on 1.11. But after checking on GitHub, I couldn’t see that it’s been announced.

ChrisRackauckas · July 25, 2024, 8:45pm

github.com/JuliaSIMD/LoopVectorization.jl

LoopVectorization.jl causing segfaults on 1.11

opened 01:59PM - 10 Jan 24 UTC

closed 11:38AM - 07 May 24 UTC

maleadt

LoopVectorization.jl's generated IR seems to cause segfaults on 1.11, as observed on PkgEval with at least 6 packages (MCPhylo.jl, LocalPoly.jl, VectorizedReduction.jl, NaNStatistics.jl, TimeSeriesClassification.jl, PlmDCA.jl). See this report for details: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/2cbecf4_vs_18b4f3f/report.html

@chriselrod I'm opening a new issue because https://github.com/JuliaSIMD/LoopVectorization.jl/issues/518 was closed, and to list all issues in case somebody wants to tackle this.

---

Some of the errors that I've encountered:

An LLVM assertion, as seen with MCPhylo.jl (requires assertions build of Julia):

```
julia: /workspace/srcdir/llvm-project/llvm/lib/IR/Instructions.cpp:2561: void llvm::InsertValueInst::init(llvm::Value*, llvm::Value*, llvm::ArrayRef<unsigned int>, const llvm::Twine&): Assertion `ExtractValueInst::getIndexedType(Agg->getType(), Idxs) == Val->getType() && "Inserted value must match indexed type!"' failed.

[177] signal 6 (-6): Aborted
in expression starting at /home/pkgeval/.julia/packages/MCPhylo/KWPlY/test/distributions/phylodist.jl:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fbeac0f040e)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_ZN4llvm15InsertValueInst4initEPNS_5ValueES2_NS_8ArrayRefIjEERKNS_5TwineE at /opt/julia/bin/../lib/julia/libLLVM-15jl.so (unknown line)
InsertValueInst at /source/usr/include/llvm/IR/Instructions.h:2640 [inlined]
Create at /source/usr/include/llvm/IR/Instructions.h:2565 [inlined]
CreateInsertValue at /source/usr/include/llvm/IR/IRBuilder.h:2343
emit_new_struct at /source/src/cgutils.cpp:3870
emit_new_struct at /source/src/julia.h:1704
emit_expr at /source/src/codegen.cpp:5945
emit_ssaval_assign at /source/src/codegen.cpp:5367
emit_stmtpos at /source/src/codegen.cpp:5642 [inlined]
emit_function at /source/src/codegen.cpp:8810
jl_emit_code at /source/src/codegen.cpp:9144
jl_emit_codeinst at /source/src/codegen.cpp:9227
_jl_compile_codeinst at /source/src/jitlayers.cpp:220
jl_generate_fptr_impl at /source/src/jitlayers.cpp:525
jl_compile_method_internal at /source/src/gf.c:2509 [inlined]
jl_compile_method_internal at /source/src/gf.c:2397
_jl_invoke at /source/src/gf.c:2912 [inlined]
ijl_apply_generic at /source/src/gf.c:3097
logpdf at /home/pkgeval/.julia/packages/MCPhylo/KWPlY/src/distributions/Phylodist.jl:118
```

A segfault during `vload`, as seen with NaNStatistics.jl and PlmDCA.jl:

```
[60] signal 11 (2): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/test/testArrayStats.jl:80
macro expansion at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/llvm_intrin/memory_addr.jl:987 [inlined]
__vload at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/llvm_intrin/memory_addr.jl:987 [inlined]
_vload at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/strided_pointers/stridedpointers.jl:95 [inlined]
macro expansion at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/vecunroll/memory.jl:60 [inlined]
_vload_unroll at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/vecunroll/memory.jl:535 [inlined]
_vload at /home/pkgeval/.julia/packages/VectorizationBase/0dXyA/src/vecunroll/memory.jl:771 [inlined]
macro expansion at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
_turbo_! at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
_nanmean at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:344
__nanmean at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:308 [inlined]
#nanmean#5 at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:307 [inlined]
nanmean at /home/pkgeval/.julia/packages/NaNStatistics/oBRaH/src/ArrayStats/ArrayStats.jl:307
```

A segfault during `vadd_fast` as seen with VectorizedReductions.jl:

```
[12] signal 11 (2): Segmentation fault
in expression starting at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/test/reduce.jl:4
macro expansion at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/llvm_intrin/binary_ops.jl:31 [inlined]
vadd_fast at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/llvm_intrin/binary_ops.jl:31 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
fmap at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:11 [inlined]
vadd_fast at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/vecunroll/fmap.jl:111 [inlined]
add_fast at /home/pkgeval/.julia/packages/VectorizationBase/xE5Tx/src/base_defs.jl:91 [inlined]
macro expansion at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
_turbo_! at /home/pkgeval/.julia/packages/LoopVectorization/7iB2K/src/reconstruct_loopset.jl:1107 [inlined]
macro expansion at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/src/vmapreduce.jl:236 [inlined]
vvmapreduce at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/src/vmapreduce.jl:231
vvreduce at /home/pkgeval/.julia/packages/VectorizedReduction/bsnWJ/src/vmapreduce.jl:147
```

The source of bad IR hasn't been fully determined yet, but it seems to be the `Expr(:new)` that's generated to pass structs by value instead of by reference: https://github.com/JuliaLang/julia/issues/52702#issuecomment-1874492883.

---

Deprecating LoopVectorization.jl isn't possible, because:

- some packages, e.g. RecursiveFactorizations.jl, inspect LoopVectorization.jl internals: https://github.com/JuliaSIMD/LoopVectorization.jl/issues/520
- the transformation of `@turbo` changes semantics, https://github.com/JuliaSIMD/LoopVectorization.jl/pull/523#issuecomment-1884883071

So the only solution forwards seems fixing LoopVectorization.jl. I've taken a first attempt at it in https://github.com/JuliaSIMD/LoopVectorization.jl/pull/523, but just removing the `Expr(:new)` optimization isn't sufficient, and there's other issues (see above).

check_empty=true/false didn’t change between early Julia versions and 1.10.
We could make check_empty=true the default.

Closing this because tests pass on 1.11, while check_empty=false should’ve also caused segfaults in older Julia versions. Test cleanup · JuliaSIMD/LoopVectorization.jl@eeaa0b2 · GitHub

But if anyone wants to contribute towards v1.12, the SciML Small Grants outlines that project and we’d be happy to up the ante. Maintenance never ends, so it’ll need v1.13 updates too!
