I’m not sure how general this is, but here is one example of “manual” loop vectorization:
Probably, for the time being, people wanting to recover the performance of LV should study how to do something of the sort in their own problems.
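For readers wondering what "manual" vectorization can look like in practice, here is a minimal sketch. It is not the example referred to above. It assumes the third-party SIMD.jl package is installed, and the function name `vsum_manual` and the default width of 4 lanes are made up for illustration. The pattern follows the `VecRange` indexing shown in the SIMD.jl README, with a scalar loop added for the leftover elements.

```julia
using SIMD  # third-party package, assumed to be installed

# Sum a vector N lanes at a time; a scalar loop handles the leftover elements.
# `vsum_manual` is a hypothetical name used only for this sketch.
function vsum_manual(xs::Vector{T}, ::Val{N}) where {T<:Union{Float32,Float64}, N}
    lane = VecRange{N}(0)
    acc = Vec{N,T}(0)                     # N-lane accumulator, all zeros
    nvec = length(xs) - length(xs) % N    # last index covered by full vectors
    @inbounds for i in 1:N:nvec
        acc += xs[lane + i]               # loads xs[i:i+N-1] as one Vec{N,T}
    end
    total = sum(acc)                      # horizontal reduction over the lanes
    @inbounds for i in nvec+1:length(xs)  # scalar remainder loop
        total += xs[i]
    end
    return total
end

# Convenience method with an assumed default width of 4 lanes.
vsum_manual(xs) = vsum_manual(xs, Val(4))
```

The best lane count depends on the element type and the CPU, so in real code it is worth benchmarking a few widths rather than trusting the default used here.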
There is some discussion of GPUs in this thread. I would say that some tasks need latency low enough that transferring the data to the GPU is not worth the extra compute, yet they are still branchless and well suited to SIMD. That’s where LoopVectorization shines.
^ This. I am not sure how feasible it would actually be, but given how many packages rely on LV for performance gains, and how important performance is to the Julia community, it seems like a replacement for LV would be a splendid candidate for a new (upgradable) Julia standard library!
On the contrary, I suppose a proper solution would be integrated into the Julia compiler and/or LLVM? Which is exactly what LoopModels is supposed to facilitate, I think?
Too sad to learn about this. LoopVectorization.jl is THE package that blew my mind when I re-discovered Julia about 3 years ago. With LV, I could write super-readable code and get the same performance as manual SIMD-optimized code, which is crazy-good.
That’s why I’m generally skeptical of bleeding-edge Julia packages that make use of compiler internals nowadays. They are not really getting supported and maintaining them becomes an uphill battle.
As a solution, perhaps we should all learn more about SIMD and create some simple packages, tailored to some specific use cases of SIMD, that are also easier to maintain. In my case, LV was doing the magic for my work involving complex functions… for most other cases I encountered, @fastmath @inbounds @simd does the trick of getting nearly-optimal performance.
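To make the macro combination concrete, here is a small sketch that uses only Base Julia. The function name `dot_simd` is made up for illustration. Whether this gets close to LV’s performance depends on the loop and the hardware, so treat it as a starting point to benchmark, not a guarantee.

```julia
# Dot product with the three Base macros applied to one loop.
# `dot_simd` is a hypothetical name used only for this sketch.
function dot_simd(x::Vector{Float64}, y::Vector{Float64})
    length(x) == length(y) || throw(DimensionMismatch("x and y must have equal length"))
    s = 0.0
    # @simd allows the reduction to be reordered, @inbounds drops bounds checks,
    # and @fastmath relaxes strict IEEE floating-point semantics.
    @fastmath @inbounds @simd for i in eachindex(x, y)
        s += x[i] * y[i]
    end
    return s
end
```

Note that @inbounds is only safe here because the lengths are checked before the loop, and @fastmath can change results slightly for inputs where floating-point rounding order matters.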
The README file of LoopVectorization now reads “Looking for new maintainers, otherwise deprecated in Julia 1.11.” How hard is it to find new maintainers? What does it take for an ordinary user to understand LoopVectorization and maintain it?
Just tried again with
Version 1.12.0-DEV.446 (2024-05-01)
LLVM: libLLVM-17.0.6 (ORCJIT, apple-m2)
JULIA_LLVM_ARGS = -enable-vplan-native-path
and the results were no better than they were with LLVM 16
Just try to read the code and try to understand it. If you understand 30% of it you can become the new maintainer…
ufechner@framework:~/repos/LoopVectorization.jl$ scc -x toml .
───────────────────────────────────────────────────────────────────────────────
Language Files Lines Blanks Comments Code Complexity
───────────────────────────────────────────────────────────────────────────────
Julia 95 33398 1231 2113 30054 4045
Markdown 22 2195 523 0 1672 0
YAML 7 283 8 10 265 0
C 2 348 11 0 337 83
FORTRAN Modern 2 457 18 14 425 10
C++ 1 116 14 5 97 0
License 1 19 3 0 16 0
SVG 1 12 0 0 12 0
gitignore 1 21 2 0 19 0
───────────────────────────────────────────────────────────────────────────────
Total 132 36849 1810 2142 32897 4138
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $1,058,301
Estimated Schedule Effort (organic) 14.05 months
Estimated People Required (organic) 6.69
───────────────────────────────────────────────────────────────────────────────
Processed 1099814 bytes, 1.100 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
If we assume that understanding this code and becoming a maintainer takes 20% of the time that was needed to write it, just raise $200,000 and you can pay someone to pick up this role…
I don’t know if the estimate would be accurate. This library is far from being typical code.
I think SCC tends to overestimate the complexity of Julia code, but I think LoopVectorization is a lot more complicated than typical Julia code, so it may actually be more accurate there. As a comparison, I checked out v1.0.0 of my ExplicitExports.jl, which I know took about 15-25 hours over 3-4 days, and got
───────────────────────────────────────────────────────────────────────────────
Language Files Lines Blanks Comments Code Complexity
───────────────────────────────────────────────────────────────────────────────
Julia 15 1895 353 213 1329 218
Markdown 4 220 72 0 148 0
YAML 3 115 0 2 113 0
License 1 21 4 0 17 0
gitignore 1 6 0 0 6 0
───────────────────────────────────────────────────────────────────────────────
Total 24 2257 429 215 1613 218
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $44,628
Estimated Schedule Effort (organic) 4.22 months
Estimated People Required (organic) 0.94
───────────────────────────────────────────────────────────────────────────────
Processed 89070 bytes, 0.089 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
which seems very inflated.
It is inflated. It was only ever a hobby project done in my spare time, certainly not 7 years of my salary.
Many bugs aren’t that hard to fix, and may not require knowing much about the code base at all.
This is a tangent, but I’m actually finding SCC (which I first heard of here) rather funny. The cost estimation is a fun idea, but I just tried it on my Emacs config and:
Estimated Cost to Develop (organic) $2,599,402
Estimated Schedule Effort (organic) 19.77 months
Estimated People Required (organic) 11.68
It sets corporate overhead to the factor of 2.4 by default.
Don’t sell yourself short! If someone is willing to front that cost, let them!
Regardless of the exact monetary value, 36k loc in 100+ files is definitely not nothing.
I’ll add another $200 for making it fit for 1.11. Is there some coordinated way to make such donations to Julia projects?
Having just watched the “State of Julia” talk from JuliaCon, I’m pretty sure it was said during the Q&A that LV has been updated to work on 1.11. But after checking on GitHub, I couldn’t see that it’s been announced.
check_empty=true/false didn’t change between early Julia versions and 1.10. We could make check_empty=true the default.
Closing this because tests pass on 1.11, while check_empty=false should’ve also caused segfaults in older Julia versions.
Test cleanup · JuliaSIMD/LoopVectorization.jl@eeaa0b2 · GitHub
But if anyone wants to contribute towards v1.12, the SciML Small Grants program outlines that project and we’d be happy to up the ante. Maintenance never ends, so it’ll need v1.13 updates too!