I was glad to use the GSL until I discovered that the evaluation of the CG coefficients in some scattering amplitudes takes 50% of the total time. A native Julia implementation (perhaps, case-specific) turned to be 4 times (!) faster.
It seems that there is a room for optimization in the WignerSymbols.jl. I am happy to help if authors can point out what the right strategy could be. Having a dedicated, tested package is obviously preferable to the customary implementation.
I have a similar implementation of the Wigner-d functions. AFAIK, they do not exist in julia yet (?).
I am wondering if creating a new package (e.g. HEP-specific) is a good idea.
It looks like most of your 1000 calls are trivial cases where ClGd returns 0.0; maybe it would be better to benchmark only the nontrivial cases?
It looks like you can get some additional speedup by eliminating your f_logfact(n) and just doing logfact[n+1] directly with an @inbounds directive (assuming that you’ve previously checked the arguments to ensure in-bounds accesses). (Even without the @inbounds, the n < 0 check in f_logfact is redundant with bounds-checking.) (To get some slight additional speedup, you could replace logfact[1+div(x,2)] with logfact2[1+x], where logfact2 is a double-length array in which each entry is repeated twice.) Also, isn’t res = 0 type-unstable?
Hi Steven, these are great suggestions!
The amount of zeros is ≈ 17%, and 14% are equal to 1.0. I tried to exclude obviosly-trivial Clebsches in the rand_clebsch() function (+fixed a typo).
Just checked the other suggestions:
type stability + logfact2 gives 1-2 μs gain
@inbounds removes extra 2-4 μs for the same tests.
Not so much, but still cool. Thanks a lot for pointing out!
Did you notice if I am doing something wrong with SymPy calls? My colleagues running tensorflow used to calculate both the Clebsch-Gordan coefficientss and the Wigner d using the sympy.physics.wigner. It is hard to believe that the call time which they are working with is x1000. I have not benchmarked it thoroughly, tho.
WignerSymbols.jl could use a bit of performance tuning, but I would mention that the prime factorization method in that package is able to compute symbols to very high quantum numbers exactly, in the form of roots of Rational{BigInt}.
This is somewhat orthogonal to the methods you’ve benchmarked, but I thought it might be helpful to mention – I’ve recently implemented (yet another) method in WignerFamilies.jl. In my field (cosmology) the Wigner 3j-symbols are not usually used via random access, but instead in summations over all nontrivial symbols for some fixed quantum numbers. The Wigner 3j-symbols are subject to convenient recurrence relations which can provide all symbols in a family with relatively few floating-point operations per symbol.
For example, there are 2001 nontrivial 3j symbols varying j with fixed j_2 = 1000, j_3 = 1000, m_2=2, m_3=-2, m_1 = -m_2 -m_3.
Hi, thanks for the comment. Good to know about WignerFamilies.jl, I should add a link to the package in my README. Indeed, my clebsches would not work above j=50 without extending precalculated factorials (well, it is never needed in HEP).