I have compared 4 different implementations of the ClebschGordan coefficients

freshly coded in my
ThreeBodyDecay.jl
package  from
GSL.jl
call (c++ implementation)  from
SymPy.jl
(for the sake of completeness)  julia package
WignerSymbols.jl
The timing for 1000 calls of random coefficient with j₁, j₂ < 16 follows.
ThreeBodyDecay.jl 187 μs
GSL 743 μs
WignerSymbols.jl 764 μs
SymPy 1.61 s
Details are here.
 I was glad to use the GSL until I discovered that the evaluation of the CG coefficients in some scattering amplitudes takes 50% of the total time. A native Julia implementation (perhaps, casespecific) turned to be 4 times (!) faster.
 It seems that there is a room for optimization in the
WignerSymbols.jl
. I am happy to help if authors can point out what the right strategy could be. Having a dedicated, tested package is obviously preferable to the customary implementation.  I have a similar implementation of the Wignerd functions. AFAIK, they do not exist in julia yet (?).
I am wondering if creating a new package (e.g. HEPspecific) is a good idea.