[ANN] Bessels.jl - Bessel, Hankel, and Airy functions

We are excited to announce Bessels.jl, a special function library written in Julia that computes Bessel and modified Bessel functions, spherical Bessel functions, Airy functions, and Hankel functions. The library currently supports single- and double-precision results for real arguments of any order.

This library has several advantages: it is written entirely in Julia, is generally very performant, and does not allocate. Bessels.jl provides a lot of overlapping functionality with SpecialFunctions.jl and tries to match or exceed its accuracy in most domains. The available functions can be found in the API.
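
For orientation, here is a minimal sketch of what calls look like, assuming the exported names mirror SpecialFunctions.jl (besselj, besselk, and sphericalbesselj are used exactly this way later in the thread); check the API for the full list.

using Bessels

x = 12.3
v = 1.7

besselj(v, x)          # Bessel function of the first kind, J_v(x)
bessely(v, x)          # Bessel function of the second kind, Y_v(x)
besseli(v, x)          # modified Bessel function of the first kind, I_v(x)
besselk(v, x)          # modified Bessel function of the second kind, K_v(x)
sphericalbesselj(2, x) # spherical Bessel function of the first kind, j_2(x)
airyai(x)              # Airy function Ai(x)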

Benchmarks

Some speed comparisons to SpecialFunctions.jl can be found in the README. A few are shown below, though note that both libraries use different algorithms in different domains.

# Bessel functions of the first kind
julia> @benchmark SpecialFunctions.besselj(v, x) setup=(v=rand()*100; x = rand()*100)
BenchmarkTools.Trial: 4681 samples with 803 evaluations.
 Range (min … max):  162.007 ns … 11.076 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):       1.121 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.329 μs ±  1.505 μs  ┊ GC (mean ± σ):  0.01% ± 0.53%

  █▃▂ ▃▇██▂             ▁▁                                     ▂
  ███▇██████▆▅███▆▆▅▁▃▄███▅▁▃▁▇█▆▅▁▁▁▇▇██▆▅▁▁▁▁▁▁▁▃▇█▆▁▁▁▁▁▁▄█ █
  162 ns        Histogram: log(frequency) by time      9.45 μs <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark Bessels.besselj(v, x) setup=(v=rand()*100; x = rand()*100)
BenchmarkTools.Trial: 10000 samples with 983 evaluations.
 Range (min … max):   58.547 ns … 356.359 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):      99.869 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   144.048 ns ±  80.418 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █
  █▃▂▂▂▂▂█▄▂▂█▂▂▂▂▂▂▂▂▂▂▅▂▂▂▂▁▁▂▁▁▁▁▂▁▁▂▁▁▁▁▁▂▃▃▃▃▄▄▄▃▄▃▃▃▃▃▃▂▂ ▃
  58.5 ns          Histogram: frequency by time          278 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

# Modified Bessel function of second kind and zero order
julia> @benchmark SpecialFunctions.besselk(0, x) setup=(x = rand()*100)
BenchmarkTools.Trial: 10000 samples with 807 evaluations.
 Range (min … max):  121.483 ns …  1.209 μs  ┊ GC (min … max): 0.00% … 70.69%
 Time  (median):     197.039 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   213.963 ns ± 49.302 ns  ┊ GC (mean ± σ):  0.08% ±  1.08%

            ▃ ██
  ▂▁▂▂▂▂▂▂▂▂████▆█▇▆▅▅▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  121 ns          Histogram: frequency by time          449 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark Bessels.besselk(0, x) setup=(x = rand()*100)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  16.096 ns … 25.895 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     18.430 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   18.425 ns ±  0.270 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                                   โ–โ–ˆโ–ˆโ–ˆโ–‚โ–ƒโ–ƒโ–ƒโ–‚โ–ƒ โ–‚
  โ–‡โ–ˆโ–‡โ–„โ–…โ–„โ–โ–ƒโ–โ–โ–โ–โ–„โ–…โ–ƒโ–โ–โ–„โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–โ–„โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ โ–ˆ
  16.1 ns      Histogram: log(frequency) by time      18.8 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Limitations

  1. No support for complex arguments. This is actively being developed, and support for Airy functions in the entire complex plane will be available in the next few days.
  2. Does not yet work with automatic differentiation libraries, though derivatives with respect to the argument are mostly covered by AD rules (a manual workaround using the standard identities is sketched after this list). For derivatives with respect to the order, please see BesselK.jl.
  3. Only single and double precision are supported. In the future we plan to support the Double64 type provided by DoubleFloats.jl.
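
In the meantime, derivatives with respect to the argument can also be written out by hand from the standard identities. A minimal sketch (the dbesselj_dx and dbesselk_dx helper names are just for illustration, not part of the Bessels.jl API):

using Bessels

# Derivatives with respect to the argument via the textbook identities
#   J_v'(x) = (J_{v-1}(x) - J_{v+1}(x)) / 2
#   K_v'(x) = -(K_{v-1}(x) + K_{v+1}(x)) / 2
dbesselj_dx(v, x) = (besselj(v - 1, x) - besselj(v + 1, x)) / 2
dbesselk_dx(v, x) = -(besselk(v - 1, x) + besselk(v + 1, x)) / 2

dbesselj_dx(2.5, 10.0)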

Our current development is focused on tackling these limitations as well as supporting other functions such as the inhomogeneous Airy functions. There is an ongoing discussion about the scope of this project in this issue. Please leave any comments on what types of special functions you would like to see supported or any other feature requests; that would be most helpful in focusing development effort.

And of course I would like to particularly acknowledge @Oscar_Smith for reviewing every single PR, contributing, and being a big motivation to further develop this project. I would also like to thank @cgeoga for providing very valuable additions to the code and helpful discussions.

Please feel free to ask any questions or provide any requests!

38 Likes

Nice 👍

Is the plan to keep this outside of SpecialFunctions.jl, or to eventually merge?

3 Likes

There is some ongoing discussion about this in this issue, and it may be that SpecialFunctions.jl reexports these in the future once there is feature parity (support for complex numbers). There has been a lot of discussion about what to do with native Julia functions in SpecialFunctions.jl (on Discourse and scattered through issues), so I think that would have to happen with care. Happy to hear any thoughts on this!

2 Likes

This is a super nice package and an amazing effort! Looking forward to the next generation of special functions in Julia :clap: :juliaheartpulsing:

4 Likes

In my use cases, I often need j_i(x) for i=0,\dots,n, which I deal with using recursion. Is there any plan to support such use cases?

3 Likes

I was about to ask the same. I know the Fortran libamos implementation supports this. It would be great to have it in pure Julia, with applications in multipole expansions etc.

1 Like

Do you just mean you want the spherical Bessel function for integer orders, or are you asking something more specific? If the former, the package already offers sphericalbesselj.

Thanks everyone for the feedback!

Of course! I have opened an issue and will get this done asap. It is fairly straightforward as the code is already written (several ranges are already filled by recurrence) but there are some small details to consider as I posted in the issue.

@cgeoga I believe they are asking to return a vector of values for a range of orders. This occurs a lot in some physics applications where you are summing over the order of Bessel functions, so you may need thousands of Bessel calls at evenly spaced orders for a fairly slowly decaying sum.
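
To make that pattern concrete, a hypothetical sketch (the coefficients are made up; the point is the many independent calls at evenly spaced orders):

using Bessels

# A truncated sum over Bessel orders with arbitrary coefficients 1/(n + 1).
# Each order is currently an independent besselj call; a ranged/sequence method
# (or manual recurrence) would avoid recomputing every order from scratch.
x = 10.0
orders = 0:1000
coeffs = 1 ./ (orders .+ 1)
series = sum(coeffs .* besselj.(orders, x))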

4 Likes

Bessels.jl v0.2.3 has been released.

This release contains support for the complex Airy functions as well as methods to return a sequence of Bessel functions for many orders as suggested above. For example,

# Integer orders
julia> besselj(0:20, 2.0)
21-element Vector{Float64}:
 0.223890779141235
 0.5767248077568717
 0.35283402861563673
 0.1289432494744017
 0.03399571980756834
 0.007039629755871665
 0.0012024289717899898
 ⋮
 7.183016356018772e-13
 4.506005896294031e-14
 2.6593078051678655e-15
 1.4817372491340163e-16
 7.819243273363721e-18
 3.918972805090719e-19

# Non-integer orders
julia> besselj(0.4:20.4, 2.0)
21-element Vector{Float64}:
 0.4741922813342985
 0.5153678749961258
 0.24732274366027757
 0.07820670978854029
 0.018580069620759396
 0.0035455965428010617
 0.0005661517103663385
 ⋮
 2.3910458293263756e-13
 1.463098085067114e-14
 8.435030183690724e-16
 4.5971668950721226e-17
 2.3756903241980582e-18
 1.1672333872110012e-19

These routines are much faster than broadcasting…

julia> @btime besselj(0:500, 400.0)
  1.515 μs (1 allocation: 4.06 KiB)

julia> @btime besselj.(0:500, 400.0)
  76.910 μs (1 allocation: 4.06 KiB)

julia> @btime SpecialFunctions.besselj.(0:500, 400.0)
  805.024 μs (1 allocation: 4.06 KiB)

# And just as fast for non-integer orders
julia> @btime besselj(0.2:500.2, 400.0)
  1.962 μs (1 allocation: 4.06 KiB)

julia> @btime besselj.(0.2:500.2, 400.0)
  65.365 μs (1 allocation: 4.06 KiB)

julia> @btime SpecialFunctions.besselj.(0.2:500.2, 400.0)
  1.256 ms (502 allocations: 11.89 KiB)

Thank you everyone for the feedback and any other feature requests are very welcome!

-Michael

9 Likes

Note that, as usual in Julia, loops are so fast that vectorized functions are typically suboptimal, so in many applications it may be faster not to allocate a whole vector of values (e.g. to sum a Bessel series) and instead just call besselj(n, x) for the last two n values and then apply the recurrence J_{n-1}(x) = \frac{2n}{x} J_{n}(x) - J_{n+1}(x) yourself in your loop (to sum the series or whatever).

(Of course, this requires you to know the existence of the recurrence.)
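
A minimal sketch of that approach, with placeholder coefficients c_n = 1/(n + 1) (note the caveat discussed a few posts below: the seeds underflow to zero when N is much larger than x):

using Bessels

# Sum c_n * J_n(x) over n = 0:N without allocating a vector of Bessel values,
# seeding the downward recurrence J_{n-1}(x) = (2n/x) J_n(x) - J_{n+1}(x)
# from the two highest orders.
function bessel_series_sum(x, N)
    jnp1 = besselj(N + 1, x)   # J_{N+1}(x)
    jn   = besselj(N, x)       # J_N(x)
    out  = zero(x)
    for n in N:-1:0
        out += jn / (n + 1)                  # add c_n * J_n(x)
        jnp1, jn = jn, (2n / x) * jn - jnp1  # step down to (J_n, J_{n-1})
    end
    return out
end

bessel_series_sum(50.0, 100)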

1 Like

Yeah. That is how the vectorized version is implemented. This happens to be a case where having vectorized versions is useful because there are a lot of different cases you may be in, so it's complicated enough that it's just easier to use the vectorized version.

1 Like

Yes - definitely. You could potentially avoid this allocation completely if you wanted and do the recurrence relation within your loop. One thing (from the user end) that I wanted to flag is that, because besselj must use backward recurrence, there is the potential that besselj(nu[end], x) = 0.0, and then the recurrence won't work. This actually happens pretty quickly if x is small. For example,

julia> besselj(160, 1.0)
0.0

so the user would need to be careful in that scenario. On the other hand, besselj can use forward recurrence when nu < x, so I handle that case as well, which is in general faster. This is good to point out if that allocation is a deal-breaker, but there are some important details to handle for a lot of the edge cases.
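
For instance, a simple guard along these lines (just a sketch; nothing like this is provided by the package) catches that failure mode before starting the downward recurrence:

using Bessels

# Hypothetical check: if both seeds underflowed to zero, downward recurrence
# would only ever produce zeros, so bail out instead.
function downward_recurrence_seeds(v_top, x)
    j_top   = besselj(v_top, x)
    j_topp1 = besselj(v_top + 1, x)
    if iszero(j_top) && iszero(j_topp1)
        error("besselj($v_top, $x) underflowed; downward recurrence would return zeros")
    end
    return j_top, j_topp1
end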

One other thing I was wondering about is whether this allows for better vectorization of inner loops, as I don't believe LoopVectorization.jl can handle special functions that well. It might be best (as recurrence can't, to my knowledge, be vectorized well) to compute the calls to Bessel functions outside your loop and then have the inner loop possibly be able to vectorize. This is a bit outside my knowledge though, and I know @Oscar_Smith and @Elrod have talked about better vectorized versions of special functions.

1 Like

And to look at a very specific application of @stevengj's point (and to provide an example of how to implement this), consider computing the Struve function in terms of its Bessel series (http://dlmf.nist.gov/11.4.E19):

H_{\nu}(z) = \left(\frac{z}{2\pi}\right)^{1/2} \sum_{k=0}^\infty \frac{(z/2)^k}{k! \, (k+1/2)} J_{k+\nu+1/2}(z)

In code using Bessels.jl this can be computed like…

function struveh_bessel_series(v, x::T; Iter=100) where T
    x2 = x / 2
    two_x = 2 / x
    out = zero(T)

    # seed the backward recurrence with the two highest-order terms,
    # besselj(Iter + v + 1/2, x) and besselj(Iter + v + 3/2, x)
    jnup1 = besselj(Iter + T(3)/2 + v, x)
    jnu = besselj(Iter + T(1)/2 + v, x)

    # avoid overflow
    x2_pow = x2^(Iter/2)
    a = x2_pow
    for k in Iter:-1:0
        out += a / (k + T(1)/2) * jnu
        a *= k / x2
        jnup1, jnu = jnu, muladd((k + T(1)/2 + v)*two_x, jnu, -jnup1)
    end
    return out*sqrt(x2 / π) * (x2_pow / Bessels.gamma(Iter + 1))
end

This can be computed exceptionally fast without allocating…

julia> @btime struveh_bessel_series(10.0, x; Iter=100) setup=(x=rand()+50.0)
  339.830 ns (0 allocations: 0 bytes)
2.259262882648611e6

This is actually faster (including a gamma call and everything) than simply allocating and returning the sequence…

julia> @btime besselj(0:110, 50.0)
  537.877 ns (1 allocation: 1008 bytes)

What might be less obvious is that using fewer terms in the sum is slower…

julia> @btime struveh_bessel_series(10.0, x; Iter=45) setup=(x=rand()+50.0)
  671.843 ns (0 allocations: 0 bytes)

because the starting values of the Bessel functions hit very optimized branches (nu >> x) instead of a suboptimal branch (nu ~ x). There are more details in my PR on calculating the Struve functions, which reminds me to just move all the Struve function calculations to Bessels.jl because of the code overlap. Though if everyone prefers that as a separate package, let me know.

I'd still urge caution when doing this because it is possible you will underflow, hit a slow branch in the Bessel functions, use an unstable recurrence, or just implement the recurrence suboptimally. Otherwise, as I showed here, you can beat the allocating version by avoiding the allocation.

5 Likes