Evaluate integral on many points (Cubature.jl ?)


#1

I need to compute an integral on the real line. The tricky part is that I need to compute this integral from many different starting points.

The integration package Cubature.jl, which I was trying to use, allows to compute on only one interval at a time. The best solution would seem to sort all the points on which I’d want to evaluate f (say x_0,x_1 and so on) and use pquadrature on every small interval. However this still doesn’t seem very efficient (if I have many points it looks like I’m better off manually approximating g with a piecewise linear function and then computing everything analytically, exploiting the fact that the piecewise approximation has to be computed only once).

I wanted to ask whether there exist a package/some already implemented functionality to improve on my manual solution (I’ve also tried Dierckx, to get a spline that approximates the function and then compute the integral on the spline, but that also doesn’t seem to improve things).


#2

This sounds like a perfect application for the ApproxFun package. Basically, it will approximate your integrand g(s) by a polynomial to nearly machine precision on some interval, and then (if you call cumsum) it will compute the integrated polynomial on that interval. Then you just evaluate the resulting polynomial to get the integral (super quickly) for any starting point you want.

(This assumes that your integrand is smooth, or at least piecewise smooth, so that ApproxFun can efficiently represent it by polynomials.)

(ApproxFun will be much more efficient than spline fits, assuming you have a smooth function, because it can get high accuracy with many fewer points than with low-degree splines.)


#3

Thanks a lot! In my case the integrand is indeed smooth. It goes to 0 at infinity though, so maybe a polynomial fit is not ideal. How does ApproxFun handle these cases? It uses piecewise polynomial or it relies on the fact that the interval is finite?


#4

Yes ApproxFun can handle approximation on Rays:

julia> f = Fun(x->(1+exp(-x))/(x+1),0..Inf)
Fun(Chebyshev(【0.0,∞❫),[0.850398,-1.00477,0.158454,0.00816475,-0.00882948,-0.00405104,-0.000412124,0.000567883,0.000435382,0.000147932  …  -5.33649e-14,-4.57939e-14,-3.28735e-14,-1.92734e-14,-7.62483e-15,6.55032e-16,5.56645e-15,7.45874e-15,7.36911e-15,5.95701e-15])

julia> g = cumsum(f)
Fun(Chebyshev(【0.0,∞❫)⊕log((1+x)^1.0*(1-x)^0.0)[Chebyshev(【0.0,∞❫)],[1.06937,-1.0,0.308743,0.0,-0.0833093,0.0,-0.0136079,0.0,0.00491328,0.0  …  0.0,1.90336e-14,0.0,1.03703e-14,0.0,4.70381e-15,0.0,1.59985e-15,0.0,3.01621e-16])

Unfortunately, evaluation isn’t optimized so it will be a bit slow. You can work around it as follows:

julia> a,b=vec(g);

julia> @time a(2.0)+b(2.0)
  0.000047 seconds (85 allocations: 1.781 KB)
1.659490473772144

julia> sum(Fun(x->(1+exp(-x))/(x+1),0..2))  # matches sum over just 0..2
1.6594904737721394

#5

PS depending on how the integrand is stored, you may do better by manipulating Funs to construct the integrand, which can correctly determine the behaviour at infinity:

julia> x=Fun(0..Inf)
Fun((1-x)^-1.0[Chebyshev(【0.0,∞❫)],[1.0,1.0])

julia> f=(1+exp(-x))/(1+x^2)
Fun((1-x)^2.0[Chebyshev(【0.0,∞❫)],[0.500486,-0.162861,-0.147749,0.0471767,0.0276445,-0.0107343,-0.00663013,0.00137598,0.00135054,5.67424e-5  …  3.36514e-31,-2.37779e-31,-5.77252e-32,4.08095e-32,9.91499e-33,-6.98921e-33,-1.68054e-33,1.20356e-33,2.49961e-34,-1.97106e-34])


julia> g = cumsum(f)
Fun(Chebyshev(【0.0,∞❫),[1.18851,1.14872,-0.105019,-0.0584644,0.0144777,0.00685493,-0.00201837,-0.0011401,0.000164904,0.000158249  …  2.09702e-33,-1.47401e-33,-3.56001e-34,2.50255e-34,6.03934e-35,-4.24496e-35,-9.95106e-36,7.18289e-36,1.27531e-36,-1.00054e-36])

julia> @time g(2.0)
  0.000029 seconds (39 allocations: 880 bytes)
1.7124747491129804

#6

By the way, another way to think about what you doing is that your integral defines a new special function, in much the same way that standard special functions like erf(x) or Ei(x) are defined in terms of integrals with x in the integration limits. (I assume that you have already checked, e.g. with Wolfram Alpha, that your integral cannot be defined in terms of standard special functions.)

When people really care about the performance of evaluating special functions (e.g. if you are evaluating them millions of times in an inner loop), then they usually break the domain into pieces and use carefully tuned approximations in each piece. (e.g. Taylor series, minimax polynomial or rational-function fits, continued-fraction expansions, asymptotic series, etcetera.) Often, you take advantage of special properties of the function to accelerate things (e.g. reflection formulas, recurrence relations, or similar). It’s a bit of an art, and each new special function is different.

If you just want to approximate it on the real line, things are often quite a bit easier than if you need evaluation on the whole complex plane. Usually, you can work out the large-x asymptotics analytically, so that you don’t need to fit polynomials over infinite intervals. Once you’ve reduced it to finite intervals on the real line, an effective approach is usually some kind of minimax rational-function fits (maybe broken into several intervals), which various software can compute for you (e.g. the Remez.jl package in Julia).


#7

Thanks again for the very interesting comment. The main application that I have in mind is to compute gradient and hessian of cdf of some distributions with respect to the parameters of those distributions. So I guess that, when there is no close formula, those cdf are already implemented with the method you recommend (splitting the domain and building careful approximations). I’m not sure whether those constructions are easy to find and work with, to extend them from

to:

(if you are aware of some way to do so using the already implemented approximations, please let me know). So I’ll try to compute everything analytically for large x and use ApproxFun or Remez for small x.


#8

This seems like a smart thing to do for functions going to 0 at infinity. However it doesn’t seem to work with pdfs. For example:

x = Fun(0..Inf)
f2 = pdf(Gamma(),x)

gives

MethodError: no method matching pdf(::Distributions.Gamma{Float64}, ::ApproxFun.Fun{ApproxFun.JacobiWeight{ApproxFun.Chebyshev{ApproxFun.Ray{false,Float64}},ApproxFun.Ray{false,Float64}},Float64})
Closest candidates are:
  pdf(::Distributions.Gamma{T<:Real}, ::Real) at /Users/pietro/.julia/v0.5/Distributions/src/univariates.jl:320
  pdf(::Distributions.Distribution{Distributions.Univariate,Distributions.Continuous}, ::Real) at /Users/pietro/.julia/v0.5/Distributions/src/univariates.jl:102
  pdf(::Distributions.Distribution{Distributions.Univariate,S<:Distributions.ValueSupport}, ::AbstractArray{T,N}) at /Users/pietro/.julia/v0.5/Distributions/src/univariates.jl:205

Would it be hard to add support for this change of x axis on Distributions.jl? I guess one can do the change of axis to a finite domain manually, but that doesn’t seem ideal…

p.s. Congratulations on the ApproxFun package, I’ve been playing with it, it seems really cool!


#9

Do Fun(x -> pdf(Gamma(),x), 0..Inf)


#10

I’ve tried benchmarking ApproxFun a bit and I’m a bit confused by what I found.
Good news is the polynomials of relatively low degree can represent what I need after a careful re-parametrization (say degree around 50 does a very good job). Bad (and confusing) news is that the evaluation of those polynomials is much slower than what one would expect:
Example:

func =Fun(x -> 1. + x + x^2, 0..1)
@benchmark func(1.5)
BenchmarkTools.Trial: 
  memory estimate:  576.00 bytes
  allocs estimate:  26
  --------------
  minimum time:     1.356 μs (0.00% GC)
  median time:      1.645 μs (0.00% GC)
  mean time:        1.807 μs (4.00% GC)
  maximum time:     409.955 μs (97.66% GC)
  --------------
  samples:          10000
  evals/sample:     10
  time tolerance:   5.00%
  memory tolerance: 1.00%

Which is surprising: this computation should be on the order of nanoseconds (the polynomial has degree 2), instead it takes longer than evaluating the cdf of which I want to compute the partial derivatives.
I want to ask two things:

  • Is there a way to optimize this computation / is it WIP in the package?
  • I need to evaluate the polynomial on many points (~500000) so I’m happy to pay the price of converting the Chebyshev representation to a simpler/more efficient representation (normal polynomial series, i.e. a_0+a_1x+a_2x^2 and so on), is there a way to do so within ApproxFun ?

Thanks again for your help!


#11

Evaluation checks whether 1.5 in domain(func) and returns zero, so the cost is actually independent of the degree of polynomial (try Fun(Chebyshev(0..1),rand(1_000_000)).) If you call extrapolate it skips this check:

julia> @benchmark extrapolate(f,1.5)
BenchmarkTools.Trial: 
  memory estimate:  16.00 bytes
  allocs estimate:  1
  --------------
  minimum time:     37.703 ns (0.00% GC)
  median time:      40.176 ns (0.00% GC)
  mean time:        45.279 ns (1.87% GC)
  maximum time:     1.575 μs (95.04% GC)
  --------------
  samples:          10000
  evals/sample:     992
  time tolerance:   5.00%
  memory tolerance: 1.00%

This is about 4x faster than the equivalent code evalcfs(cfs,x) = cfs[1] + cfs[2]*(2x-1) + cfs[3]*(2x-1)^2.


#12

I submitted an issue about in being too slow:


#13

The extrapolate trick solves it! A fairer comparison would be with the horner method rather than the naive cfs[1] + cfs[2]*(2x-1) + cfs[3]*(2x-1)^2 (or with some metaprogramming tricks to develop the whole expression representing the series of multiplications and additions for a given polynomial) but performance makes much more sense now.


#14

Right. ApproxFun uses Clenshaw’s algorithm, which is the analogue of Horner’s scheme for Chebyshev series. I believe @MikaelSlevinsky added @clenshaw in src/LinearAlgebra/clenshaw.jl, which I think is the meta-programming version, but I don’t know how to use it.


#15

You also probably want to make f constant or interpolate it like:

julia> @benchmark extrapolate($f,1.5)

#16

The issue with the macro is that it needs to now the degree of the polynomial at compile time to be able to work. However a possible solution would be to use generated functions:

@generated function generated_horner{N, R<:Real}(x::R, p::NTuple{N,R})
    ex = :(p[$N])
    for i = N-1:-1:1
        ex = :(muladd(x, $ex, p[$i]))
    end
    return ex
end
const u = tuple(rand(5)...)
gen_eval(x) = gen_horner(x,u)
@benchmark $gen_eval(2.3)
BenchmarkTools.Trial: 
  memory estimate:  0.00 bytes
  allocs estimate:  0
  --------------
  minimum time:     2.293 ns (0.00% GC)
  median time:      2.308 ns (0.00% GC)
  mean time:        2.633 ns (0.00% GC)
  maximum time:     47.171 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
  time tolerance:   5.00%
  memory tolerance: 1.00%

whereas a metaprogramming free implementation would give:

function normal_horner(x, a)
    b= a[end]
    for i = length(a)-1:-1:1
        b = a[i] + b * x
    end
    return b
end

@benchmark $normal_horner(2.3, uu)

BenchmarkTools.Trial: 
  memory estimate:  0.00 bytes
  allocs estimate:  0
  --------------
  minimum time:     15.868 ns (0.00% GC)
  median time:      16.308 ns (0.00% GC)
  mean time:        17.826 ns (0.00% GC)
  maximum time:     146.533 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     998
  time tolerance:   5.00%
  memory tolerance: 1.00%

but it’s also true that for larger polynomials the difference seems to decrease, I don’t know on what it depends. Still, if your polynomials will be evaluated many times, maybe you could consider this approach.


#17

I’ve been thinking of a plan_evaluate to make evaluation at many points fast, with the following usage:

f = Fun(...)
P = plan_evaluate(f)
x = rand(1_000_000)
vals = P.(x)  

Then plan_evaluate can take care of setting up an optimized Clenshaw/Horner implementation.


#18

Then you could just move plan_evaluate inside the constructor as well?


#19

Btw, I’ve just realized that the macro technique doesn’t really work for Chebyshev polynomial, because the expression grows exponentially in the degree of the polynomial (my computer crashes already for degree = 50), so the best would be translate to power series and apply the generated function technique there. I’ll try and see if I can implement a fast way of converting from Chebyshev to power series.


#20

Converting to power series is unstable.