Numerical integration for arrays

I have a function f(x1, x2) that returns an array. I need to compute a definite integral of each element of the returned array over the (x1, x2) space.

I am considering writing a Monte Carlo integration inside function f. But is there a better way of doing this?


Have you looked at Cubature.jl?

https://github.com/JuliaMath/Cubature.jl

There’s also

https://github.com/giordano/Cuba.jl


HCubature.jl is a native Julia port of Cubature.jl and will be easier to use for this sort of thing, because it can integrate basically any type of Julia object (that lives in a normed vector space). For example,

julia> using HCubature

julia> hcubature(x -> [x[1]*x[2]^2, x[1]-x[2]], [0,0], [1,2])
([1.333333333333333, -1.0], 4.440892098500626e-16)

computes \int_0^1 dx_1 \int_0^2 dx_2 \begin{pmatrix} x_1 x_2^2 \\ x_1 - x_2 \end{pmatrix} = \begin{pmatrix} 4/3 \\ -1 \end{pmatrix}.

(If your integrand consists of small vectors like this, you might want to return an SVector from StaticArrays.jl.)
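
For instance, a sketch of the same example with a static vector (StaticArrays.jl; the integrand just returns an SVector instead of an Array, avoiding a heap allocation per evaluation):

using HCubature, StaticArrays

# same integral as above, but the integrand returns a stack-allocated
# SVector; hcubature then returns an SVector result as well
hcubature(x -> SVector(x[1]*x[2]^2, x[1] - x[2]), [0,0], [1,2])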

In low dimensions (< 7) for smooth functions, Monte Carlo integration is usually not competitive with cubature schemes based on polynomial interpolation, such as HCubature.


Thanks for your reply!
I read the documentation but I’m still not sure whether this would work in my case (sorry, I’m still new to Julia!). Let me describe what I am trying to do.

I defined the following function

function myfunc(δ, θ2, y, c, d, ϵ1, ϵ2)
    N = size(y, 1)
    βY = θ2[1:5]      # unpack the 13 parameters stored in θ2
    γ0 = θ2[6]
    γ1 = θ2[7:11]
    τ = θ2[12]
    κ = θ2[13]
    s = d + 1/sqrt(τ) * ϵ1
    p = p0./(p0 .+ (ones(N) - p0) .* exp.((0.5*ones(N) - s).*τ))
    Eu = δ .+ y*βY - (γ0 .+ y*γ1).*(ones(N) - c).*p .+ ϵ2 .- (log(κ) - 1)
    b = isequal.(sign.(Eu), 1.0)      # elementwise indicator of Eu .> 0
    return b
end

The inputs are: δ (scalar), θ2 (13-by-1 vector), y (N-by-5 matrix), and c, d, ϵ1, ϵ2 (N-by-1 vectors). This function returns an N-by-1 vector, and N is around 1000. What I really want is a vector whose elements are the expectations of b’s elements over ϵ1, ϵ2, which are standard normal variables and mutually independent.

Is it possible to do the integration within the function, so that instead of taking ϵ1, ϵ2 as inputs, the function directly returns the calculated expectations?

I take it that these are N samples of the distributions, and for any sample they are just scalars.

Yes, if I understand you correctly, just pass the function that computes b(ϵ1, ϵ2) to an integration routine (weighted by the normal distribution for expectation values with Gaussian ϵ). You can do this as an anonymous function ϵ -> ... within the function, as long as your inputs give you enough information to compute b for an arbitrary ϵ. (It’s not clear if you have enough information to do this, though; e.g. is the input y supposed to be a function y(ϵ) in general? Of course, you can pass function arguments if needed.)

A numerical difficulty you might encounter, however, is that isequal.(sign.(Eu), 1.0) looks like you will need to integrate a discontinuous indicator function. Discontinuous functions are rather expensive to integrate numerically (unless you can exploit analytical knowledge of the discontinuity), but in 2d it might not be too bad. Be sure to specify a coarse tolerance to the cubature routine, e.g. rtol=0.01, since integrating to high accuracy (the default is 8 digits) might not be tractable.

It’s hard to say more without an actual working example that shows how to get the inputs to your routine. (Also, you’ll want a function that returns your integrand b for given scalar ϵ1, ϵ2.)
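
To make the pattern concrete, here is a minimal sketch (the indicator b here is a stand-in, not your model; the Gaussian weight turns the integral into an expectation):

using HCubature

b(ϵ1, ϵ2) = Float64(ϵ1 + 2ϵ2 > 0)            # stand-in indicator; replace with your b
ϕ2(ϵ1, ϵ2) = exp(-(ϵ1^2 + ϵ2^2)/2) / (2π)    # standard bivariate normal density

# truncate the infinite domain to [-5,5]^2 and use a coarse tolerance,
# since the integrand is discontinuous
E, err = hcubature(ϵ -> b(ϵ[1], ϵ[2]) * ϕ2(ϵ[1], ϵ[2]), [-5,-5], [5,5], rtol=1e-2)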


In my case, the input y is a numerical matrix that does not depend on ϵ. I tried to write the terms inside the function as functions of (ϵ1, ϵ2):

function myfunc(δ, θ2, y, c, d)
    N = size(y, 1)
    βY = θ2[1:5]
    γ0 = θ2[6]
    γ1 = θ2[7:11]
    τ = θ2[12]
    κ = θ2[13]
    s(ϵ1) = d + 1/sqrt(τ) * ϵ1                                          
    p(ϵ1) = p0./(p0 .+ (ones(N) - p0) .* exp.((0.5*ones(N) - s(ϵ1)).*τ))   
    Eu(ϵ1, ϵ2)  = δ .+ y*βY - (γ0 .+ y*γ1).*(ones(N) - c).*p(ϵ1) .+ ϵ2 .- (log(κ) - 1)
    b(ϵ1, ϵ2) = isequal.(sign.(Eu(ϵ1, ϵ2)), 1.0)         
    integrand(ϵ1, ϵ2) = b(ϵ1, ϵ2).*(1/(2*π)*exp.(-(ϵ1.^2 .+ ϵ2.^2)/2))
    (expectation, err) = hcubature(ϵ -> integrand(ϵ1, ϵ2), [-5,-5], [5,5])
    return expectation
end

I got the error message “ERROR: UndefVarError: ϵ1 not defined”, probably because the way I call hcubature is wrong?

The parameter values I use are:

delta = -20
betaY = [0.05; -0.01; -0.1; 0.25; 0.11] 
gamma0 = 0.05
gamma1 = [0.1; -0.05; -0.05; -0.1; -0.1]
tau = 1
kappa = 1
theta2 = [betaY; gamma0; gamma1; tau; kappa]

myfunc(delta, theta2, y, c, d)

Yes, the anonymous function call inside hcubature is wrong.
You probably meant ϵ -> integrand(ϵ[1], ϵ[2]): given a collection ϵ = [ϵ1, ϵ2] as input, it passes the first and second elements to integrand.


(Side note: you can do (1 .- p0) here and avoid allocating a vector of 1’s. Similarly, your other calls to ones are unnecessary. Also, p0 isn’t defined in your code; is it a global?
If p0 is a scalar, then p(ϵ1) is a scalar function and you can omit all of the dots in that function.)
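
For example, a sketch of the allocation-light style (with hypothetical values; here p0 and d are taken to be N-vectors):

p0 = rand(5); d = rand(5); τ = 1.0    # hypothetical N = 5 data
s(ϵ1) = d .+ ϵ1 / sqrt(τ)
# 1 .- p0 and 0.5 .- s(ϵ1) broadcast a scalar against the vectors,
# so no ones(N) temporaries are allocated
p(ϵ1) = p0 ./ (p0 .+ (1 .- p0) .* exp.((0.5 .- s(ϵ1)) .* τ))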


I guess I can’t use “integrand(ϵ[1], ϵ[2])” because ϵ1, ϵ2 are both N-by-1 vector-valued. If I call

hcubature(ϵ -> integrand(ϵ[1], ϵ[2]), [-5,-5], [5,5])

I got the error:

ERROR: MethodError: no method matching +(::Array{Int64,1}, ::Float64)
For element-wise addition, use broadcasting with dot syntax: array .+ scalar.

Yes, p0 is a global N-by-1 vector.
I am integrating over an indicator function because I want to compute the probability of an event. Given that, would hcubature be more efficient than Monte Carlo if we want the same precision?

It depends on what the function looks like and what accuracy you need. hcubature is adaptive — it will place more quadrature points around the discontinuities, which will help if there are only discontinuities in a few places or if you need relatively high accuracy. Monte Carlo converges slowly, but it’s also relatively insensitive to how discontinuous the function is.
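
For a sense of the contrast, a crude Monte Carlo sketch for this kind of expectation (the indicator is a stand-in, not your b):

# average the integrand over n standard-normal draws; the error shrinks
# like 1/sqrt(n) whether or not b is discontinuous
function mc_expectation(b, n)
    acc = 0.0
    for _ in 1:n
        acc += b(randn(), randn())
    end
    return acc / n
end

mc_expectation((ϵ1, ϵ2) -> Float64(ϵ1 + ϵ2 > 0), 10^6)    # ≈ 0.5 for this stand-in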


I’m confused. I thought ϵ1, ϵ2 (= ϵ[1], ϵ[2] once you use hcubature) are your integration variables, in which case they must be scalars?

Oh, let me clarify a bit. What the function does is an element-wise calculation, but I wrote the input and output as vectors. So ϵ1, ϵ2, and the output are all N-by-1 vectors. For each i = 1:N, the integration is over ϵ1[i] and ϵ2[i].

Should I rewrite the function in scalar form to make the integration work?

I tried the scalar version of the function. The integration is much slower than what I expected:

@time hcubature(ϵ -> integrand(ϵ[1], ϵ[2]), [-5,-5], [5,5])
 44.020727 seconds (1.16 G allocations: 23.453 GiB, 6.95% gc time)
(0.9914009943563031, 1.4772989088891442e-8)

Then I followed your advice to specify a coarse tolerance. It became much faster:

julia> @time hcubature(ϵ -> integrand(ϵ[1], ϵ[2]), [-5,-5], [5,5], rtol=0.01)
  0.249641 seconds (657.46 k allocations: 37.089 MiB)
(0.9950785576667626, 0.0041406851841439)

(It will probably become even faster if you modify it to not use global variables. Ideally, if you do @btime integrand(0.3,0.4) it should report 0 allocations.)
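
A sketch of what that check could look like (BenchmarkTools.jl, with a hypothetical integrand built as a closure over its parameters instead of reading globals):

using BenchmarkTools

# pass the parameters in as arguments and close over them, so the
# integrand never reads untyped globals at run time
function make_integrand(δ, τ, p0)
    m(ϵ1) = δ + p0 * ϵ1 / sqrt(τ)    # stand-in for the model's Eu terms
    return (ϵ1, ϵ2) -> (m(ϵ1) + ϵ2 > 0) * exp(-(ϵ1^2 + ϵ2^2)/2) / (2π)
end

integrand = make_integrand(-0.2, 1.0, 0.3)
@btime $integrand(0.3, 0.4)    # goal: 0 allocations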

Note also that if you reduce the tolerance then you can probably also reduce the integration domain, since at 1% tolerance you don’t care about the tails of the Gaussians.
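
A quick way to see how far the box can shrink (StatsFuns.jl for the normal CDF; since 0 ≤ b ≤ 1, the probability mass outside the box bounds the truncation error):

using StatsFuns

# probability that independent standard normals (ϵ1, ϵ2) fall outside [-a, a]^2
outside(a) = 1 - (normcdf(a) - normcdf(-a))^2

outside(3.0)    # ≈ 5.4e-3, already negligible at a 1% tolerance
outside(5.0)    # ≈ 1.1e-6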


My code is working but I am frustrated by the speed. I am trying to find the “right value of delta” by minimizing the squared distance between observed binary choices and model-predicted choice probabilities. My code for model-predicted probability:

function choice_prob_given_θ2(δ, θ2, y, c, d, p0)
    βY = θ2[1:5]
    γ0 = θ2[6]
    γ1 = θ2[7:11]
    τ = θ2[12]
    κ = θ2[13]
    s(ϵ1) = d + 1/sqrt(τ) * ϵ1                                             
    p(ϵ1) = p0/(p0 + (1 - p0) * exp((0.5 - s(ϵ1))*τ))           
    Eu(ϵ1, ϵ2)  = δ + y'*βY - (γ0 + y'*γ1)*(1 - c)*p(ϵ1) + ϵ2 - (log(κ) - 1)
    b(ϵ1, ϵ2) = isequal(sign(Eu(ϵ1, ϵ2)), 1.0)                  
    integrand(ϵ1, ϵ2) = b(ϵ1, ϵ2)*(1/(2*π)*exp(-(ϵ1^2 + ϵ2^2)/2))    
    (expectation, err) = hcubature(ϵ -> integrand(ϵ[1], ϵ[2]), [-3,-3], [3,3], rtol=0.01)
    return expectation
end

and sum of squared distance:

function nls_obj(δ, θ2, y, c, d, p0, B)
    prob_given_θ2 = [choice_prob_given_θ2(δ, θ2, y[i,:], c[i], d[i], p0[i]) for i = 1:size(y,1)]
    obj = (B - prob_given_θ2)'*(B - prob_given_θ2)
    return obj
end

Each call of nls_obj really takes a while, especially when delta gets close to the “right value”. Is there a way to speed it up further? Do you have any suggestions for how to run the minimization?

I replaced the 2d integration with a 1d integration over a normal CDF, using normcdf from StatsFuns.jl. The program gives the same results but is hundreds of times faster. Very happy with this solution!
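
For anyone following along, here is roughly what that reduction looks like (a sketch, assuming as in the code above that ϵ2 enters Eu additively with coefficient 1, so the inner integral is a normal CDF; m(ϵ1) stands for everything in Eu except ϵ2):

using HCubature, StatsFuns

# P(Eu > 0 | ϵ1) = P(ϵ2 > -m(ϵ1)) = normcdf(m(ϵ1)), so
# E[b] = ∫ normcdf(m(ϵ1)) * normpdf(ϵ1) dϵ1, a smooth 1d integrand
function choice_prob_1d(m; rtol=1e-6)
    expectation, err = hquadrature(ϵ1 -> normcdf(m(ϵ1)) * normpdf(ϵ1), -8, 8, rtol=rtol)
    return expectation
end

choice_prob_1d(ϵ1 -> 0.5 - 0.3*ϵ1)    # stand-in m(ϵ1), not the real model

Because the discontinuity is gone, a tight tolerance is cheap here.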


Yes, doing one of the integrals analytically (using special functions as needed) is the way to go if it is an option, especially for a function with discontinuities.
