Computing an integral for expectation over a normal distribution

Solved: The main problem was that Normal accepts Normal(μ,σ). I thought it was Normal(μ,σ^2). The other discrepancies are still a little mysterious but this explains why my Gauss Hermite was so off.

I want to compute the integral of a logit function with respect to a normal distribution. Specifically I want to compute: \int_{-\infty}^{\infty} \frac{exp(5\beta)}{1+\exp(5\beta)} \cdot \exp( \frac{-(\beta - \mu)^2}{2\sigma^2}) d\beta with \mu=0.5, \sigma = 2.

I try three different approaches to computing the integral: 1) GaussGK, 2) Monte Carlo, 3) Gauss-Hermite with FastGaussQuadrature package.

For the first two approaches I get the same answer up to 3 digits but the third approach (Gauss-Hermite) is not close to the other two. Question: why not? I also would’ve thought the first two approaches would have been a little closer, but that’s not my main concern.

First I write a function for the logit part (to deal with big numbers problems) using softmax from StatsFuns package and check that it’s correct.

function f(β,x)
   exp(β'*x)/(1 + exp(β'*x)) 
end

function logi(β,x=5)
    softmax([β'*x,0])[1]
end

# check to make sure logi == f
x = 5*rand(10)
β = 5*rand(10)

f(β,x) == logi(β,x) # -> true

1) QuadGK package:

using StatsFuns
using Distributions
d = Normal(0.5,2)
function bilog(β, x=5, dist=Normal(0.5,2))
   return logi(β,x) * pdf(dist, β)
end

using QuadGK
tol = 10^-14
quant_tol_l, quant_tol_u = (quantile(d,tol), quantile(d,1-tol) ) 
# bounds chosen so that not much is left over 

integral, err = quadgk(bilog, quant_tol_l, quant_tol_u, rtol=tol) 
# -> (0.5971674, 5.3732205e-15)

2) Monte Carlo approach

nodes = quantile(d,rand(1000000))
mean(logi.(nodes, 5)) # -> 0.5975626 
# would have thought 0.5975626 would match 0.597167 from 
# above a little more closely but not too bad
# I have run this a few to times since it is random so this is not a "special" draw.

3) Gauss-Hermite approach.

Here’s where the real problem is. I try to use Gauss-Hermite quadrature which is described here. I follow the formula for the change of variables as described there but I am getting a significantly different answer from above. Any idea why?

using FastGaussQuadrature
nodes, weights = gausshermite( 1000000 )
sum(weights) # -> 1.772 should this be 1?
vals = (β -> logi(β*sqrt(2)*sqrt(2) + 0.5)).(nodes) 
(1/sqrt(π))*vals'*weights # -> 0.63406 not equal .597

See if GitHub - QuantEcon/Expectations.jl: Expectation operators for Distributions.jl objects works. It hopefully does the change of variables correctly, so you could use it or copy the logic

(Note that you can just integrate from -Inf to +Inf, and it will give you about the same accuracy and efficiency.)

1 Like

The weights sum to 1.772 because they are not normalized – that is what the 1/sqrt(pi) is doing.

Not sure why the result is different.

Looks like you forgot the Jacobian factor from the change of variables?

The result of QuantEcon Expectations is:

using Expectations
dist = Normal(0.5, 2)
E = expectation(dist)
E(β -> logi(β))

which gives 0.58975 as opposed to 0.5971674 which is what I got from the first approach. confirms there something wrong with my Gauss-Hermite.

Ok I see (at least) the main problem! Normal accepts Normal(μ,σ). I thought it was Normal(μ,σ^2). IMHO that’s confusing since N(μ,σ^2) is the standard in my experience but what I know. Other differences seem to be from different number of nodes used (though monte carlo still seems to be a bit off).

Do you use the same number of nodes to compare?

No, you can see he used 1m nodes for bare GH and the default for Expectations, which is like 30?

2 Likes