Solved: The main problem was that Normal accepts Normal(μ,σ). I thought it was Normal(μ,σ^2). The other discrepancies are still a little mysterious but this explains why my Gauss Hermite was so off.
I want to compute the integral of a logit function with respect to a normal distribution. Specifically I want to compute: \int_{-\infty}^{\infty} \frac{exp(5\beta)}{1+\exp(5\beta)} \cdot \exp( \frac{-(\beta - \mu)^2}{2\sigma^2}) d\beta with \mu=0.5, \sigma = 2.
I try three different approaches to computing the integral: 1) GaussGK, 2) Monte Carlo, 3) Gauss-Hermite with FastGaussQuadrature package.
For the first two approaches I get the same answer up to 3 digits but the third approach (Gauss-Hermite) is not close to the other two. Question: why not? I also would’ve thought the first two approaches would have been a little closer, but that’s not my main concern.
First I write a function for the logit part (to deal with big numbers problems) using softmax from StatsFuns package and check that it’s correct.
function f(β,x)
exp(β'*x)/(1 + exp(β'*x))
end
function logi(β,x=5)
softmax([β'*x,0])[1]
end
# check to make sure logi == f
x = 5*rand(10)
β = 5*rand(10)
f(β,x) == logi(β,x) # -> true
1) QuadGK package:
using StatsFuns
using Distributions
d = Normal(0.5,2)
function bilog(β, x=5, dist=Normal(0.5,2))
return logi(β,x) * pdf(dist, β)
end
using QuadGK
tol = 10^-14
quant_tol_l, quant_tol_u = (quantile(d,tol), quantile(d,1-tol) )
# bounds chosen so that not much is left over
integral, err = quadgk(bilog, quant_tol_l, quant_tol_u, rtol=tol)
# -> (0.5971674, 5.3732205e-15)
2) Monte Carlo approach
nodes = quantile(d,rand(1000000))
mean(logi.(nodes, 5)) # -> 0.5975626
# would have thought 0.5975626 would match 0.597167 from
# above a little more closely but not too bad
# I have run this a few to times since it is random so this is not a "special" draw.
3) Gauss-Hermite approach.
Here’s where the real problem is. I try to use Gauss-Hermite quadrature which is described here. I follow the formula for the change of variables as described there but I am getting a significantly different answer from above. Any idea why?
using FastGaussQuadrature
nodes, weights = gausshermite( 1000000 )
sum(weights) # -> 1.772 should this be 1?
vals = (β -> logi(β*sqrt(2)*sqrt(2) + 0.5)).(nodes)
(1/sqrt(π))*vals'*weights # -> 0.63406 not equal .597