Convolution of Dependent Random Variables with Copula

I’m running into domain issues when trying to compute the convolution of two random variables where their joint distribution is defined by a copula. By convolution I mean:

f_{Z}(z) = \int_{-\inf}^{\inf} f_{XY}(x,z-x)dx

Here is how I’m constructing the copula using Copula.jl and Distributions.jl:

X1 = Gumbel(15.,5.)
X2 = MixtureModel([Normal(-6.,0.1),X1],[0.95,0.05])

ρ = 0.4

C = GaussianCopula([1. ρ; ρ 1.])

D1 = SklarDist(C1,(X1,X2))

My first thought was to loop through various ‘z’ values as in:

N = 100
Zs = collect(range(-10.,60.,N))
ConvD = similar(Zs)
for k=1:N
    ConvD[k] = quadgk(x->pdf(D1,[x,Zs[k]-x]),-80.,80.,rtol=1e-6)[1]
end

which errors:

DomainError with 0.0:
integrand produced NaN in the interval (-80.0, 80.0)

Stacktrace:
 [1] evalrule(f::var"#15#16"{Int64}, a::Float64, b::Float64, x::Vector{Float64}, w::Vector{Float64}, wg::Vector{Float64}, nrm::typeof(norm))
   @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\evalrule.jl:39
 [2] #8
   @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:54 [inlined]
 [3] ntuple
   @ .\ntuple.jl:48 [inlined]
 [4] do_quadgk(f::var"#15#16"{Int64}, s::Tuple{Float64, Float64}, n::Int64, atol::Nothing, rtol::Float64, maxevals::Int64, nrm::typeof(norm), _segbuf::Nothing, eval_segbuf::Nothing)
   @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:52
 [5] #50
   @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:83 [inlined]
 [6] handle_infinities(workfunc::QuadGK.var"#50#51"{Nothing, Float64, Int64, Int64, typeof(norm), Nothing, Nothing}, f::var"#15#16"{Int64}, s::Tuple{Float64, Float64})
   @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:189
 [7] quadgk(::var"#15#16"{Int64}, ::Float64, ::Vararg{Float64}; atol::Nothing, rtol::Float64, maxevals::Int64, order::Int64, norm::Function, segbuf::Nothing, eval_segbuf::Nothing)
   @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:82
 [8] top-level scope
   @ In[59]:5

Initially, I’d used a Gamma distribution instead of the Gumbel and ran into this error, which made since given that the code was attempting to evaluate the Gamma outside its support. I switched to the Gumbel since it and the Normal distribution have support across all Reals and didn’t expect to see the same error.

I’m wondering if this issue stems from the pdf of the copula returning NaN, when it should actually be returning 0?
For example,

pdf(D1,[-30., 0.])
#returns NaN

Any insight would be greatly appreciated!

Maybe pass -Inf, 0, +Inf as the integration limits?

By passing -Inf, +Inf as the bounds, quadgk will rescale the integral adaptively to approximate the infinite limits. By putting an explicit subinterval endpoint at 0, the adaptive integration will never evaluate the integrand exactly at that point.

Not sure what you’re planning to do with these values, but presumably some type of interpolation? If you want to interpolate a smooth function over a finite domain, I would strongly consider using Chebyshev interpolation instead, with unequally spaced points. e.g. via FastChebInterp.jl

using FastChebInterp
Zs = chebpoints(N, -10, 60)
ConvD = map(Zs) do z
    quadgk(x->pdf(D1,[x,z-x]), -Inf, 0, Inf, rtol=1e-6)[1]
end
c = chebinterp(ConvD, -10, 60)

at which point you can do c(z) to evaluate the interpolant at any point in the interval. If the function is smooth in z, this should converge exponentially fast with N.

I don’ have an informed opinion on this, but maybe file an issue with Copula.jl?

Thank you for the suggestion to use Chebyshev interpolation- that is a good idea. Still running into domain errors (below), so I’ll file an issue with Copula.jl

X1 = Gumbel(15.,5.)
X2 = MixtureModel([Normal(-6.,0.1),X1],[0.9,0.1])

ρ = 0.4

C = GaussianCopula([1. ρ; ρ 1.])

D1 = SklarDist(C,(X1,X2))



N = 100
Zs = chebpoints(N, -10, 60)
ConvD = map(Zs) do z
    quadgk(x->pdf(D1,[x,z-x]), -Inf, 0, Inf, rtol=1e-6)[1]
end
c = chebinterp(ConvD, -10, 60)
DomainError with -0.5:
integrand produced NaN in the interval (-1.0, 0.0)

Stacktrace:
  [1] evalrule(f::QuadGK.var"#12#21"{var"#2#4"{Float64}, Float64}, a::Float64, b::Float64, x::Vector{Float64}, w::Vector{Float64}, wg::Vector{Float64}, nrm::typeof(norm))
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\evalrule.jl:39
  [2] #8
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:54 [inlined]
  [3] ntuple
    @ .\ntuple.jl:49 [inlined]
  [4] do_quadgk(f::QuadGK.var"#12#21"{var"#2#4"{Float64}, Float64}, s::Tuple{Float64, Float64, Float64}, n::Int64, atol::Nothing, rtol::Float64, maxevals::Int64, nrm::typeof(norm), _segbuf::Nothing, eval_segbuf::Nothing)
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:52
  [5] #50
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:83 [inlined]
  [6] handle_infinities(workfunc::QuadGK.var"#50#51"{Nothing, Float64, Int64, Int64, typeof(norm), Nothing, Nothing}, f::var"#2#4"{Float64}, s::Tuple{Float64, Float64, Float64})
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:169
  [7] #quadgk#49
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:82 [inlined]
  [8] quadgk
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:80 [inlined]
  [9] quadgk
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:77 [inlined]
 [10] #1
    @ .\In[3]:4 [inlined]
 [11] iterate
    @ .\generator.jl:47 [inlined]
 [12] _collect(c::Vector{Float64}, itr::Base.Generator{Vector{Float64}, var"#1#3"}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base .\array.jl:854
 [13] collect_similar(cont::Vector{Float64}, itr::Base.Generator{Vector{Float64}, var"#1#3"})
    @ Base .\array.jl:763
 [14] map(f::Function, A::Vector{Float64})
    @ Base .\abstractarray.jl:3285
 [15] top-level scope
    @ In[3]:3

I tried a brute force approach:

function F(f::SklarDist,v::Vector{Float64})
    a = pdf(f,v)
    if isnan(a)
        a = 0.
    end
    return a
end
N = 1000
Zs = chebpoints(N, -10, 60)
ConvD = map(Zs) do z
    quadgk(x->F(D1,[x,z-x]), -200., 200., rtol=1e-6)[1]
end
c = chebinterp(ConvD, -10, 60)

But am still getting some off results:

plot(scatter(;x=Zs,y=c.(Zs)))

Just a cross-ref to the issues, for the archimedeans:

and another one for the gaussians:

The archimedeans looks like they are fixed, waiting for comment from Op, while the gaussians are a bit trickier due to the position of the relavant code (Distributions.jl and/or PDMats.jl)