Performance with nested QuadGK integrals

I encountered a performance issue whilst working. I’ve managed to produce a minimal example of my code that reproduces a performance issue. I have a file bug.jl:

import QuadGK

function inner_integrand()
    function integrand(t :: Float64)
        return 0.1
    end
    term1 = QuadGK.quadgk(integrand, 1, 2)[1]
    return term1
end

function outer_integrand()
    p = inner_integrand()
    return 0.5
end

function evaluate_double_integral()
    function integrand(theta_0)
        return outer_integrand()
    end
    return QuadGK.quadgk(integrand, 0, 0.1)[1] 
end

function compute_solution()
    println("Inside computation function")
    integro_diff_mtrx = 
        map(
            in->evaluate_double_integral(),
            [0]
        )
    println("Finished computation function")
    return
end    

I can use this with a script, testa.jl:

include("bug.jl")
println("Before computation function")
compute_solution()
println("After computation function")

I can run this from the command line: julia.exe testa.jl.
It takes a whole minute and 21 seconds to run.

Oddly, I can make it run much faster by making evaluating the inner integrand once before we attempt the double integral: testb.jl:

include("bug.jl")
println("Evaluating a function pointlessly.")
inner_integrand()
println("Before computation function")
compute_solution()
println("After computation function")

Now it takes 1.6 seconds. Much faster (but still surprisingly slow?).

Does this count as a bug, or is there something I’m missing?
I’m using Windows 10 x64. Issue in Julia 1.0.3.

I haven’t had a chance to look at your particular case, but I should say that in general I recommend against doing nested calls to any adaptive 1d quadrature routine. You can easily end up wasting lots of time refining one of the inner integrals even though it makes a negligible contribution to the overall 2d integral. You are usually better off with a 2d quadrature routine like in the HCubature package.

You’re probably right, but for now the code that this is based on is easier to read with nested quadratures. I’ll be sure to look at HCubature if I optimization becomes necessary. But I’m more interested / surprised about the reasoning for the massive difference in performance. The Julia runtime spends a long time working before it even prints “Inside computation function”.

As an aside, I am finding your exponential integral implementation useful. Thanks!