I designed and implemented a method where the core task is to evaluate some 2D integrals using Gauss-Legendre methods in Julia. For practical reasons, the final code has to be in Java; the translation into Java is about 2.5-3 times faster without any effort at optimization.

I spent some time examining the Julia implementation to find bottlenecks, and the most obvious sign of poor design/performance is that benchmarking shows large number of allocations and a corresponding large memory usage (about 1GB compared to a data array that is only 61 by 61, with Float64 elements).

Iāve isolated a couple of the core functions to create an MWE; the times and memory usage are very small, but the functions are called many times in the real application and *I think* what is happening is that every time I call these functions there are repeated allocations.

For this reason, Iām looking for advice on how to address this. Any tips would be greatly appreciated. The key point is the that integrand function only has 6 allocations, but the integrator incurs 40 - Iād like to reduce that number.

In the code examples below, I have a function `integrator`

that calls a function `f```` that evaluates the integrand. To simplify setting variables, I also have a file`

setVals.jl``` that allows setting the parameters to benchmark the functions. So, my steps are:

`include("setVals.jl")`

to set variables to benchmark functions. Then, I

```
@benchmark f($Mmatrix,$x,$yVec,$c,$d,$wInt)
BenchmarkTools.Trial:
memory estimate: 832 bytes
allocs estimate: 40
--------------
minimum time: 765.275 ns (0.00% GC)
median time: 802.927 ns (0.00% GC)
mean time: 942.206 ns (10.27% GC)
maximum time: 435.468 Ī¼s (99.77% GC)
--------------
samples: 10000
evals/sample: 109
```

```
@benchmark integrator($a,$b,$y1,$y2,$x,$Mmatrix,$c,$d)
BenchmarkTools.Trial:
memory estimate: 5.77 KiB
allocs estimate: 243
--------------
minimum time: 5.366 Ī¼s (0.00% GC)
median time: 5.651 Ī¼s (0.00% GC)
mean time: 7.423 Ī¼s (19.29% GC)
maximum time: 8.354 ms (99.86% GC)
--------------
samples: 10000
evals/sample: 6
```

Here are the contents of the file setting the two functions:

```
StaticArrays
function integrator(a::Float64,b::Float64,y1::Float64,y2::Float64,x::Array{Float64,1},M::SMatrix{3,3,Float64},
c::Float64,d::Float64)
limitDiff = (b-a)/2.0
limitAvg = (b+a)/2.0
integrand = map(theZ->f(M,x,[y1,y2,limitDiff*theZ+limitAvg],c,d,wInt), zInt)
limitDiff*(wInt'*integrand)
end
function f(M::SMatrix{3,3,Float64},x::Array{Float64,1},y::Array{Float64,1},c::Float64,d::Float64,wInt::SVector{5,Float64})
if ((x[1]-y[1])^2 + (x[2]-y[2])^2)<Base.eps(x[1])
Īø=0.0
else
Īø=atan(((x[1]-y[1])^2 + (x[2]-y[2])^2)^0.5, (x[3]-y[3]))
end
if abs( x[2]-y[2] )<Base.eps(x[1])
Ļ=0.0
else
Ļ = atan(x[2]-y[2], x[1]-y[1])
end
Ī³ = @SVector [sin(Īø)cos(Ļ), sin(Īø)sin(Ļ) , cos(Īø)]
theVal = ( (-2 + 15Ī³[1]^2 - 15Ī³[1]^4 + (3-4d)*(1-3Ī³[1]^2))*M[1,1]
+ (6Ī³[1]Ī³[2] - 3(3-4b)Ī³[1]Ī³[2] - 15Ī³[1]^3*Ī³[2])*M[1,2]
+ (6Ī³[1]Ī³[3] - 3(3-4b)Ī³[1]Ī³[3] - 15Ī³[1]^3*Ī³[3])*M[1,3]
+ (9Ī³[1]Ī³[2] - 15Ī³[1]^3*Ī³[2])*M[2,1]
+ (-1 + 3Ī³[1]^2 + 3Ī³[2]^2 - 15Ī³[1]^2*Ī³[2]^2)*M[2,2]
+ (3Ī³[2]Ī³[3] - 15Ī³[1]^2*Ī³[2]*Ī³[3])*M[2,3]
+ (9Ī³[1]Ī³[3] - 15Ī³[1]^3*Ī³[3])*M[3,1]
+ (3Ī³[2]Ī³[3] - 15Ī³[1]^2*Ī³[2]Ī³[3])*M[3,2]
+ (-1 + 3Ī³[1]^2 + 3Ī³[3]^2 - 15Ī³[1]^2*Ī³[3]^2)*M[3,3] )
# avoid singularity where output point is on fracture (or too close).
rCubed = r(x,y)^3
if( rCubed < 10^-5 )
rCubed=10^-5
end
return(theVal/(16*Ļ*c*(1-d)*rCubed))
end
function r(x::Array{Float64,1}, Ī¾::Array{Float64,1})
((x-Ī¾)'*(x-Ī¾))^0.5
end
```

Here are the contents of the `setVals.jl`

file:

```
using StaticArrays
using FastGaussQuadrature
a = -15.0
b = -14.0
x=[1.,1.,0.]
Mmatrix = @SMatrix [10.0e9 0 0; 0 0 0; 0 0 0]
zInt0,wInt0=gausslegendre(5)
zInt=@SVector [zInt0[i] for i in 1:5]
wInt=@SVector [wInt0[i] for i in 1:5]
c=1.3e9
d=0.25
y1=-15.0
y2=-2.
yVec=[y1; y2; -15.0]
```