Avoiding allocations when normalizing a vector


Given that calling function

function integrand(r, rp)
    diff = MArray(r - rp)
    return normalize!(diff)
end

as in

r = Point3D(0.,0,0); rp = Point3D(5.,0,0)
@btime integrand(r,rp) 

results in one (single) allocation, I wonder how to avoid allocations in calling this same function multiple times. Currently

function repeatedIntegrand(N)
    rp = Point3D(5., 0, 0)
    r  = Point3D(0., 0, 0)
    for i = 1:N
        integrand(r, rp)
    end
end

results in N allocations. This makes numerical integration using this integrand slow. Thanks!

Don’t use an MArray? Why not just do integrand(r, rp) = normalize(r - rp)?

Note that working with small static arrays like this doesn’t allocate anything on the heap. You can freely create “new” static arrays, just as you don’t worry about creating “new” numbers when you compute something like x + 1 for scalars. You don’t need “in-place” operations on an SVector, in the same way that you don’t need “in-place” operations on scalars.
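A quick way to check this (a sketch, assuming `Point3D` is an alias for `SVector{3,Float64}` as defined later in this thread):

```julia
using LinearAlgebra, StaticArrays

const Point3D = SVector{3,Float64}
integrand(r, rp) = normalize(r - rp)

const r  = Point3D(0, 0, 0)
const rp = Point3D(5, 0, 0)

integrand(r, rp)  # warm up so compilation isn't measured
# SVectors are immutable isbits values that live on the stack,
# so computing a "new" one allocates nothing on the heap:
println(@allocated integrand(r, rp))
```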

Many thx once again.

Proceeding as you suggest results in

function integrand(r, rp)
    return normalize(r - rp)
end
r = Point3D(0.,0,0); rp = Point3D(5.,0,0)

@btime hcubature(rp->integrand(r, Point3D(rp[1],rp[2],0)), (0,0), (1,1))[1]

with output

2.024 ms (59838 allocations: 2.02 MiB)

Is my limited understanding correct that you consider this to be harmless?

What is a good way to profile code such that harmless and harmful allocations are distinguished?

You are benchmarking without interpolating the arguments into @btime, which usually adds a few allocations (these would not be there in real use cases).

instead try

@btime hcubature($(rp->integrand(r, Point3D(rp[1],rp[2],0))), $((0,0)), $((1,1)))[1]

to be safe.

I think there will probably still be some allocations due to logic in the cubature routine; those probably cannot be avoided.
But I think they are not critical to performance. If they were, you would usually see some GC time being reported there.
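One way to see where that GC time shows up (a sketch using BenchmarkTools; `alloc_heavy` is just a made-up allocating workload): @benchmark records GC time per sample, so you can check whether allocations actually translate into collection pauses:

```julia
using BenchmarkTools

# A deliberately allocating workload: each call heap-allocates a vector.
alloc_heavy(n) = sum(rand(n))

t = @benchmark alloc_heavy(10_000)
est = minimum(t)  # fastest sample, a TrialEstimate
println("time = ", est.time, " ns, gctime = ", est.gctime, " ns")
# If gctime stays at or near zero, the allocations are not forcing
# garbage collections and are unlikely to dominate the runtime.
```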


The alternative call

@btime hcubature($(rp->integrand(r, Point3D(rp[1],rp[2],0))), $(0,0), $(1,1))[1]

results in

2.022 ms (59838 allocations: 2.02 MiB)

i.e., same numbers as before.

Where do you expect numbers for GC time to show up?

Harmless or not, it’s not under your control: your integrand itself should no longer be allocating (as you can easily verify), so I assume the allocations are coming from hcubature (which maintains a priority queue of subregions to integrate, and requires allocations as it refines). See below; you still have allocations from a non-constant global.

(There are probably some untapped opportunities in hcubature to reduce allocations or to support preallocated data structures, but it requires getting into the guts of the algorithm.)

However, as I’ve commented on other threads, be careful that this sort of integrand may have a singularity depending on the value of r, which could cause quadrature convergence to be very slow (requiring lots of function evaluations). There are various techniques (many from the integral-equation community) to remove integrable singularities by a transformation of the integrand and/or a change of variables. See e.g. Numerical integration over 3D domain of vector-valued function - #10 by stevengj

That’s not enough, because your variable r is still a non-constant global variable.

If you put this all into a function it should be better:

using LinearAlgebra, StaticArrays, HCubature, BenchmarkTools

const Point3D = SVector{3,Float64}
integrand(r, rp) = normalize(r - rp)
foo(r) = hcubature(rp->integrand(r, Point3D(rp[1],rp[2],0)), (0,0), (1,1))[1]


julia> @btime foo($(Point3D(0,0,0)));
  67.375 μs (5 allocations: 65.92 KiB)

for me.

(Using your definition of Point3D from your other thread, without which your code is not runnable. Please try to make self contained runnable examples in your posts.)


Yes, true, see this as well.

Yes, true, I briefly looked into the scuff-EM/SingularIntegrals code.

Is it wise to develop a Julia/Rust/otherwise wrapper around this code allowing its re-use?

Thank you so much.

scuff-EM isn’t actively maintained at this point, so I’m not sure it’s worth writing Julia wrappers unless you want to take on maintenance of the underlying C++ library.

On the other hand, I’d love to see a Julia implementation of the underlying generalized Taylor–Duffy singularity subtraction algorithm, using e.g. Symbolics.jl instead of Mathematica (as in the paper) for code generation.


It is my understanding that this requires the following two ingredients:

1/ an analytical part to handle the singularity of the integrand, using e.g. Symbolics.jl as you suggest;

2/ a numerical part that can handle (given integral) - (analytical approximation of the singular part);

Is this a good overview? Do I miss essential parts?

Could https://docs.sciml.ai/Symbolics/stable/examples/perturbation/ serve as ingredient for part 1?

Essentially, though part (2) is any cubature routine, so it’s already available. The key part is the analytical transformation of the integral to something nonsingular.
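A minimal 1D illustration of the subtraction idea (a hypothetical toy example, not the generalized Taylor–Duffy scheme): to integrate f(x)/√x over (0, 1], split off the leading singular term f(0)/√x, integrate it analytically (it equals 2f(0)), and hand the now-regular remainder to an off-the-shelf quadrature routine such as QuadGK:

```julia
using QuadGK

f(x) = cos(x)  # any smooth function; f(x)/sqrt(x) blows up at x = 0

# ∫₀¹ f(x)/√x dx = ∫₀¹ (f(x) - f(0))/√x dx + f(0) * ∫₀¹ x^(-1/2) dx
# The first integrand vanishes at 0 (regular), and the second
# integral is exactly 2, so the singular piece contributes 2*f(0).
regular(x) = x == 0 ? 0.0 : (f(x) - f(0)) / sqrt(x)

I_reg, _ = quadgk(regular, 0, 1)
result = I_reg + 2 * f(0)
```

The same principle, subtracting an analytically integrable approximation of the singularity and integrating the smooth remainder numerically, is what the Taylor–Duffy machinery automates for the singular integrals arising in surface integral equations.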
