Avoiding allocations when normalizing a vector

Domenico_Lahaye · May 6, 2024, 2:49pm

Hello.

Given that calling function

function integrand(r,rp)
    diff = MArray(r-rp)
    return normalize!(diff) 
end

as in

r = Point3D(0.,0,0); rp = Point3D(5.,0,0)
@btime integrand(r,rp)

results in one (single) allocation, I wonder how to avoid allocations in calling this same function multiple times. Currently

function repeatedIntegrand(N)
    rp = Point3D(5.,0,0);
    r  = Point3D(0.,0,0)
    for i=1:N 
        integrand(r,rp)
    end 
end

results in N allocations. This renders numerical integration of the integrand to be slow. Thx!

stevengj · May 6, 2024, 4:21pm

Don’t use an MArray? Why not just do integrand(r, rp) = normalize(r - rp)?

Realize that working with small static arrays like this doesn’t allocate anything on the heap. You can freely create “new” static arrays just like you don’t worry about creating “new” numbers when you compute things like x + 1 for scalars. You don’t need to worry about “in-place” operations on SVector or similar, in the same way that you don’t worry about “in-place” operations on scalars.

Domenico_Lahaye · May 6, 2024, 4:45pm

Many thx once again.

Proceeding as you suggest results in

function integrand(r,rp)
    return normalize(r-rp)
end

r = Point3D(0.,0,0); rp = Point3D(5.,0,0)

@btime hcubature(rp->integrand(r, Point3D(rp[1],rp[2],0)), (0,0), (1,1))[1]

with output

2.024 ms (59838 allocations: 2.02 MiB)

Is my limited understanding correct that you consider this to be harmless?

What is a good way to profile code such that harmless and harmful allocations are distinguished?

Salmon · May 6, 2024, 5:03pm

you are benchmarking without interpolating the arguments which usually allocates a bit (this allocation would not be there in real use cases)

instead try

@btime hcubature($(rp->integrand(r, Point3D(rp[1],rp[2],0))), $((0,0)), $((1,1)))[1]

to be safe.

I think there will probably still be some allocations due to logic in the cubature routine. Probably those cannot be avoided.
But i think they are not critical to performance, usually you would see some GC time being reported there

Domenico_Lahaye · May 6, 2024, 5:13pm

Thx!

The alternative call

@btime hcubature($(rp->integrand(r, Point3D(rp[1],rp[2],0))), $(0,0), $(1,1))[1]

results in

2.022 ms (59838 allocations: 2.02 MiB)

i.e., same numbers as before.

Where do you expect numbers for GC time to show up?

stevengj · May 6, 2024, 5:22pm

Harmless or not, it’s not under your control — your integrand itself should no longer be allocating (as you can easily verify), so I assume the allocations are coming from hcubature (which maintains a priority queue of subregions to integrate, and requires allocations as it refines). See below, you still have allocations from a non-constant global.

(There are probably some untapped opportunities in hcubature to reduce allocations or to support preallocates data structures, but it requires getting into the guts of the algorithm.)

However, as I’ve commented on other threads, be careful that this sort of integrand may have a singularity depending on the value of r, which is could cause quadrature convergence to be very slow (require lots of function evaluations). There are various techniques (many from the integral-equation community) to remove integrable singularities by a transformation of the integrand and/or a change of variables. See e.g. Numerical integration over 3D domain of vector-valued function - #10 by stevengj

stevengj · May 6, 2024, 5:32pm

That’s not enough, because your variable r is still a non-constant global variable.

If you put this all into a function it should be better:

using LinearAlgebra, StaticArrays, HCubature, BenchmarkTools

const Point3D = SVector{3,Float64}
integrand(r, rp) = normalize(r - rp)
foo(r) = hcubature(rp->integrand(r, Point3D(rp[1],rp[2],0)), (0,0), (1,1))[1]

gives

julia> @btime foo($(Point3D(0,0,0)));
  67.375 μs (5 allocations: 65.92 KiB)

for me.

(Using your definition of Point3D from your other thread, without which your code is not runnable. Please try to make self contained runnable examples in your posts.)

Domenico_Lahaye · May 6, 2024, 5:38pm

Yes, true, see this as well.

Domenico_Lahaye · May 6, 2024, 5:45pm

Yes, true, I briefly looked into the scuff-EM/SingularIntegrals code.

Is it wise to develop a Julia/Rust/otherwise wrapper around this code allowing its re-use?

Thank you so much.

stevengj · May 6, 2024, 5:47pm

scuff-EM isn’t actively maintained at this point, so I’m not sure it’s worth writing Julia wrappers unless you want to take on maintenance of the underlying C++ library.

On the other hand, I’d love to see a Julia implementation of the underlying generalized Taylor–Duffy singularity subtraction algorithm, using e.g. Symbolics.jl instead of Mathematica (as in the paper) for code generation.

Domenico_Lahaye · May 6, 2024, 6:01pm

Excellent.

It is my understanding that this requires the following two ingredients:

1/ an analytical part to handle the singularity of the integrand, using e.g. Symbolics.jl as you suggest;

2/ a numerical part that can handle (integral-given) - (intergral-analytical-approximation);

Is this a good overview? Do I miss essential parts?

Could https://docs.sciml.ai/Symbolics/stable/examples/perturbation/ serve as ingredient for part 1?

stevengj · May 6, 2024, 7:08pm

Essentially, though part (2) is any cubature routine, so it’s already available. The key part is the analytical transformation of the integral to something nonsingular.

Topic		Replies	Views
Avoiding memory allocation when repeatedly using small arrays Performance	5	492	July 25, 2021
Assistance in avoiding allocations in summations Performance	8	789	March 18, 2019
How to avoid allocations when multiplying numbers with missings Performance memory-allocation	15	876	February 20, 2021
How To Avoiding Allocations in Static Structs Performance array , memory-allocation , struct	4	416	December 5, 2022
Reducing allocations when subtracting vectors and multiplying by scalars Performance memory-allocation , optimization	9	453	November 9, 2021

Avoiding allocations when normalizing a vector

Related topics