Unclear allocations with in-place mutating functions

I am fitting a statistical model and I refactored the code to reduce the allocations it generates. I morally halved them with respect to the first version, but there are still some unclear sections, which I see when running my code with the --track-allocation=user option, where I can't understand why they allocate.

The sections involved are the following. In the fitting function:

        - 		# lC = @MVector zeros(2) # prellocated before this loop
        0 		for k in 1:nclus_red
        - 			# filter the covariates of the units of label k
        0 			aux_idxs = findall_faster(jj -> Si_red[jj] == k, 1:n_red)
        0 			if sPPM
        0 			    copy!(s1o, sp1_red[aux_idxs])
        0 				copy!(s2o, sp2_red[aux_idxs])
        0 				copy!(s1n,s1o); push!(s1n, sp1[j])
        0 				copy!(s2n,s2o); push!(s2n, sp2[j])
  1012512 			spatial_cohesion!(spatial_cohesion_idx, s1o, s2o, sp_params_real, true, M_dp, S,1,false,lC)
  1012512 			spatial_cohesion!(spatial_cohesion_idx, s1n, s2n, sp_params_real, true, M_dp, S,2,false,lC)
        - 			end

In the utils file:

        - function spatial_cohesion!(idx::Real, s1::AbstractVector{Float64}, s2::AbstractVector{Float64}, sp_params::Vector, lg::Bool, M::Real, S=@MMatrix(zeros(2, 2)), case::Int=1, add::Bool=false, lC=@MVector(zeros(2)))
        0 	if idx==1.0 cohesion1!(s1,s2,sp_params[1],lg,M,case,add,lC); return; end
        0 	if idx==2.0 cohesion2!(s1,s2,sp_params[1],lg,M,case,add,lC); return; end
  4525664 	if idx==3.0 cohesion3!(s1,s2,sp_params[1],sp_params[2],sp_params[3],sp_params[4],lg,M,S,case,add,lC); return; end
        0 	if idx==4.0 cohesion4!(s1,s2,sp_params[1],sp_params[2],sp_params[3],sp_params[4],lg,M,S,case,add,lC); return; end
        0 	if idx==5.0 cohesion5!(s1,s2,sp_params[1],lg,M,case,add,lC); return; end
        0 	if idx==6.0 cohesion6!(s1,s2,sp_params[1],lg,M,case,add,lC); return; end
        - end

        - function cohesion3!(s1::AbstractVector{Float64}, s2::AbstractVector{Float64}, mu_0::AbstractVector{Float64}, k0::Real, v0::Real, Psi::AbstractMatrix{Float64}, lg::Bool,  M::Real=1.0, S=@MMatrix(zeros(2, 2)), case::Int=1, add::Bool=false, lC=@MVector(zeros(2)))::Float64
        0 	sdim = length(s1)
        - 	# Compute sample means
        0 	sbar1 = mean(s1)
        0 	sbar2 = mean(s2)
        - 	# Compute deviations from the sample mean
        0 	S .= 0.
        0 	@inbounds for i in 1:sdim
        0 		s_sbar1 = s1[i] - sbar1
        0 		s_sbar2 = s2[i] - sbar2
        - 
        0 		S[1, 1] += s_sbar1 * s_sbar1
        0 		S[2, 2] += s_sbar2 * s_sbar2
        0 		S[2, 1] += s_sbar1 * s_sbar2
        0 	end
        0 	S[1, 2] = S[2, 1] # to avoid repeating computations
        - 	# Updated parameters for cohesion 3
        0 	kn = k0 + sdim
        0 	vn = v0 + sdim
        - 
        - 	sbar = SVector((sbar1, sbar2))
        0 	auxvec1 = sbar .- mu_0
        0 	auxmat1 = auxvec1 * auxvec1'
        - 
        0 	auxconst1 = k0 * sdim
        0 	auxconst2 = k0 + sdim
        0 	Psi_n = Psi .+ S .+ auxconst1 / (auxconst2) .* auxmat1
        - 	
        0 	out = -sdim * logpi + G2a(0.5 * vn, true) - G2a(0.5 * v0, true) + 0.5 * v0 * logdet(Psi) - 0.5 * vn * logdet(Psi_n) + log(k0) - log(kn)
        0 	if add
        0 		lC[case] += lg ? out : exp(out)
        - 	else
        0 		lC[case] = lg ? out : exp(out)
        - 	end
        - end

        - function G2a(a::Real, lg::Bool)
        - 	out = logpi + lgamma(a) + lgamma(a - 0.5)
        - 	return lg ? out : exp(out)
        - end

I don't provide a minimal working example since it's a bit complex to simulate, and since I believe/hope it's just a theoretical/syntactical detail that I missed (e.g. maybe the overhead of the intermediate function call, and not cohesion3! directly). Since I am working with in-place mutating functions, I expected to have no memory issues.

I’m not certain what the above line does, precisely, but I guess it returns a vector, which must be allocated.

This creates a new vector, which must be allocated. You can use view to avoid the allocation.
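To illustrate with a small made-up sketch (not your actual variables): indexing with a vector of indices copies the data into a new array, while @view just wraps the parent.

# Small made-up sketch: fancy indexing copies, @view does not.
x    = rand(100)
idxs = findall(>(0.5), x)

slice = x[idxs]         # allocates a new Vector{Float64} holding a copy
win   = @view x[idxs]   # a SubArray referencing x, no copy of the data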

push! will normally allocate.
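A possible alternative, as a rough sketch with placeholder names (not the actual variables from your model): keep one preallocated buffer and resize! it instead of doing copy! + push! on every iteration.

# Rough sketch with placeholder names: resize! a reusable buffer; it only
# allocates when the buffer has to grow beyond its current capacity.
s_old = rand(10)                        # stands in for the filtered coordinates
s_new = Float64[]                       # preallocated once, outside the loop
extra = 3.14                            # stands in for sp1[j]

resize!(s_new, length(s_old) + 1)       # cheap once capacity is large enough
copyto!(s_new, 1, s_old, 1, length(s_old))
s_new[end] = extra                      # the appended element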

The rest of your code was a bit difficult to judge on my phone; I could not spot any obvious allocations happening there.

This line allocates a new matrix, it seems.
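If this is about a broadcast like Psi_n = Psi .+ S .+ auxconst1 / auxconst2 .* auxmat1, the right-hand side materializes a brand-new matrix on every call. A rough sketch of an in-place alternative, using plain 2×2 arrays and made-up values just to show the pattern:

# Rough sketch: the first form builds a new matrix on every call, the second
# writes into a preallocated buffer with a fused in-place broadcast.
Psi, S, auxmat1 = rand(2, 2), rand(2, 2), rand(2, 2)
c = 0.5

Psi_n_alloc = Psi .+ S .+ c .* auxmat1   # allocates a new Matrix each time

Psi_n = zeros(2, 2)                      # preallocated once, reused
@. Psi_n = Psi + S + c * auxmat1         # in-place, no new matrix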

This one, yes, it returns a vector of indexes. I just use findall_faster and not findall since I found a quicker implementation (for my case) in a post on this forum.

For this one, sp1_red is already a view; I defined it above as sp1_red = @view sp1[indexes]. But yes, I could try to avoid the copy! on s1o and s2o and just pass the view to the function.

Then I guess that the allocations indicated next to the spatial_cohesion! calls are maybe the summary of the allocations that happened above, for the findall and the push!?

All the matrices and vectors there are from StaticArrays, as are the arguments passed (except for s1 and s2), so maybe the compiler finds a way to avoid allocating them, as I gathered from this test:

julia> @btime cohesion3!($s1,$s2,$(sp_params[1]),$(sp_params[2]),$(sp_params[3]),$(sp_params[4]),$lg,$M,$S,$case,$add,$lC)
  195.930 ns (0 allocations: 0 bytes)

That doesn't help: a slice of a view will allocate. You must make a view of the view.
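Roughly like this (a small sketch with made-up data, not your actual arrays):

# Small sketch: slicing a view with an index vector still copies;
# a view of the view avoids the copy.
sp1      = rand(100)
sp1_red  = @view sp1[1:50]                # already a view, as in your code
aux_idxs = findall(>(0.5), sp1_red)

copied = sp1_red[aux_idxs]                # allocates: a slice of a view copies
nested = @view sp1_red[aux_idxs]          # a view of the view, no copy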


To clarify a couple of points:

  • the function_name! naming scheme to denote in-place functions is purely a convention, not enforced by the compiler or anything
  • the fact that a function operates in-place on some of its arguments means just that: it's a visual warning that some of the arguments are (supposed to be) modified in place, but the function may or may not need to allocate extra memory internally. That said, many in-place functions effectively don't allocate extra memory because they manage to do everything with the input arguments, but others may still need to allocate (see the sketch below).
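A tiny made-up example of that distinction (not your functions): both of these mutate out in place, but only the second one is allocation-free inside.

# Both functions mutate `out` in place, but the first still allocates a
# temporary array internally, while the second fuses everything in place.
function scaled_sum!(out, x, y)
    tmp = 2 .* x .+ y        # allocates a temporary despite the `!` name
    copyto!(out, tmp)
    return out
end

function scaled_sum_noalloc!(out, x, y)
    @. out = 2 * x + y       # fused in-place broadcast, no temporary
    return out
end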

About the exclamation mark, I knew it alone can't do anything magic. About the second point, yes, maybe I expected that operating in-place would remove all allocations, but indeed if there are auxiliary computations they could require some memory. So I think I can be satisfied with the current implementation.

This is not an answer to your question, but anyway: why is idx (that is, spatial_cohesion_idx) a Real? If it is an index, shouldn't it be an Int (and then compare idx==1, idx==2, …)? If it is truly supposed to be a real number (represented as a float), exact comparison (idx==1.0) is suspicious.

Yes, it should be an Int, but I am writing a fitting function in Julia which is actually intended to be used from R, through the JuliaConnectoR library. That library morally converts all numerical inputs to Float64 (well, it can convert R integers to Julia Ints, but when I write a number in R it is treated as a double by default, so I would have to write the conversion to integer explicitly in R to make the input arrive as an Int in Julia). But yes, maybe I should impose that conversion on the R side, as it could be more efficient than converting in Julia.

# R console
> x=2
> typeof(x)
[1] "double"
> typeof(as.integer(x))
[1] "integer"

So I wrote Real to make it work from R with this conversion behaviour, while also keeping it working from Julia with the more natural Ints.
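Or I could convert once at the Julia boundary, something like this (just a sketch, with a hypothetical entry-point name):

# Sketch with a hypothetical entry point: accept whatever R sends, convert
# once, and let the inner functions work with a plain Int.
function fit_entry(spatial_cohesion_idx::Real, args...)
    idx = Int(spatial_cohesion_idx)   # errors if the value is not integer-valued
    # ... pass idx::Int down and compare with idx == 3, etc.
    return idx
end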

I just now noticed logpi. Is it a (non-const) global variable? Because that could be causing allocations.
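For example, a rough sketch of the difference (with separate names, not your logpi):

# Non-const global: its type can change at any time, so code using it is
# type-unstable and may box values / allocate. Const global: known Float64.
logpi_nonconst = log(π)
const logpi_known = log(π)

f_slow(x) = x + logpi_nonconst   # works through an Any-typed global binding
f_fast(x) = x + logpi_known      # compiles to a plain Float64 addition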

Sorry for going off topic, but I am not familiar with this usage of the word ‘morally’, and it’s slightly confusing. Did you mean to use a different word?

logpi is indeed a global but constant. I wrote this at the beginning of my utils.jl file:

const logpi = log(π)
const log2pi = log(2*π)

About morally, yes, sorry, it's just the way I commonly express myself :slight_smile: it was just to say "approximately". Like, judging from my tests (of which, however, I did not keep precise notes, so I can't be more precise), I halved the allocations. Or, about the numerical inputs, in practice they always get converted to Floats unless the user does some extra work (e.g. writing the conversion explicitly).

That should be OK then 👍

As for 'morally', that refers to judgements about right and wrong, good and bad, ethical questions. I would suggest "approximately/roughly" and "normally" instead :wink:

FWIW, this is pretty standard use in my parts of maths. Stuff is "morally convex / linear / whatever" if you want to express something intentionally vague, since you don't want a name clash with precisely defined notions near your context, like e.g. pseudo / quasi / semi / predominantly / generically / virtually / almost / whatever (with the added connotation that one would naively have plausible intuitive reasons for expecting the thing to hold with some minor qualifications).

I think “kinda linear” would also be very clear and precise intentionally vague communication that has no risk of being overloaded with precise definitions, but some people consider that unprofessional language.

Cf. e.g. meaning 2 here, which suggests "virtually" as a synonym. But in CS-related contexts, "virtually" would be a very bad word choice, due to the many precise technical terms it would clash with.


Huh. I thought it had to be a flat-out mistake by a non-native speaker. Sorry, @fmor!

In my defence, I very rarely come across usages that I’ve never heard at all.
