I’ve run into a kind-of counterintuitive behavior when coding some example. I have an array
A_field, which I want to update by adding certain values in certain positions. These are provided by a an array of arrays of (flattened) indices
Part and an array of arrays of values
A, as follows:
N = 10 shapeA = (N, N) A_field = zeros(shapeA) Part = [[i, i+1] for i in 1:2:prod(shapeA)] A = [rand(length(J)) for J in Part]
You may want to think of this as having split some big problem for
A_field into small chunks, solved them independently and then wanting to join the results.
The most straightforward implementation of the updating I can think of is the following:
function update1!(A_field, Part, A) for (i, J) in enumerate(Part) A_field[J] += A[i] end end
But this strangely results in as many allocations as entries of
julia> @time update1!(A_field, Part, A) 0.000010 seconds (100 allocations: 9.375 KiB)
Which of course is not nice for big problems. (Disclaimer: I used all combinations of
.+= instead of
+= and so on I could think of, and this is the one that allocated the least). A second possibility is to introduce a second loop over each
function update2!(A_field, Part, A) for (k, J) in enumerate(Part) for (i, j) in enumerate(J) A_field[j] += A[k][i] end end end
Which is slightly more cumbersome to read and understand. This of course introduces no allocations:
julia> @time update2!(A_field, Part, A) 0.000006 seconds
Finally, reshaping the array and using views seems to get rid of the allocations (apart from the ones needed for the reshaping, of course):
function update3!(A_field, Part, A) A_temp = reshape(A_field, :) for (i, J) in enumerate(Part) @views A_temp[J] .+= A[i] end end
julia> @time update3!(A_field, Part, A) 0.000009 seconds (2 allocations: 80 bytes)
So at this point I am a bit confused and have a lot of questions.
- Why does the 3rd version work well and not the 1st one?
- Is there any way (macro, trick) to obtain the performance of the 2nd version without sacrificing the readability of the 1st version?