Most performant way of updating vector values?

Gaussia · September 11, 2023, 6:39pm

Several times in my code, I will update the values of a vector. I want to avoid allocations and other overhead, and make it as performant as possible. At the beginning the vector is preallocated with n dummy values, and then I will update it. What is the best way?

E.g., I have:

n = 100
work_values = zeros(n)
function new_val(i)
    return 2i+5
end

and I now want to set each of the n values to the value according to i. Doing either:

work_values = map(i -> new_val(i), 1:n)
work_values = [new_val(i) for i in 1:n]
work_values .= new_val.(1:n)

I am afraid might cause some problem due to allocation/reallocation, but I am really not very certain about this. Alternatively, I could try:

foreach(i -> work_values[i] = new_val(i), 1:n)

but at some point, someone told me that foreach could cause problems with performance. Finally, maybe

for i in 1:n
    work_values[i] = new_val(i)
end

would be the best, but putting in a full for loop also feels a bit excessive.

(After creating the initial work_values vector, I will update its content many times, each time with different values, not just with what new_val gives in this example.)

martin.d.maas · September 11, 2023, 6:57pm

The for loop will work without allocations, for sure.
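
A minimal sketch of the loop version, wrapped in a function so the same preallocated vector can be refilled with different update rules. `fill_values!` is a hypothetical name chosen for illustration, not something from Base:

```julia
new_val(i) = 2i + 5

# Hypothetical helper: overwrite every element of `work_values` with f(index).
# Taking `f` as an argument lets the same loop serve many different updates.
function fill_values!(f, work_values)
    for i in eachindex(work_values)
        work_values[i] = f(i)
    end
    return work_values
end

work_values = zeros(100)            # preallocated once
fill_values!(new_val, work_values)  # later updates reuse the same memory
```

Putting the loop inside a function also keeps it out of global scope, which matters for performance in Julia.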

work_values .= new_val.(1:n)

Are you sure this pattern allocates? Note that if the right-hand side is something more complex than 1:n, that expression could be what triggers the allocations, not the broadcasted assignment (.=) itself.

using BenchmarkTools

function set_values!(a,b)
    b .= a
end

a = LinRange(0,1,100)
b = zeros(100)

@btime set_values!(a,b)
128.973 ns (0 allocations: 0 bytes)
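
To illustrate the point about a more complex right-hand side, here is a sketch with two hypothetical functions (`fused!` and `partly_fused!` are names made up for this example). In the second one, `cumsum(v)` is an ordinary, undotted call, so it builds a temporary array before the broadcast starts, even though `.=` still writes in place:

```julia
new_val(i) = 2i + 5

# Every call on the right-hand side is dotted, so the broadcast fuses
# into a single loop that writes straight into `w`.
fused!(w, v) = (w .= new_val.(v); w)

# `cumsum(v)` is not dotted: it runs first and returns a new array,
# which is then broadcast into `w`. The temporary is the allocation.
partly_fused!(w, v) = (w .= new_val.(cumsum(v)); w)

w = zeros(3)
a = copy(fused!(w, 1:3))
b = copy(partly_fused!(w, 1:3))
```

Benchmarking both with `@btime fused!($w, $v)` and `@btime partly_fused!($w, $v)` should make the difference visible in the reported allocation count.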

lmiq · September 11, 2023, 6:59pm

Note that the map and array-comprehension versions are not updating the values in place; they create new arrays and rebind work_values to them. In the case of mapping, you want to use map!:

julia> x = [1,2,3];

julia> map!(i -> 2i, x, 1:3);

julia> x
3-element Vector{Int64}:
 2
 4
 6
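
A small sketch of the difference, using a second name (`alias`, introduced only for this example) to check whether the original array was touched:

```julia
x = [1, 2, 3]
alias = x                    # second name bound to the same array

y = map(i -> 2i, 1:3)        # builds and returns a new array
x_after_map = copy(x)        # x still holds its old values here

map!(i -> 2i, x, 1:3)        # writes the results into x itself
still_same_array = (alias === x)
```

After `map`, `x` is unchanged and `y` is a separate array; after `map!`, the existing array holds the new values and `alias` sees them too.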

Gaussia · September 11, 2023, 6:59pm

I had someone cautioning me on using broadcasting, but again, it might be dependent on the setting and fine in general. If it works, something like work_values .= new_val.(1:n) seems good.

Gaussia · September 11, 2023, 7:00pm

Got it, so using map! might be good? Or would I have the same problems as I (might) have when using an anonymous function in the foreach loop?

lmiq · September 11, 2023, 7:03pm

map! is fine. The broadcasting is also fine. The foreach can be problematic in the global scope, because then you are modifying a global variable, but within a function it is also fine:

julia> updatex(f,x,r) = foreach(i -> x[i] = f(i), r)
updatex (generic function with 1 method)

julia> x = [1,2,3];

julia> @btime updatex(i -> 2i, $x, 1:3)
  1.847 ns (0 allocations: 0 bytes)

julia> x
3-element Vector{Int64}:
 2
 4
 6

(but map! and the broadcasting are by far more idiomatic for this)

Gaussia · September 11, 2023, 7:05pm

Got it, thanks a lot for the help

DNF · September 11, 2023, 7:15pm

Another vote for broadcasting here. It’s both performant and the most elegant and idiomatic. I wonder why someone cautioned you against it.

Gaussia · September 11, 2023, 7:18pm

I think it might have been something else in that situation that made broadcasting problematic, and I erroneously overinterpreted it as a reason to be cautious in general.

abraemer · September 13, 2023, 5:39am

Maybe it’s because the left-most “.” (in .=) is doing a lot of heavy lifting here but is very inconspicuous. Omitting it gives the same values, but allocates a new array and rebinds the name instead of writing into the existing one, which might be the reason people think broadcasting leads to allocation. FWIW, I remember 2 topics here on Discourse where someone forgot/didn’t think about that dot and asked for performance help.
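
A sketch of how the missing dot plays out inside a function. Both function names are hypothetical, and `new_val` is the function from the original question:

```julia
new_val(i) = 2i + 5

# Without the dot on `=`: the right-hand side allocates a new array and the
# local name `w` is rebound to it. The caller's buffer is never written to.
function update_rebind(w, n)
    w = new_val.(1:n)
    return w
end

# With `.=`: the values are written into the array the caller passed in.
function update_inplace!(w, n)
    w .= new_val.(1:n)
    return w
end

buf = zeros(Int, 3)
out = update_rebind(buf, 3)
buf_after_rebind = copy(buf)      # still all zeros
out2 = update_inplace!(buf, 3)    # buf now holds the new values
```

Both functions return the same values, so the mistake is easy to miss; only the in-place version reuses the preallocated buffer.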