Hello,
I’m writing a function whose performance suffers because of a nested for-loop. Below is a simplified version of that function.
The function f()
takes no arguments, and instantiates vecvec
a vector of vectors of Int64
, all of the same size. Then, looping from i = 1
to 5
, for each element of vecvec
an operation that involves a slice of that element, the i
index and a random component is performed. If the result of such operation is positive, the result is added to the element of vecvec
, else the element must be removed from vecvec
.
Here is the code:
function f()
vecvec::Vector{Vector{Int64}} = [collect(i:(i+10)) for i in 1:10000]
to_be_removed = Set{Int64}()
for i in 1:5
for (j,arr) in enumerate(vecvec)
diff::Int64 = rand(Int64) + sum(arr[i:i+5]) - i
diff>0 ? push!(arr,diff) : push!(to_be_removed,j)
end
vecvec = [arr for (k,arr) in enumerate(vecvec) if k ∉ to_be_removed]
empty!(to_be_removed)
end
return vecvec
end
Current performance is:
julia> using BenchmarkTools
julia> @btime f();
1.600 ms (33882 allocations: 5.39 MiB)
Would you know any way to make it faster ?
Improvements
- @lmiq 's suggestion helped bring it down to:
julia> @btime f();
1.096 ms (14871 allocations: 3.07 MiB)
- @mcabbott 's suggestion, that I interpreted this way:
function f()
vecvec::Matrix{Int64} = hcat([collect(i:(i+10)) for i in 1:10000]...)
to_be_kept = Int64[]
new_row = Int64[]
for i in 1:5
for (j,arr) in enumerate(eachcol(vecvec))
diff::Int64 = rand(Int64) + sum(@view(arr[i:i+5])) - i
if diff > 0
push!(new_row,diff)
push!(to_be_kept,j)
end
end
vecvec = [@view(vecvec[:,to_be_kept]) ; new_row']
empty!(to_be_kept)
empty!(new_row)
end
return vecvec
end
Leads to:
julia> @btime new_f();
1.062 ms (10065 allocations: 3.90 MiB)