How to improve performance in nested loops

claudio20497 · October 26, 2021, 2:14pm

Hello,

I’m writing a function whose performance suffers because of a nested for-loop. Below is a simplified version of that function.

The function f() takes no arguments, and instantiates vecvec a vector of vectors of Int64, all of the same size. Then, looping from i = 1 to 5, for each element of vecvec an operation that involves a slice of that element, the i index and a random component is performed. If the result of such operation is positive, the result is added to the element of vecvec, else the element must be removed from vecvec.

Here is the code:

function f()

    vecvec::Vector{Vector{Int64}}  = [collect(i:(i+10)) for i in 1:10000]
    to_be_removed = Set{Int64}()

    for i in 1:5

        for (j,arr) in enumerate(vecvec)

            diff::Int64 = rand(Int64) + sum(arr[i:i+5]) - i
            diff>0 ? push!(arr,diff) : push!(to_be_removed,j)

        end

        vecvec = [arr for (k,arr) in enumerate(vecvec) if k ∉ to_be_removed]
        empty!(to_be_removed)

    end

    return vecvec

end

Current performance is:

julia> using BenchmarkTools
julia> @btime f();
  1.600 ms (33882 allocations: 5.39 MiB)

Would you know any way to make it faster ?

Improvements

@lmiq 's suggestion helped bring it down to:

julia> @btime f();
  1.096 ms (14871 allocations: 3.07 MiB)

@mcabbott 's suggestion, that I interpreted this way:

function f()

    vecvec::Matrix{Int64} = hcat([collect(i:(i+10)) for i in 1:10000]...)

    to_be_kept = Int64[]
    new_row = Int64[]
    

    for i in 1:5
        for (j,arr) in enumerate(eachcol(vecvec))

            diff::Int64 = rand(Int64) + sum(@view(arr[i:i+5])) - i

            if diff > 0
                push!(new_row,diff)
                push!(to_be_kept,j)
            end

        end
        vecvec = [@view(vecvec[:,to_be_kept]) ; new_row']
        empty!(to_be_kept)
        empty!(new_row)
    end

    return vecvec

end

Leads to:

julia> @btime new_f();
  1.062 ms (10065 allocations: 3.90 MiB)

lmiq · October 26, 2021, 2:36pm

use a view here: sum(@view(arr[i:i+r])).

try to remove all allocations which are not completely on purpose.

mcabbott · October 26, 2021, 2:46pm

Can you avoid having a vector of vectors, and resizing half of them? Perhaps if these are columns of a matrix, and you keep track of which to keep, and what to append, then at the end you can make a new matrix which has one more row.

simsurace · October 26, 2021, 10:34pm

This is about three times faster on my system:

function g!(vec, i)
    sum = 0
    length(vec) >= i + 5 && @inbounds for i0 in 1:6
        sum += vec[i0+i-1]
    end
    diff = rand(Int) + sum - i
    push!(vec, diff)
end

mapg!(vecvec, i) = map(vec -> g!(vec, i), vecvec)

function G!(vecvec)
    for i in 1:5
        mapg!(vecvec, i)
        filter!(vec -> last(vec) <= 0, vecvec)
    end
    return vecvec
end

function f()
    vecvec = [collect(i:(i+10)) for i in 1:10000]
    G!(vecvec)
    return vecvec
end

Topic		Replies	Views
Improving performance of a nested for loop New to Julia performance , loops	14	5975	October 10, 2019
Significant difference in execution speed between similar functions with nested loops Performance question	4	239	March 8, 2023
How to improve performance of sum() Performance question , fast-math , tullio , loopvectorization , sum	19	4801	July 19, 2021
Speeding up a nested for loop Performance performance , linearalgebra	5	880	September 23, 2023
For loop optimization Performance performance , memory-allocation , loops	3	1657	October 31, 2021

How to improve performance in nested loops

Improvements

Related topics