The first for loop execution

Hi,

I wrote a simple test in Julia. I have a large sparse matrix and want to iterate over its non-zero elements, but the first execution of the for loop is slow. A minimal working example is below. Did I do something wrong here?

using SparseArrays

function test()
    H = sprand(30000,70000,0.001)
    cidx = findall(!iszero, H)
    G = spzeros(30000,70000)

    for k = 1:2
        @time begin
            for i in cidx
                G[i] = 1
            end
        end
    end
end

test()

 23.617762 seconds (42 allocations: 66.001 MiB, 0.40% gc time)
  0.028947 seconds

You are inserting elements into an empty sparse matrix (G), which is extremely inefficient. In the second pass, however, those entries are already populated, so assigning to them is much faster.
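To illustrate the point above: inserting a new stored entry into a SparseMatrixCSC generally means shifting the stored row-index and value arrays, so filling an empty matrix one element at a time gets more expensive as it grows. A minimal sketch of building the matrix in one shot from index vectors instead is given here. The helper name `build_pattern` and the smaller matrix size are only illustrative assumptions, not something from the original post.

```julia
using SparseArrays

# Hypothetical helper: build a matrix with the same sparsity pattern as H,
# with every stored entry set to 1.0, without inserting element by element.
function build_pattern(H::SparseMatrixCSC)
    I, J, _ = findnz(H)                 # row and column indices of stored entries
    V = ones(length(I))                 # one value per stored entry
    return sparse(I, J, V, size(H)...)  # assemble the whole matrix at once
end

H = sprand(3000, 7000, 0.001)
G = build_pattern(H)
```

Since `sparse` receives all indices up front, it can assemble the internal arrays once rather than shifting them on every insertion.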


Oh, nice, if I define G as H, then there is no problem.

Thank you
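For anyone reading later, the "define G as H" idea could look roughly like the sketch below: copy H so that the entries already exist, then overwrite the stored values in place. The function name `ones_like` is made up for this example, and the sizes are reduced just to keep it quick.

```julia
using SparseArrays

# Hypothetical helper: reuse the sparsity pattern of H and set all
# stored values to one, so no new entries ever need to be inserted.
function ones_like(H::SparseMatrixCSC)
    G = copy(H)                          # same pattern, independent storage
    fill!(nonzeros(G), one(eltype(G)))   # overwrite the stored values in place
    return G
end

H = sprand(3000, 7000, 0.001)
G = ones_like(H)
```

After this, a loop such as `G[i] = 1` over the indices of H only touches entries that are already stored, which matches the fast second pass in the timings above.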

One way to do this efficiently, using only 2 lines of code, is as follows:

julia> using SparseArrays, BenchmarkTools

julia> function test()
           H = sprand(30000, 70000, 0.001)
           G = map(x -> !iszero(x) ? 1.0 : 0.0, H)
       end
test (generic function with 1 method)

julia> @btime test();
  79.437 ms (23 allocations: 82.19 MiB)

If that is really your situation, you can use G = sprand(Bool, 30000, 70000, 0.001) and skip all the rest.


You’re right, this is not really my situation. I wanted to point out that the first pass is slow. However, fredrikekre gave the explanation.

I was surprised to see that sprand(0:1, 30000, 70000, 0.001) didn’t work, since both rand(Bool) and rand(0:1) do. Any idea how to achieve this?
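One possible workaround, offered as a sketch rather than an official answer: generate the sparsity pattern with `sprand(Bool, ...)`, then draw the stored values from the range yourself and assemble the matrix with `sparse`. The helper name `sprand_from` is hypothetical.

```julia
using SparseArrays

# Hypothetical helper: random sparsity pattern with density p,
# stored values sampled from `values` (for example the range 0:1).
function sprand_from(values, m, n, p)
    I, J, _ = findnz(sprand(Bool, m, n, p))   # random pattern only
    V = rand(values, length(I))               # one sample per stored entry
    return sparse(I, J, V, m, n)
end

A = sprand_from(0:1, 3000, 7000, 0.001)
```

Note that with 0:1 about half of the stored entries will be explicit zeros. `sparse` keeps them as stored entries, so call `dropzeros!(A)` if they should be removed.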