Fastest way possible to find index of value equals 1 across a matrix column

eduardosalaz · October 2, 2023, 1:55am

Hi there everyone. I am currently working on my thesis which involves an optimization problem in which I have an X decision matrix filled with 0s, each column must have a 1 value as it represents an allocation.
In my local search algorithm I iterate over the matrix as follows:

S, B = size(X)
for j₁ in 1:B
  ĩ = findfirst(==(1), X[:, j₁])
  for j₂ in j₁+1:B
    i✶ = findfirst(==(1), X[:, j₂])
    # compute the move with ĩ and i✶ ...
  end
end

However after timing and benchmarking I find that a significant amount of time is spent on the i✶ = findfirst(==(1), X[:, j₂]) line which made me ask myself: What is the fastest way out there to find which index equals one across my entire column? I am talking about absolute blazing speed as this computation is performed hundreds of thousands of times. Any tricks or tips or memory shenanigans to do so? My matrix is not particularly large but it’s not trivially small either (eg. 500 rows, 2650 columns, double that).

For reference, here is a TimerOutputs output:

| Section | ncalls | time   | %tot   | avg    | alloc   | %tot   | avg     |
|---------|--------|--------|--------|--------|---------|--------|---------|
| Loop j1 | 6      | 63.5s  | 100.0% | 10.6s  | 81.7GiB | 100.0% | 13.6GiB |
| ĩ       | 15.9k  | 31.9ms | 0.1%   | 2.01μs | 63.1MiB | 0.1%   | 4.06KiB |
| Loop j2 | 15.9k  | 63.4s  | 99.9%  | 3.99ms | 81.6GiB | 99.9%  | 5.25MiB |
| i⋆      | 21.1M  | 39.1s  | 61.5%  | 1.85μs | 81.6GiB | 99.9%  | 4.06KiB |

Edit: wow I feel dumb, after posting this I saw the number of calls to i✶ and no wonder it spends so much time there, it gets called millions of times ):

Oscar_Smith · October 2, 2023, 1:58am

Also note that X[:, j₂] makes a copy. you might want to either use @views.

Oscar_Smith · October 2, 2023, 1:59am

Also if your matrix has very few 1 values, you should consider using a sparse matrix.

eduardosalaz · October 2, 2023, 2:12am

What’s a good rule of thumb to know when to use @views? It’s still not very clear to me when or when not to use it. Thank you very much. Using the macro brought times down significantly so it’s very useful towards achieving my goal.

Oscar_Smith · October 2, 2023, 2:26am

You should use @views whenever you take a slice of an array (i.e. any indexing that has a : in it), unless you actually want a copy. This is one of my least favorite parts about Julia but it is unfortunately pretty much impossible to fix.

Dan · October 2, 2023, 7:24am

Looking at the code, it would be strange if the the # compute the move part changes X. Therefore, the for loops basically look for all pairs of initial “1” elements in the columns. The way the pairs are found is searching for the same “1” many times over a column. Furthermore, there might be a problem of stability if a column does not contain a “1” (which may never occur for reasons specific to this case).

In any case, looking for the initial “1” once (like in dosomething2() is better, i.e.

julia> using Random

julia> Random.seed!(124)

julia> X = rand(10,10) .< 0.2;

julia> function dosomething(X)
           s=0
           S, B = size(X) 
           for j₁ in 1:B
               ĩ = findfirst(==(1), @view X[:, j₁])
               for j₂ in j₁+1:B
                   i✶ = findfirst(==(1), @view X[:, j₂])
                   s += ĩ * i✶
               end
           end
           s
       end
dosomething (generic function with 1 method)

julia> function dosomething2(X)
           s=0
           S, B = size(X)
           js = [findfirst(==(1), @view X[:, i]) for i in 1:B]
           for j₁ in 1:B
               for j₂ in j₁+1:B
                   s += js[j₁]*js[j₂]
               end
           end
           s
       end
dosomething2 (generic function with 1 method)

julia> using BenchmarkTools

julia> @btime dosomething($X)
  237.532 ns (0 allocations: 0 bytes)
1240

julia> @btime dosomething2($X)
  152.856 ns (1 allocation: 144 bytes)
1240

As shown, this different approach gives a 40% speedup even with a small matrix and the speedup increases with X’s size!

P.S. js can also be calculated with:

js = findfirst.(==(1), eachcol(X))

which is a bit shorter.

eduardosalaz · October 6, 2023, 4:15pm

Thank you for your input! However my current approach does mean that my X is changed during the iterations so precalculating which indices are where is out of the window.

Dan · October 6, 2023, 4:32pm

I see. These mutating algorithms are tricky indeed. In any case, the no findfirst approach can still be used, if the first 1 are dynamically kept up to date (when assigning to X a 1 fixup the initial 1 vector. If X mutates violently, this cannot be done, but I’ve given my input

Topic		Replies	Views
Can this loop be optimized further? Performance question	7	735	December 20, 2019
Perform a multi-column, multi-dimensional search quickly Performance question , performance , broadcast	9	537	April 30, 2022
Skipping over loop index Performance	2	1158	May 22, 2019
Using findfirst with for multiple values and columns in a Dataframe? New to Julia dataframes	21	514	March 24, 2024
How do I get the indexes for matrix columns whose elements are all the same? General Usage	12	595	April 13, 2023

Fastest way possible to find index of value equals 1 across a matrix column

Related topics