Hi there everyone. I am currently working on my thesis which involves an optimization problem in which I have an X decision matrix filled with 0s, each column must have a 1 value as it represents an allocation.
In my local search algorithm I iterate over the matrix as follows:

S, B = size(X)
for j₁ in 1:B
ĩ = findfirst(==(1), X[:, j₁])
for j₂ in j₁+1:B
i✶ = findfirst(==(1), X[:, j₂])
# compute the move with ĩ and i✶ ...
end
end

However after timing and benchmarking I find that a significant amount of time is spent on the i✶ = findfirst(==(1), X[:, j₂]) line which made me ask myself: What is the fastest way out there to find which index equals one across my entire column? I am talking about absolute blazing speed as this computation is performed hundreds of thousands of times. Any tricks or tips or memory shenanigans to do so? My matrix is not particularly large but it’s not trivially small either (eg. 500 rows, 2650 columns, double that).

Edit: wow I feel dumb, after posting this I saw the number of calls to i✶ and no wonder it spends so much time there, it gets called millions of times ):

What’s a good rule of thumb to know when to use @views? It’s still not very clear to me when or when not to use it. Thank you very much. Using the macro brought times down significantly so it’s very useful towards achieving my goal.

You should use @views whenever you take a slice of an array (i.e. any indexing that has a : in it), unless you actually want a copy. This is one of my least favorite parts about Julia but it is unfortunately pretty much impossible to fix.

Looking at the code, it would be strange if the the # compute the move part changes X. Therefore, the for loops basically look for all pairs of initial “1” elements in the columns. The way the pairs are found is searching for the same “1” many times over a column. Furthermore, there might be a problem of stability if a column does not contain a “1” (which may never occur for reasons specific to this case).

In any case, looking for the initial “1” once (like in dosomething2() is better, i.e.

julia> using Random
julia> Random.seed!(124)
julia> X = rand(10,10) .< 0.2;
julia> function dosomething(X)
s=0
S, B = size(X)
for j₁ in 1:B
ĩ = findfirst(==(1), @view X[:, j₁])
for j₂ in j₁+1:B
i✶ = findfirst(==(1), @view X[:, j₂])
s += ĩ * i✶
end
end
s
end
dosomething (generic function with 1 method)

julia> function dosomething2(X)
s=0
S, B = size(X)
js = [findfirst(==(1), @view X[:, i]) for i in 1:B]
for j₁ in 1:B
for j₂ in j₁+1:B
s += js[j₁]*js[j₂]
end
end
s
end
dosomething2 (generic function with 1 method)

Thank you for your input! However my current approach does mean that my X is changed during the iterations so precalculating which indices are where is out of the window.

I see. These mutating algorithms are tricky indeed. In any case, the no findfirst approach can still be used, if the first 1 are dynamically kept up to date (when assigning to X a 1 fixup the initial 1 vector. If X mutates violently, this cannot be done, but I’ve given my input