I would like to perform a multi-column search on a matrix like the one below. Is there a fast way to do this? (In the example below, I want to get the rows that contain -5.0 3.0 2.0.)
Thank you for the beautiful code!
However, index_can = eachrow(Test_matrix) .== Ref([-5, 3, 2])
is slower than
result_index_can = (Test_matrix[:,[1]] .== -5) .* (Test_matrix[:,[2]] .== 3) .* (Test_matrix[:,[3]] .== 2)
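For comparison, here is a minimal sketch of the two approaches side by side. It assumes a small made-up stand-in for Test_matrix (the values are hypothetical), and writes the column-wise version with views and `.&` rather than `[:,[1]]` and `.*`, which should avoid copying the columns. Timings are not shown and would need to be measured on the real data.

```julia
# Hypothetical stand-in for the matrix in the question (values are made up).
Test_matrix = [-5.0  3.0  2.0;
                1.0  3.0  2.0;
               -5.0  3.0  2.0;
               -5.0 -3.0  2.0]

# Row-wise comparison: each row is compared against the whole target vector.
mask_rows = eachrow(Test_matrix) .== Ref([-5, 3, 2])

# Column-wise comparison: views avoid copying the columns, and .& combines
# the three element-wise tests in a single fused broadcast.
mask_cols = @views (Test_matrix[:, 1] .== -5) .& (Test_matrix[:, 2] .== 3) .& (Test_matrix[:, 3] .== 2)

# Integer indices of the matching rows.
matching = findall(mask_cols)
```

Both masks mark the same rows here; which one is faster on a large matrix is something to benchmark rather than assume.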
Btw, is your example wrong? You say you want rows 2, 7, 10, but row 10 has 5 and -3 instead of -5 and 3? None of the methods discussed so far would return row 10 of Test_matrix.
On my machine for the 1e6 example I find this to be an order of magnitude faster:
julia> function f4(x)
           res = Int64[]
           for i ∈ 1:size(x, 1)
               @inbounds if x[i, 1] == -5 && x[i, 2] == 3.0 && x[i, 3] == 2.0
                   push!(res, i)
               end
           end
           res
       end
f4 (generic function with 1 method)
julia> @btime f4($x);
1.694 ms (6 allocations: 21.86 KiB)
Note that f3 and f4 above return only the integer indices of the matching rows (i.e. a vector of length roughly 1,000,000/11^3 ≈ 751 for the example above, with integers drawn at random from the 11 values -5 to 5), while versions 1 and 2 give you a size(x, 1)-length vector of Booleans.
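The two output forms can be converted into each other if needed. A small sketch, using a made-up 5-row mask and matrix purely for illustration:

```julia
# Hypothetical 5-row example: a Boolean mask marking the matching rows.
mask = [false, true, false, false, true]

# Mask -> integer indices of the `true` entries.
idx = findall(mask)

# Integer indices -> Boolean mask of the original length.
mask_again = falses(length(mask))
mask_again[idx] .= true

# Either form can be used to select rows of a matrix with the same row count.
A = reshape(1:10, 5, 2)
rows_by_idx = A[idx, :]
rows_by_mask = A[mask, :]
```

The integer form is much shorter when matches are rare, while the Boolean form combines easily with other element-wise conditions.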
To get such a vector with the speed of f4 you can do:
julia> function f5(x)
           res = fill(false, size(x, 1))
           for i ∈ 1:size(x, 1)
               @inbounds if x[i, 1] == -5 && x[i, 2] == 3.0 && x[i, 3] == 2.0
                   res[i] = true
               end
           end
           res
       end
f5 (generic function with 1 method)
julia> @btime f5($x);
1.686 ms (2 allocations: 976.67 KiB)
Thanks @nilshg, using @benchmark without interpolation seems to work most of the time (similar results to @btime with interpolation). It works alright for my two expressions but not for OP’s…
Yeah I’ve never quite understood when you’re okay not to interpolate, so I just always do it (I always thought the issue was access to global variables, which I would have expected to affect all three expressions similarly, but clearly not!)
I should also add that while I tend to always sprinkle @inbounds into performance-sensitive loops, this doesn’t actually seem to have any effect here; the compiler might be smart enough to eliminate the bounds check for this simple loop.
And finally, if one of the actual experts stops by and proposes some sort of @tturbo solution that runs in 50ns or something, maybe they could also explain why fill(false, size(x, 1)) is two allocations?
And here are some other simple options, using comprehensions, which are faster than your original code and only slightly slower than Nils’ solution above:
# outputs only matching integer indices:
[i for i in axes(x,1) if x[i,1]==-5 && x[i,2]==3 && x[i,3]==2]
# outputs all indices, as Vector{Bool} with 0 or 1:
[x[i,1] == -5 && x[i,2] == 3 && x[i,3] == 2 for i in axes(x,1)]
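If the target values should not be hard-coded, the first comprehension can be wrapped in a small function. This is only a sketch: `match_rows` is a hypothetical name, it assumes the target is compared against the leading columns of x, and it has not been benchmarked against the versions above.

```julia
# Hypothetical helper: indices of the rows whose first length(target)
# columns equal the entries of `target`.
function match_rows(x::AbstractMatrix, target)
    cols = eachindex(target)
    [i for i in axes(x, 1) if all(j -> x[i, j] == target[j], cols)]
end

# Small made-up matrix for illustration.
x = [-5 3 2;
      0 3 2;
     -5 3 2]

hits = match_rows(x, (-5, 3, 2))
```

Passing the target as a tuple rather than a vector is a choice made here so that no array needs to be allocated for it; a vector works too.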