I wrote some code that searches a large matrix to find occurrences of a given vector. MWE:
A = repeat([1 2 3; 4 5 6; 7 8 9], 100, 1);
b = [4; 5; 6];
In this case, what I want is a boolean vector that is true at every index 2+3*i (i = 0, 1, 2, ...), i.e. for every row of A that equals b.
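For concreteness, the expected result for this MWE can be written out directly. This is only a small sketch; the name `expected` is illustrative and not part of the original code:

```julia
A = repeat([1 2 3; 4 5 6; 7 8 9], 100, 1)   # 300×3 matrix, rows cycle through three patterns
b = [4; 5; 6]

# Rows 2, 5, 8, ... (index 2 + 3i for i = 0, 1, ..., 99) are the ones equal to b.
expected = [mod(j, 3) == 2 for j in axes(A, 1)]

count(expected)            # 100 matching rows
findall(expected)[1:3]     # [2, 5, 8]
```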
I am comparing two possible solutions. The first uses an array comprehension:
@time ids = Bool[ b == A[j,:] for j=1:size(A,1)]
0.000403 seconds (903 allocations: 47.312 KiB)
Long story short, there’s almost a factor of 10 between the two methods. Why is that the case?
Also, why is the first method yielding a Boolean array while the second yields a BitArray? I would like to use the latter method for speed, but the BitArray is proving harder to work with in the remainder of the code.
Each slice, A[j, :], creates a new temporary array. Use views instead:
foo(A, b) = [b == A[j, :] for j in axes(A,1)] # no need for Bool in front of vector
bar(A, b) = [b == @view A[j, :] for j in axes(A,1)]
baz(A, b) = all(b' .== A; dims=2)
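As for the element types: a comprehension whose body returns a `Bool` collects into a plain `Vector{Bool}`, whereas broadcasting a `Bool`-valued operation such as `.==` produces a `BitArray`, and reducing it with `all(...; dims=2)` keeps the singleton second dimension (a 300×1 result here). Below is a minimal sketch of one way to bring the broadcast result into the same form as the comprehension. The helper name `baz_vec` is hypothetical:

```julia
A = repeat([1 2 3; 4 5 6; 7 8 9], 100, 1)
b = [4; 5; 6]

foo(A, b) = [b == A[j, :] for j in axes(A, 1)]   # comprehension: Vector{Bool}
baz(A, b) = all(b' .== A; dims=2)                # broadcast + reduction: 300×1 result

# Hypothetical helper: drop the singleton dimension and copy into a Vector{Bool}.
baz_vec(A, b) = Vector{Bool}(vec(baz(A, b)))

typeof(foo(A, b))            # Vector{Bool}
size(baz(A, b))              # (300, 1)
baz_vec(A, b) == foo(A, b)   # true
```

Depending on what the rest of the code does, the conversion may not be needed: a boolean mask obtained with `vec(baz(A, b))` can be used for logical indexing just like a `Vector{Bool}`.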
The @time macro isn’t generally suitable for micro benchmarks; use BenchmarkTools instead.
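A sketch of how such a comparison might be set up, assuming the BenchmarkTools package is installed. Timings are left out because they depend on the machine and the Julia version. Interpolating the global variables with `$` keeps the overhead of accessing untyped globals out of the measurement:

```julia
using BenchmarkTools

A = repeat([1 2 3; 4 5 6; 7 8 9], 100, 1)
b = [4; 5; 6]

foo(A, b) = [b == A[j, :] for j in axes(A, 1)]         # allocates a copy of every row
bar(A, b) = [b == view(A, j, :) for j in axes(A, 1)]   # views avoid the per-row copies
baz(A, b) = all(b' .== A; dims=2)                      # broadcast, then reduce along each row

@btime foo($A, $b);
@btime bar($A, $b);
@btime baz($A, $b);
```

Here `view(A, j, :)` is the function form of `@view A[j, :]`; both refer to the row without copying it.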
Thank you very much, @view seems to be the best choice in this case!
And yeah, sorry I was reporting @time results; I know it shouldn’t be used to assess code performance.