Efficiency of matrix data operations

Assuming that there are always two rows that match, no more and no less, and that size(a, 2) = 3 and size(b, 2) = 4, then I would replace the methods and setdiff() by hard coded ones. You want to do this because if you do some “easy-profiling” like writing @time in front of the if a[i,:] ⊆ b[j,:] loops and in front of k1 = setdiff(b[c[i,4],:], a[i,:]) you will see that those are the lines where the script spends most of its time and almost all allocations. I also took the transpose the matrices a and b so that I’m accessing their elements column-wise (see the Performance Tips). @inbounds also helps.

My code is:

function foo2(a, b)
    c = zeros(Int, 4, size(a, 2))    
    @inbounds for i=1:size(a, 2)
        tmp = 1
        for j=1:size(b, 2)
            tmp > 2 && continue
            is_1 = a[1, i] == b[1, j] || a[2, i] == b[1, j] || a[3, i] == b[1, j]
            is_2 = a[1, i] == b[2, j] || a[2, i] == b[2, j] || a[3, i] == b[2, j]
            is_3 = a[1, i] == b[3, j] || a[2, i] == b[3, j] || a[3, i] == b[3, j]
            is_4 = a[1, i] == b[4, j] || a[2, i] == b[4, j] || a[3, i] == b[4, j]
            # This line could potentially save time with larger matrix sizes.
            #!is_1 & !is_2 & !is_3 & !is_4 && continue
            is_123 = is_1 & is_2 & is_3
            is_124 = is_1 & is_2 & is_4
            is_134 = is_1 & is_3 & is_4
            is_234 = is_2 & is_3 & is_4
            if is_123
                c[tmp, i] = j
                c[tmp+2, i] = b[4, j]
                tmp += 1
            elseif is_124
                c[tmp, i] = j
                c[tmp+2, i] = b[3, j]
                tmp += 1
            elseif is_134
                c[tmp, i] = j
                c[tmp+2, i] = b[2, j]
                tmp += 1
            elseif is_234
                c[tmp, i] = j
                c[tmp+2, i] = b[1, j]
                tmp += 1
            end
        end
    end
    return c
end

a = [1 2 3; 
     4 5 6;
     7 8 9];

b = [1 2 3 5;
     4 5 8 6;
     7 0 9 8;
     2 3 1 6;
     4 5 6 9;
     8 9 7 1];

aT = Matrix(transpose(a))
bT = Matrix(transpose(b))

Results:

>>> @btime foo($a, $b);
6.563 μs (103 allocations: 9.55 KiB)

>>> @btime foo2($a, $b);
173.424 ns (1 allocation: 176 bytes)

Nearly \times 40 faster. If you have problems with larger matrices then I’d try with multithreading in the outermost loop.

In any case, my script is not very elegant and is hard-coded for the input you gave, but maybe some of the above ideas help you.

1 Like