# Matrix multiplication but sum of min() instead of sum of product

Hi everyone,

I’m new to Julia and I’m using it to do some network analysis on historical data. Since I have valued bimodal (two-mode) data for my graphs, I need to “normalize” those values.

Basically, I need to do some “matrix multiplications”, but instead of the sum of products I need the sum of `min()`.

Here is a small example:

```julia
# a Matrix m
m = [0 1 2
     3 4 5]

# and the transpose of m
mt = [0 3
      1 4
      2 5]
```

Instead of the sum of products `m * mt`

```julia
result = [ 5 14
          14 50]
# (0*0 + 1*1 + 2*2)  (0*3 + 1*4 + 2*5)
# (3*0 + 4*1 + 5*2)  (3*3 + 4*4 + 5*5)
```

I need the sum of the `min()`:

```julia
result = [3  3
          3 12]
# (min(0,0) + min(1,1) + min(2,2))  (min(0,3) + min(1,4) + min(2,5))
# (min(3,0) + min(4,1) + min(5,2))  (min(3,3) + min(4,4) + min(5,5))
```

I tried something like

```julia
[min(m[i], mt[j]) for i in eachrow(m), j in eachcol(mt)]
```

but it doesn’t work, and I think I’m missing something, probably with the syntax…

Josselin.


Try this:

```julia
result = zeros(size(m, 1), size(mt, 2))

for i in 1:size(m, 1)
    for j in 1:size(mt, 2)
        for k in 1:size(mt, 1)
            result[i, j] += min(m[i, k], mt[k, j])
        end
    end
end
```
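Run against the example matrices from the question, this yields the expected result (a quick check; `permutedims` is used here to build the transpose):

```julia
m  = [0 1 2
      3 4 5]
mt = permutedims(m)

result = zeros(Int, size(m, 1), size(mt, 2))
for i in 1:size(m, 1), j in 1:size(mt, 2), k in 1:size(mt, 1)
    result[i, j] += min(m[i, k], mt[k, j])
end

result  # == [3 3; 3 12]
```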

One-liners! Welcome, @sardinecan!

```julia
[sum(mapslices(minimum, [i j], dims=2)) for i in eachrow(m), j in eachcol(mt)]

2×2 Matrix{Int64}:
 3   3
 3  12
```
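For what it’s worth, the same comprehension can be written with a broadcast `min.` instead of `mapslices`, which is a little shorter (a sketch, giving the same result):

```julia
m  = [0 1 2
      3 4 5]
mt = permutedims(m)

# For each (row of m, column of mt) pair, take elementwise min and sum it.
res = [sum(min.(i, j)) for i in eachrow(m), j in eachcol(mt)]
# res == [3 3; 3 12]
```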

This sounds similar to an earlier topic, although not the same. LoopVectorization or Tullio could probably speed this use case up significantly.


@vkv, @viraltux it works perfectly! Thank you very much!


@baggepinnen thank you for the package! I found the Tensor pkg for Python, but I had missed this one!

You can take the code from LoopVectorization.jl/matrix_multiplication.md at master · JuliaSIMD/LoopVectorization.jl · GitHub

```julia
function A_mul_B!(C, A, B)
    @turbo for n ∈ indices((C, B), 2), m ∈ indices((C, A), 1)
        Cmn = zero(eltype(C))
        for k ∈ indices((A, B), (2, 1))
            Cmn += A[m, k] * B[k, n]
        end
        C[m, n] = Cmn
    end
end
```

Just replace `A[m,k] * B[k,n]` with `min(A[m,k], B[k,n])`. It’s about as fast as a BLAS call. If you replace `@turbo` with `@tturbo`, you might get even further speedup.
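For concreteness, the adapted kernel might look like this (a sketch; `A_min_B!` is a made-up name, and it assumes LoopVectorization is installed):

```julia
using LoopVectorization

# Same loop structure as A_mul_B!, but accumulating min(A[m,k], B[k,n])
# instead of the product A[m,k] * B[k,n].
function A_min_B!(C, A, B)
    @turbo for n ∈ indices((C, B), 2), m ∈ indices((C, A), 1)
        Cmn = zero(eltype(C))
        for k ∈ indices((A, B), (2, 1))
            Cmn += min(A[m, k], B[k, n])
        end
        C[m, n] = Cmn
    end
    return C
end
```

On the example above, `A_min_B!(zeros(Int, 2, 2), m, mt)` should reproduce `[3 3; 3 12]`.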

Edit: I see that you are operating on really tiny matrices, so LoopVectorization may be overkill. Perhaps try StaticArrays, if you need performance.
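A rough sketch of the StaticArrays route, assuming the package is installed (for tiny fixed-size matrices, `SMatrix` keeps the data on the stack and avoids heap allocation):

```julia
using StaticArrays

# SA[...] builds an SMatrix whose size is part of its type.
m  = SA[0 1 2
        3 4 5]          # 2×3 SMatrix
mt = transpose(m)       # 3×2

res = [sum(min.(r, c)) for r in eachrow(m), c in eachcol(mt)]
# res == [3 3; 3 12]
```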


Here is a comparison of the results from LoopVectorization (LV) and a plain-loop implementation. On my system, LV was about 30x faster, and the threaded version about 100x faster, though that will vary by machine.

My `versioninfo()`:

```julia
julia> versioninfo()
Julia Version 1.6.2
Commit 1b93d53fc4 (2021-07-14 15:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:

```
```julia
julia> m = rand(100, 1000);

julia> mt = rand(1000, 100);

julia> result = zeros(100, 100);

julia> using BenchmarkTools, LoopVectorization

julia> function min_loops_lv!(m, mt, result)
           @turbo for i in 1:size(m, 1)
               for j in 1:size(mt, 2)
                   for k in 1:size(mt, 1)
                       result[i, j] += min(m[i, k], mt[k, j])
                   end
               end
           end
       end
min_loops_lv! (generic function with 1 method)

julia> function min_loops_tlv!(m, mt, result)
           @tturbo for i in 1:size(m, 1)
               for j in 1:size(mt, 2)
                   for k in 1:size(mt, 1)
                       result[i, j] += min(m[i, k], mt[k, j])
                   end
               end
           end
       end
min_loops_tlv! (generic function with 1 method)

julia> function min_loops_normal!(m, mt, result)
           for i in 1:size(m, 1)
               for j in 1:size(mt, 2)
                   for k in 1:size(mt, 1)
                       result[i, j] += min(m[i, k], mt[k, j])
                   end
               end
           end
       end
min_loops_normal! (generic function with 1 method)

julia> @benchmark min_loops_normal!($m, $mt, $result)
BenchmarkTools.Trial: 230 samples with 1 evaluation.
 Range (min … max):  21.075 ms …  23.152 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     21.735 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   21.784 ms ± 331.447 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                  ▃ ▄ ▇▄ ▆▄▆▃█▄▄▂▄▅
  ▃▁▁▁▃▁▆▃▅▅▇▇█▆█▅█████████████▅▆▃▆▄▄▅▅▆▄▃▃▃▁▄▁▁▁▃▃▄▃▃▁▁▁▁▃▃▃▄ ▄
  21.1 ms         Histogram: frequency by time         22.9 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> result_lv = zeros(100, 100);

julia> @benchmark min_loops_lv!($m, $mt, $result_lv)
BenchmarkTools.Trial: 6836 samples with 1 evaluation.
 Range (min … max):  664.470 μs …  1.721 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     711.900 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   728.216 μs ± 59.243 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▅  █   ▂
  ▃▂▁▁▃█▂▂█▆▂▅█▄▃▃▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  664 μs          Histogram: frequency by time          970 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> result_tlv = zeros(100, 100);

julia> @benchmark min_loops_tlv!($m, $mt, $result_tlv)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  165.561 μs … 404.591 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     194.182 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   201.954 μs ±  29.925 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▆▆  █
  ▂▁██▂▆█▂▂▃▂▂▂▂▃▄▄▂▂▂▂▂▂▂▂▂▂▄▇▇█▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁ ▂
  166 μs           Histogram: frequency by time          278 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

```