What makes a big difference is keeping your row vector as a row-vector type (an Adjoint of a Vector), not a 1×N Matrix; with the Adjoint, the product can dispatch to a matrix-vector BLAS path rather than a general matrix-matrix multiply:
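For anyone running this standalone, here is a minimal setup matching the sizes used below. The actual contents of C come from earlier in the thread, so random positive entries stand in for them here:

julia> using LinearAlgebra, BenchmarkTools

julia> Nx, Np = 5500, 200;

julia> C = rand(Np, Nx) .+ 1;  # stand-in data, 200×5500, positive so log is defined

julia> p = reshape(collect(range(0, 1000, length=Np)), 1, Np);  # 1×200 Matrix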
julia> @btime sum(log.($p * $C));
717.158 μs (4 allocations: 86.16 KiB)
julia> pa = collect(range(0, 1000, length=Np))' # type Adjoint, not 1xN Matrix
1×200 adjoint(::Vector{Float64}) with eltype Float64:
0.0 5.02513 10.0503 15.0754 20.1005 25.1256 … 979.899 984.925 989.95 994.975 1000.0
julia> @btime sum(log.($pa * $C));
239.876 μs (4 allocations: 86.16 KiB)
julia> @btime sum(log, $pa * $C);
234.405 μs (2 allocations: 43.08 KiB)
julia> BLAS.vendor()
:openblas64
Same computer:
Python with Nx=5500, Np=200
236 µs ± 8.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Using sum(log, x) saves allocating another array for the result, but that is a small effect here. Using MKL.jl instead of the default OpenBLAS library might also help.
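A sketch of trying that, assuming a recent Julia (1.7 or later) where loading MKL.jl is enough to swap the BLAS backend:

julia> using MKL  # assumption: Julia ≥ 1.7, where MKL.jl swaps BLAS at load time

julia> BLAS.get_config()  # should now report MKL instead of OpenBLAS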
Further, transposing the data in C, which (if I'm thinking correctly) ought to bring us closer to what NumPy is actually doing, gives:
julia> Ct = transpose(permutedims(C)); # same values, row major
julia> Ct == C
true
julia> @btime sum(log, $pa * $Ct);
210.394 μs (2 allocations: 43.08 KiB)
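To unpack that one-liner: permutedims(C) materializes a copy with the storage order flipped, and transpose wraps it lazily so the indices point back the original way. Since Julia arrays are column major, Ct compares equal to C while storing C's rows contiguously, i.e. the row-major layout NumPy defaults to. A small check with a hypothetical 2×3 matrix:

julia> A = [1 2 3; 4 5 6];

julia> At = transpose(permutedims(A));

julia> At == A  # same logical values
true

julia> parent(At)  # but the underlying storage is the 3×2 copy
3×2 Matrix{Int64}:
 1  4
 2  5
 3  6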