Performance of log.(x)

pmags · April 11, 2019, 2:26am

I’m a newcomer to Julia from Python and was surprised to find vectorized log so slow in Julia. Am I doing something wrong in the code below?

As noted here, numpy.log seems to be about 6x faster than Base.log broadcast over an array.

function test_log()
    x = randn(10000) .+ 5
    @benchmark log.($x)
end

Calling test_log() yields:

BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     49.861 μs (0.00% GC)
  median time:      80.124 μs (0.00% GC)
  mean time:        97.785 μs (12.32% GC)
  maximum time:     43.889 ms (99.65% GC)
  --------------
  samples:          10000
  evals/sample:     1

As compared to:

x = np.random.randn(10000) + 5
%timeit np.log(x)

Which yields:

15 µs ± 116 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
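
A side note on the Julia output above: the 78.20 KiB memory estimate is essentially the freshly allocated result array (10000 Float64 values). Below is a minimal sketch, assuming a preallocated output buffer is acceptable for the use case, that writes the result in place instead. `fill_log!` is a hypothetical helper name, and the input uses `rand` rather than `randn` only so that every value is guaranteed to be positive. This removes the allocation, not the cost of computing the logarithms, so it should not be expected to close the gap with NumPy on its own.

```julia
# Hypothetical helper: write log of each element of x into the preallocated y.
function fill_log!(y::AbstractVector, x::AbstractVector)
    y .= log.(x)   # fused broadcast assignment, no temporary result array
    return y
end

x = rand(10_000) .+ 1.0   # strictly positive inputs (an assumption of this sketch)
y = similar(x)            # output buffer, allocated once up front

fill_log!(y, x)                      # first call compiles the method
bytes = @allocated fill_log!(y, x)   # allocations of an already-compiled call

println("bytes allocated by the in-place call: ", bytes)
println("bytes needed for a fresh result array: ", sizeof(x))
```

With BenchmarkTools loaded, the same comparison can be timed with `@benchmark fill_log!($y, $x)` next to `@benchmark log.($x)`.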

rdeits · April 11, 2019, 2:55am

Welcome!

My guess would be that you’re getting SIMD operations in numpy but not in Julia. Interestingly, I can’t reproduce your timing numbers; instead, my Julia timing roughly matches yours, but my numpy results are about 10X slower:

julia> x = randn(10000) .+ 5;

julia> @benchmark log.($x)
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     57.371 μs (0.00% GC)
  median time:      59.739 μs (0.00% GC)
  mean time:        67.735 μs (7.25% GC)
  maximum time:     35.763 ms (99.67% GC)
  --------------
  samples:          10000
  evals/sample:     1

In [3]: x = np.random.randn(10000) + 5

In [4]: %timeit np.log(x)
173 µs ± 5.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

There was a related discussion of SIMD optimizations for logs in "Performance of logarithm calculation (when not to use the dot operator?)" (post #4 by kristoffer.carlsson), although I’m afraid I don’t know enough about the details to be sure if it’s still relevant.

You also might want to consider making a separate thread for this discussion, as it’s only tangentially related to the topic of the original post (which was simply a question of syntax, not of performance optimization).

baggepinnen · April 11, 2019, 4:28am

We explored different ways of making exp faster in this thread:
https://discourse.julialang.org/t/fast-logsumexp/
The same tricks would apply to log.

pmags · April 11, 2019, 1:32pm

@rdeits – Thanks for comparing your results. I’m guessing that the differences in our NumPy timings may relate to the fact that the NumPy I’m using is linked against MKL (this is the default for the Anaconda-distributed versions of the NumPy/SciPy toolchain).

pmags · April 11, 2019, 1:40pm

@baggepinnen – Thanks for the link to the related discussion. I gave Yeppp a shot, but unfortunately Yeppp.log is about twice as slow as Base.log on my platform/CPU:

julia> using Yeppp
julia> x = randn(10000) .+ 5;
julia> @benchmark Yeppp.log($x)
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     133.556 μs (0.00% GC)
  median time:      174.349 μs (0.00% GC)
  mean time:        193.267 μs (4.51% GC)
  maximum time:     3.532 ms (92.44% GC)
  --------------
  samples:          10000
  evals/sample:     1

For reference, here’s my Julia and system info:

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

RoyiAvital · April 11, 2019, 7:06pm

You usually can’t beat Intel MKL on these kinds of operations.

If it is available to you, try the MKL flavor of the JuliaPro distribution.

kristoffer.carlsson · April 11, 2019, 7:49pm

I don’t think there is any support for different special functions in JuliaPro.

RoyiAvital · April 11, 2019, 8:14pm

What do you mean?

Do you mean that even if MKL is integrated, Julia won’t use it for this?

I’m really puzzled why, when MKL is integrated, it is not used fully.
For instance, what about MKL Sparse capabilities?

kristoffer.carlsson · April 11, 2019, 8:33pm

When built with MKL, Julia will use MKL’s BLAS, nothing else.

No, for the sparse capabilities there are separate packages: JuliaSparse/MKLSparse.jl (makes the sparse functionality in MKL available to Julia) and JuliaSparse/Pardiso.jl (calls the PARDISO library from Julia).

improbable22 · April 11, 2019, 9:07pm

If I understand right, log.(x) will never be farmed out to MKL, but something like https://github.com/rprechelt/Vectorize.jl would do this.
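
To illustrate the first half of that: the dot syntax applies the scalar `log` method to each element, so there is no single array-level call that a library could take over. A rough sketch of what `log.(x)` amounts to for a plain vector follows. The real broadcast machinery is more general, `log_loop` is a hypothetical name, and the input is drawn with `rand` only to keep every value positive.

```julia
# Hypothetical stand-in for log.(x): one scalar log call per element.
function log_loop(x::AbstractVector{<:Real})
    y = similar(x, float(eltype(x)))   # result array with a floating-point element type
    @inbounds for i in eachindex(x, y)
        y[i] = log(x[i])               # Base's scalar log, called element by element
    end
    return y
end

x = rand(10_000) .+ 1.0   # strictly positive inputs (an assumption of this sketch)

println(log_loop(x) ≈ log.(x))      # same values as the broadcast form
println(log_loop(x) ≈ map(log, x))  # and as an explicit map
```

An array-level routine, by contrast, receives the whole vector in one call, which is what gives a vendor library room to apply its own vectorized kernel.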

pmags · April 12, 2019, 1:50am

For those interested in this topic, please also see the parallel discussion here: https://github.com/JuliaLang/julia/issues/8869

RoyiAvital · April 12, 2019, 6:00pm

This is really nice progress.
I really like seeing better and better integration with MKL,
even if it is in side packages.

Do you have a donation box for your MKL efforts?
