Why is this optimization problem so slow in Julia? In R it took 60s

Update
I copied the data and the code into a repo GitHub - xiaodaigh/optim-slow-julia

I am trying to estimate the strength of Go players (or players of any other two-player board game where the only outcomes are win or lose).

My data is quite simple: a data frame whose columns are the integer IDs of the winner and the loser of each game, like this:

using DataFrames
using CSV

data = CSV.read("goplayers.csv", DataFrame)

const P1 = Int.(data.p11)   # winner IDs
const P2 = Int.(data.p22)   # loser IDs

function loglik1(w1, w2)
    # log-probability that the player with strength w1 beats the player with strength w2
    x = w1 - w2
    log(1/(1 + exp(-x)))
end

init_w = rand(length(unique(vcat(P1, P2))))

function neg_log_lik(weights)
    # negative log-likelihood summed over all recorded games
    -mapreduce(((p1, p2),) -> loglik1(weights[p1], weights[p2]), +, zip(P1, P2))
end

using RCall

@rput P1 P2

# finishes in about 60s
@time R"""
w = runif(max(c(P1, P2)))
system.time(m <- optim(w, function(w) {
  x  = w[P1] - w[P2]
  p = 1/(1 + exp(-x))
  -sum(log(p))
}, method="BFGS"))

ping = m$par
""";

@rget ping

using Optim: optimize, BFGS
# takes forever
@time opm = optimize(neg_log_lik, init_w, BFGS())

It takes a lot of time, but the same task took only about 60s in R, so I am wondering what I am doing wrong here.

For a random example it’s quite OK, but for the specific data that I have it takes a long time.

Can you show the R code? It seems hard to compare without seeing them side-by-side.

  1. Pass an explicit gradient, or it’ll do finite differences (one extra objective evaluation per parameter for every gradient; see the sketch below)
  2. Use LBFGS, not BFGS
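
A minimal sketch of both suggestions, reusing P1, P2, neg_log_lik, and init_w from the original post (the analytic gradient is my own derivation, not code from the thread):

using Optim: optimize, LBFGS

# d/dx log(σ(x)) = 1 - σ(x) = 1/(1 + exp(x)), so each game contributes
# ±(1 - σ(x)) to the gradient of the negative log-likelihood
function neg_log_lik_grad!(g, weights)
    fill!(g, 0.0)
    for (p1, p2) in zip(P1, P2)
        x = weights[p1] - weights[p2]
        d = 1 / (1 + exp(x))   # = 1 - σ(x)
        g[p1] -= d             # raising the winner's strength lowers the loss
        g[p2] += d
    end
    return g
end

@time opm = optimize(neg_log_lik, neg_log_lik_grad!, init_w, LBFGS())

With the analytic gradient each iteration costs one pass over the games instead of one finite-difference objective evaluation per player, and LBFGS avoids storing BFGS’s dense inverse-Hessian approximation.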

where a is the data frame:

system.time(m <- optim(w, function(w) {
  x  = w[a$p11] - w[a$p22]
  p = 1/(1 + exp(-x))
  -sum(log(p))
}, method="BFGS"))

But I do the same in R, so I’m not sure why. It seems hard to reproduce without the data, as Julia seems to be OK with random data.

> system.time(m <- optim(w, function(w) {
  x  = w[a$p11] - w[a$p22]
  p = 1/(1 + exp(-x))
  -sum(log(p))
}, method="BFGS"))
Error in optim(w, function(w) { : object 'w' not found
Timing stopped at: 0 0 0.001

Can you send a working code snippet so that it’s easier to compare?

OK, you just need to initialize w.

I am out at the moment. Let me make a new version in one script with the data.

For me (on a MacBook Pro), the original code takes

 28.235802 seconds (12.17 M allocations: 23.930 GiB, 7.51% gc time, 11.78% compilation time)

 23.435813 seconds (1.57 M allocations: 23.396 GiB, 9.50% gc time)

Improving the kernel (avoiding allocations and using @vectorize from LoopVectorization.jl), i.e.

using LoopVectorization

function neg_log_lik_optim(init_w)
    s = 0.0
    @vectorize for i in 1:length(winner_ids)
        a = winner_ids[i]
        b = loser_ids[i]
        x = init_w[a] - init_w[b]
        s += log(1/(1+exp(-x)))
    end
    return s
end

@time opm = optimize(neg_log_lik_optim, init_w, BFGS())

I get

19.492552 seconds (14.23 M allocations: 814.709 MiB, 1.25% gc time, 51.31% compilation time)

  9.360662 seconds (1.24 k allocations: 5.054 MiB)

(I don’t know any R so I can’t yet compare to it.)

julia> versioninfo()
Julia Version 1.6.0
Commit f9720dc2eb (2021-03-24 12:55 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.6.0)
  CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 6

(jl_DHogdh) pkg> st
      Status `/private/var/folders/1d/sbf5s24x6y306tgv_4hh06y00000gn/T/jl_DHogdh/Project.toml`
  [a93c6f00] DataFrames v1.1.1
  [bdcacae8] LoopVectorization v0.12.21
  [429524aa] Optim v1.3.0

I have put the code and the data into a repo GitHub - xiaodaigh/optim-slow-julia

I can still replicate the issue.

R finishes in about 60s but Julia takes forever. Doesn’t seem to be terminating.

Note that randomly generated data doesn’t trigger the issue, so it seems to be something specific to the data.


The kernel above should end with return -s — as written it returns the log-likelihood rather than its negative.

Just in case the question is not how to make this approach faster, but how to solve it: this is logistic regression, and one can find the optimum with our GLM package.


Is it? I just happened to use the logistic transform, though.

I need x = strength1 - strength2,
with logistic(x) being the probability that player 1 wins.

How can you formulate that as a GLM?
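
For the record, one way to set it up (my sketch, not code from the thread; it assumes player IDs run contiguously from 1 to k): this is the Bradley–Terry model, i.e. a logistic regression where each game is a Bernoulli trial with outcome 1 and the design matrix has +1 in the winner’s column and -1 in the loser’s column, so the linear predictor is exactly strength1 - strength2.

using GLM, SparseArrays

n = length(P1)               # number of games
k = maximum(vcat(P1, P2))    # number of players
X = sparse([1:n; 1:n], [P1; P2], [ones(n); -ones(n)], n, k)
y = ones(n)                  # the winner column "wins" every row

# Drop player 1's column: the strengths are only identified up to an
# additive constant, so one of them must be pinned (here at 0).
m = glm(Matrix(X[:, 2:end]), y, Binomial(), LogitLink())
strengths = [0.0; coef(m)]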

I believe this time difference is caused simply by different default iteration limits in R and Julia. That is, neither of them reaches a convergence criterion, and both only stop when they reach a pre-specified number of function calls.
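
One way to test this is to make the limits explicit and turn on the trace; Optim.Options is the actual API here, and R’s optim defaults to maxit = 100 for the derivative-based methods:

using Optim

@time opm = optimize(neg_log_lik, init_w, BFGS(),
                     Optim.Options(iterations = 100,   # match R's BFGS default
                                   show_trace = true))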


Yeah, note that the objective is invariant under adding the same constant to every player’s strength (all the differences w[p1] - w[p2] are unchanged), so convergence is an issue.

It will still converge, though.

On a side note, it’s a nice little puzzle to compute this function without running into accuracy problems and problems from overflow/underflow. Consider

julia> f(x) = log(1/(1+exp(-x)))

julia> f(100), Float64(f(big"100.0")) # spurious underflow: no correct digits
(0.0, -3.720075976020836e-44)

julia> f(-1000), Float64(f(big"-1000.0")) # spurious overflow
(-Inf, -1000.0)

julia> f(20), Float64(f(big"20.0"))  # only 8 correct digits
(-2.0611536942919273e-9, -2.061153620314381e-9)

One alternative is:

g(x) = x > 0 ? -log1p(exp(-x)) : x - log1p(exp(x))
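
A quick check of the rewritten version against the same BigFloat references (my check, not from the original post):

julia> g(100) ≈ Float64(f(big"100.0"))        # spurious underflow fixed
true

julia> g(-1000) == Float64(f(big"-1000.0"))   # spurious overflow fixed
true

julia> g(20) ≈ Float64(f(big"20.0"))          # full accuracy at moderate x
true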

Do you mean it converges according to your favorite criteria or that it actually stopped? Try turning on the trace.

I don’t need to:

@time opm = optimize(neg_log_lik, init_w, BFGS())

ping = opm.minimizer

@time opm = optimize(neg_log_lik, opm.minimizer, BFGS())

You can just run it twice like this and notice that the values don’t change after the second run. It converges; the additive constant is just not uniquely determined, that’s all. Anyway, it’s easy to fix: just set one player’s strength to 0. It converges either way, whether you pin someone to 0 or not.
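
A minimal sketch of that fix (my wrapper, not code from the thread): pin player 1’s strength at 0 and optimize over the remaining weights.

# player 1's strength is fixed at 0; everyone else is measured relative to it
neg_log_lik_pinned(w) = neg_log_lik(vcat(0.0, w))

@time opm = optimize(neg_log_lik_pinned, init_w[2:end], BFGS())
ping = vcat(0.0, opm.minimizer)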

When did this change occur? I can’t find anything in the LoopVectorization.jl README or manual about the new name, nor any issue or PR pertaining to it.

I followed a discussion on Zulip about it. Maybe @avx is still working though.
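
For reference, in recent LoopVectorization releases the loop macro was renamed from @avx to @turbo (with @avx kept as a deprecated alias). The kernel above adapted to that spelling (my adaptation; winner_ids and loser_ids are the winner/loser ID vectors from earlier):

using LoopVectorization

function neg_log_lik_turbo(w)
    s = 0.0
    @turbo for i in 1:length(winner_ids)
        x = w[winner_ids[i]] - w[loser_ids[i]]
        s += log(1 / (1 + exp(-x)))
    end
    return -s   # negative log-likelihood, per the return -s correction above
end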