For me (on a MacBook Pro), the original code takes (first run with compilation, then a warm second run):
28.235802 seconds (12.17 M allocations: 23.930 GiB, 7.51% gc time, 11.78% compilation time)
23.435813 seconds (1.57 M allocations: 23.396 GiB, 9.50% gc time)
Improving the kernel (avoiding allocations and using @vectorize from LoopVectorization.jl), i.e.
using LoopVectorization
using Optim  # provides optimize and BFGS

function neg_log_lik_optim(init_w)
    s = 0.0
    @vectorize for i in eachindex(winner_ids)
        a = winner_ids[i]
        b = loser_ids[i]
        x = init_w[a] - init_w[b]
        s += log1p(exp(-x))  # equals -log σ(x); avoids computing 1/(1+exp(-x))
    end
    return s  # the negative log-likelihood, matching the function's name
end

@time opm = optimize(neg_log_lik_optim, init_w, BFGS())
I get
19.492552 seconds (14.23 M allocations: 814.709 MiB, 1.25% gc time, 51.31% compilation time)
9.360662 seconds (1.24 k allocations: 5.054 MiB)
(I don’t know any R so I can’t yet compare to it.)
Just in case the question is not how to make this approach faster, but how to solve the problem: this is logistic regression, and one can find the optimum with our GLM package.
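A hedged sketch of how that could look: the Bradley–Terry fit can be cast as logistic regression by giving each match a row with +1 for the winner, −1 for the loser, and a response of 1. The names winner_ids, loser_ids, and the player count n are assumed from the earlier code; dropping one column pins that player's strength at 0.

```julia
# Sketch (untested): Bradley–Terry as logistic regression with GLM.jl.
# Assumes winner_ids, loser_ids, and the number of players n from the code above.
using GLM

m = length(winner_ids)
X = zeros(m, n)
for i in 1:m
    X[i, winner_ids[i]] += 1.0   # winner gets +1
    X[i, loser_ids[i]]  -= 1.0   # loser gets -1
end
y = ones(m)                      # every row records a win for the +1 player

# Drop the first column so player 1's strength is fixed at 0 (identifiability):
model = glm(X[:, 2:end], y, Binomial(), LogitLink())
coef(model)                      # strengths of players 2..n relative to player 1
```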
I believe this time difference is caused simply by different default iteration limits in R and Julia. That is, neither of them reaches a convergence criterion; both stop only when they hit a pre-specified number of function calls.
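If that is the cause, the cap can be raised explicitly in Optim.jl via Optim.Options and convergence checked afterwards; the limit and tolerance below are illustrative values, not recommendations.

```julia
using Optim

# Raise the iteration cap explicitly (1_000 and 1e-8 are arbitrary examples)
opm = optimize(neg_log_lik_optim, init_w, BFGS(),
               Optim.Options(iterations = 1_000, g_tol = 1e-8))

Optim.converged(opm)  # check whether a convergence criterion was actually met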
On a side note, it’s a nice little puzzle to compute this function without running into accuracy problems and problems from overflow/underflow. Consider what happens to exp(-x) when x is a large negative number.
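One standard resolution, sketched here: branch on the sign of x so that the exp argument is always non-positive, and use log1p for accuracy near zero.

```julia
# Naive vs. stable log-sigmoid. For large negative x, exp(-x) overflows to Inf
# and the naive form degenerates; the branched form stays finite and accurate.
naive_logsigmoid(x)  = log(1 / (1 + exp(-x)))
stable_logsigmoid(x) = x < 0 ? x - log1p(exp(x)) : -log1p(exp(-x))

naive_logsigmoid(-800.0)   # -Inf: exp(800) overflows
stable_logsigmoid(-800.0)  # -800.0, the mathematically correct value
```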
You can just run it twice and notice that the values don’t change after the second run. It converges; the additive constant is simply not uniquely determined. Anyway, that’s easy to fix: set one player’s strength to 0. It converges either way, whether you pin someone to 0 or not.
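To see the invariance concretely: the likelihood depends only on differences of strengths, so adding any constant to every entry of the vector changes nothing (a small illustrative check):

```julia
# The likelihood sees only w[a] - w[b], so shifting all strengths by a
# constant c cancels out; pinning w[1] = 0 removes this one-dimensional
# family of equivalent optima.
w = [0.3, -1.2, 0.7]
c = 5.0
x  = w[1] - w[2]
xc = (w[1] + c) - (w[2] + c)
xc ≈ x   # the shift cancels (up to floating-point rounding)
```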
When did this change occur? I can’t find anything in the LoopVectorization.jl README or manual about the new name, nor any issue or PR pertaining to it.