Julia vs R vs Python: simple MLE problem

This is the draft of my new post (https://www.codementor.io/zhuojiadai/draft/gnqi4njro) on Julia vs R vs Python on a simple MLE problem. Unfortunately, Julia comes out slightly worse due to long compilation times.

Welcome suggestions on things to try and improve.

2 Likes

The formulas are not displaying correctly for me:

1 Like

17? That seems a little high. I just ran it 3 times:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.2 (2017-12-13 18:08 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-w64-mingw32

7.220850 seconds (9.44 M allocations: 505.088 MiB, 2.15% gc time)
0.000114 seconds (516 allocations: 23.469 KiB)
Julia has exited. Press Enter to start a new session.
7.285036 seconds (9.44 M allocations: 505.088 MiB, 2.17% gc time)
0.000107 seconds (516 allocations: 23.469 KiB)
Julia has exited. Press Enter to start a new session.
7.057593 seconds (9.44 M allocations: 505.076 MiB, 3.66% gc time)
0.000194 seconds (516 allocations: 23.469 KiB)


Also, the formulas aren’t displaying.

I wonder what message you are trying to convey with this post. Basically, for any trivial problem, compilation time ≫ runtime, so Julia is not going to do well. This is well known, but also pretty irrelevant, since for trivial problems you can use any language you already know and it won’t matter.

Write-ups for real-life projects would be much more informative, even if they are small: e.g. anything that runs over a minute (or, heaven forbid, 5 minutes) and has 200+ LOC.

In particular, Julia is very suitable for ML / MAP, since it has robust optimizers and automatic differentiation. But an interested user would not learn about this from your post.

9 Likes

Is Optim.jl using ForwardDiff by default here?

BFGS with forward mode AD would likely have been faster.
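For readers following along, a minimal sketch of what that might look like on the post’s truncated-normal likelihood (whether forward-mode AD differentiates cleanly through `logpdf` of a `Truncated` distribution depends on the Distributions/ForwardDiff versions in use):

```julia
using Distributions, Optim

# the post's hard-coded observations
odr = [0.10, 0.20, 0.15, 0.22, 0.15, 0.10, 0.08, 0.09, 0.12]
Q_t = quantile.(Normal(0, 1), odr)
maxx = maximum(Q_t)
negll(μσ) = -sum(logpdf.(Truncated(Normal(μσ[1], μσ[2]), -Inf, maxx), Q_t))

# without a method argument, optimize defaults to Nelder-Mead;
# this requests BFGS with ForwardDiff-generated gradients instead
res = optimize(negll, [0.0, 1.0], BFGS(); autodiff = :forward)
```

With exact gradients, BFGS typically needs far fewer objective evaluations than Nelder-Mead on a smooth two-parameter problem like this.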

I’m also having the same problem as @NaOH.

I think his point is that for very simple problems Julia is a bit annoying to use because of the compilation times, which seems like a fair assessment.

Only remark is that you maybe want to mention Optim.minimizer(mfit) to get the minimizer, which I think is the more robust interface.
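For anyone unfamiliar with that interface, a quick illustrative sketch (the toy fitting problem here is mine, not from the post):

```julia
using Distributions, Optim

# toy example: fit a Normal to simulated data by maximum likelihood
x = rand(Normal(1.0, 2.0), 1_000)
negll(μσ) = -sum(logpdf.(Normal(μσ[1], abs(μσ[2])), x))
mfit = optimize(negll, [0.0, 1.0])

est = Optim.minimizer(mfit)   # the [μ, σ] estimates
val = Optim.minimum(mfit)     # objective value at the minimizer
ok  = Optim.converged(mfit)   # convergence flag
```

The accessor functions work regardless of which algorithm produced the result, which is why they are the more robust interface than reaching into the result object’s fields.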

1 Like

It might be worth mentioning Revise.jl. This package allows you to modify code without the need to restart your Julia session and recompile everything anew. It solves a slightly different problem, but it is an annoying one that I have encountered quite a bit during development.
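The workflow is roughly the following (`MyPkg` here is a hypothetical package under development, so this is a sketch rather than runnable code):

```julia
using Revise          # load Revise before the code you want tracked
using MyPkg           # hypothetical package under development

MyPkg.fit_model()     # first call compiles as usual
# ...edit MyPkg's source files in your editor...
MyPkg.fit_model()     # Revise picks up the edits without a session
                      # restart; only changed methods are recompiled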

1 Like

Yes, but the way it’s written isn’t fair because it makes it sound like it’s harmful in any case other than running the same code once or twice:

If you just wish to run the code once or twice then Julia’s long compilation may be something you want to avoid.

That’s false. For one thing, compilation time is fixed. You don’t have to “solve a problem twice” to sidestep compilation time: if the calculation is long, compilation won’t cause any noticeable difference anyway. (I’ve encountered people who believe “the compilation run” is somehow slower: no, that’s not the case. All it does is compile first. There are no type instabilities in the first run or anything like that.)

Also, the problem is constrained to only the first call, even if a different optimization problem is solved. Here’s an example:

using Distributions, Optim

# hard-coded data/observations
odr = [0.10, 0.20, 0.15, 0.22, 0.15, 0.10, 0.08, 0.09, 0.12]
Q_t = quantile.(Normal(0, 1), odr)

# return a function that accepts [mu, sigma] as its parameter
function neglik_tn(Q_t)
    maxx = maximum(Q_t)
    f(μσ) = -sum(logpdf.(Truncated(Normal(μσ[1], μσ[2]), -Inf, maxx), Q_t))
    f
end

neglikfn = neglik_tn(Q_t)

# optimize! start searching
@time res = optimize(neglikfn, [mean(Q_t), std(Q_t)]) # 7 seconds
@time res = optimize(neglikfn, [mean(Q_t), std(Q_t)]) # 0.0001 seconds

# an identical but freshly defined closure: only the very first
# optimize call above paid the full compilation price
function neglik_tn2(Q_t)
    maxx = maximum(Q_t)
    f(μσ) = -sum(logpdf.(Truncated(Normal(μσ[1], μσ[2]), -Inf, maxx), Q_t))
    f
end
neglikfn2 = neglik_tn2(Q_t)
@time res = optimize(neglikfn2, [mean(Q_t), std(Q_t)]) # 0.06 seconds


A more accurate statement is:

If you wish to solve only a single optimization problem and that problem takes <10 seconds, then Julia’s long initial compilation is something you want to avoid. For long enough problems, or for solving multiple optimization problems, the compilation time is not noticeable.

Maybe explain via some examples. I would write it as:

If you want to solve a problem that takes 3000 seconds, the first run in a Julia session will take 3007 seconds, so total runtime matters far more than compilation. If you want to solve 100 ten-second optimization problems, the first will take 17 seconds and each subsequent call, free of compilation, roughly 10 seconds, for a total of about 1007 seconds; again, the individual problems matter more than the compilation time. If you want to solve a single 5-second optimization problem, it’ll take 12 seconds. So the compilation-time issue can be annoying for interactive use when you want to solve a single small problem, but it doesn’t affect performance-sensitive uses.

Julia v0.7 does have an interpreter, though, which promises to reduce compilation time in exchange for fewer optimizations, and this may be an interesting middle ground for the specific interactive use case. Additionally, tools like PackageCompiler.jl may become more common, limiting the “startup time” of package code that is noticed here.

So there is an issue, but IMO when you explain exactly what that issue is it becomes apparent that it’s due to how it’s being used, and whether that would be an issue for you is something for the reader to decide.

7 Likes

The “the code” in “If you just wish to run the code once or twice” clearly refers to “such a simple optimization problem”, not to any piece of code, but yes, the issue could maybe be explained in a bit more detail so people don’t get confused.

I have a function that does MLE for a user-specified likelihood function. One of the methods runs a simple example, which serves as documentation. In my .juliarc file, I call this simple example, so that ML estimation is warmed up and ready to use without compilation when I need it (except for the user-specified likelihood function).

The example method:

function mleresults()
    println("execute edit(mleresults,()) to examine the example code")
    x = randn(100, 3)
    β = rand(3)
    println("true betas: ")
    prettyprint(β)
    y = zeros(100)
    for i = 1:100
        y[i] = rand(Poisson(exp.(x[i, :]'β)))
    end
    model = β -> poisson(β, y, x)
    mleresults(model, β, "simple ML example");
end


the results:

julia> @time mleresults();
execute edit(mleresults,()) to examine the example code
true betas:
0.27935
0.55297
0.87107
************************************************************
simple ML example
MLE Estimation Results
BFGS convergence: Normal convergence
Average Log-L: -1.40224
Observations: 100

estimate     st. err      t-stat     p-value
1     0.36406     0.06912     5.26741     0.00000
2     0.65008     0.06001    10.83261     0.00000
3     0.87825     0.06666    13.17476     0.00000

Information Criteria
Crit.      Crit/n
CAIC    297.26347     2.97263
BIC    294.26347     2.94263
AIC    286.44796     2.86448
************************************************************
0.001996 seconds (1.67 k allocations: 490.592 KiB)

julia>


Notice that the time to run this is essentially zero! This is in https://github.com/mcreel/Econometrics.jl, by the way.

1 Like

I tried to do that about half a year ago or so, and didn’t quite figure it out. Maybe once 1.0 arrives I’ll make the effort to learn how to do it. But putting commonly used stuff in .juliarc with toy usages is a convenient and easy way to have things ready for interactive use.
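A minimal version of that .juliarc.jl trick might look like this (the toy likelihood here is illustrative, not the one from the package above):

```julia
# in ~/.juliarc.jl (Julia 0.6): warm up Optim and Distributions at
# startup by solving a tiny throwaway MLE problem
using Distributions, Optim

warmup = let x = randn(50)
    optimize(μσ -> -sum(logpdf.(Normal(μσ[1], abs(μσ[2])), x)),
             [0.0, 1.0])
end
```

After this runs at startup, subsequent interactive `optimize` calls on similar objectives skip most of the compilation hit.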

1 Like

I couldn’t have put it better myself, so I’ve quoted you wholesale.

On my computer it’s down to 7.5 seconds now; perhaps I had something running in the background. On the work computer it is still 12 seconds, though.

You really need to fix the MathJax issues there (the LaTeX isn’t rendering) before one can really read your writing.

What are the timings for each?

It’s fixed now. I am pretty sure it was just a bug on the codementor.io platform where they don’t display drafts properly.

Compile times. I don’t really think we (Optim.jl) can do so much about that currently. If we can, I’d be happy to help make it faster.

The real kicker is the following: your favorite computer scientist comes up with a new, cool way to do arbitrary precision really, really fast. What framework is most likely to work with this new number type (Julia already has several implemented)? Nelder-Mead in R, Python, or Julia? And if none of them supports it, who will have the easiest time patching their package?

Now I know that a lot of people use what people might know as double precision numbers, but the flexibility of Julia is really not present in either R or Python to the extent that I know those languages/frameworks.

Let me also frame the compile time in another light. How long did you spend from thinking about writing the blog post until you actually pressed enter on “optimize(…)”? A bit longer than 10 seconds, I guess. It is of course a problem if you’re running 1000 different optimization problems, say for variations of a model, but then just don’t run your “loop over models” outside of Julia. Instead of running 1000 scripts, run one script that runs 1000 models.
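That “one script that runs 1000 models” suggestion can be sketched as follows (sizes reduced, and the model is a stand-in; only the first `optimize` call in the session pays the compilation cost):

```julia
using Distributions, Optim

# fit the same model to 100 simulated datasets within one session:
# compilation happens once, on iteration 1, then every later fit
# runs at full speed
fits = map(1:100) do i
    x = randn(200)
    optimize(μσ -> -sum(logpdf.(Normal(μσ[1], abs(μσ[2])), x)),
             [0.0, 1.0])
end
```

Timing this loop with `@time` shows the per-fit cost quickly dropping to well under the first iteration’s total.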

Edit:

I hope I didn’t come off as too defensive btw. I really do appreciate you spreading the word about Optim. I can survive the compile times, but truth be told we could always use more eyes and more feedback!

7 Likes

I have been thinking about whether I would prefer to use Julia or R in this instance. For work, I have written code to solve a much more complicated version of the problem in R. That’s before I started using Julia seriously.

Now that I think about it, I would have preferred Julia over R for that problem. I might add that to the post once I find time to finish the Fannie Mae data analysis.