Newbie Question: Julia optimization multiprocessing

Jason · January 25, 2017, 6:38pm

Hi, I’m trying to figure out whether Julia is a good language for a project I’m undertaking. One thing I’ll need to do is optimize a complex statistical function (Hidden Markov / CRF related) involving potentially 12K parameters. I’m thinking that L-BFGS would be a good optimization method.

I’m wondering whether Julia’s implementation of this is set up to do optimization with multiple cores and perhaps distributed over multiple computers? I imagine that it shouldn’t be hard to run several separate optimizations simultaneously (tasks?)? Any idea if optimization speeds in Julia approach those of C? (Then again, I hear setting up multi-processing in C is quite involved. Am trying to decide between Java/Scala, C, and Julia; am somewhat familiar with Java).

Jason

johnmyleswhite · January 25, 2017, 7:41pm

The L-BFGS implementation in Optim.jl is not parallelized, but that doesn’t seem like it would be important for your use case, since it’s usually the objective function and its derivatives that need to be sped up, not the L-BFGS internal steps. So I’d guess the performance questions depend on whether you feel confident in your ability to implement those functions as efficiently as you would in C.
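
For concreteness, here is a minimal sketch of what calling L-BFGS through Optim.jl can look like. The `rosenbrock` function is only a hypothetical stand-in for the real objective (which would be the model's negative log-likelihood), and `autodiff = :forward` asks Optim to build the gradient with forward-mode automatic differentiation rather than finite differences.

```julia
using Optim

# Hypothetical stand-in objective (2-D Rosenbrock); a real model would
# return its negative log-likelihood here instead.
rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

x0 = [-1.2, 1.0]                       # starting point

# autodiff = :forward requests a ForwardDiff-based gradient.
res = optimize(rosenbrock, x0, LBFGS(); autodiff = :forward)

println("converged: ", Optim.converged(res))
println("minimizer: ", Optim.minimizer(res))
```

Note that the cost of forward-mode differentiation grows with the number of parameters, so with thousands of parameters a hand-written or reverse-mode gradient is likely to be cheaper. Either way, nearly all of the run time is spent inside the objective and its gradient.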

Jason · January 25, 2017, 11:05pm

Thanks for the reply!

I’m currently running L-BFGS in Stan, which compiles as C++ and also runs unparallelized. The HMM code in Stan uses fairly standard efficient algorithms. If there is a closed-form derivative for this, I imagine it would be quite ugly.

Given what I’m seeing in Stan on artificial data with relatively few parameters, I’m guessing it could take days to optimize the parameters with a sufficient quantity of real data and a realistic number of hidden states and emissions. Thus my interest in parallelization.

Such analysis would, however, require several random starts to help reduce the chance that I’m at a local optimum (or to test alternative values for the number of hidden states), so I guess I can save time by simply running separate optimizations in parallel (which is possible?)–if this doesn’t require too much memory. There would be advantages to being able to run one analysis in parallel, but probably not critical ones.

Incidentally, is it straightforward to get an estimated Hessian for the L-BFGS optimized parameters in Julia? I didn’t notice an option for this.

Thanks again, Jason

johnmyleswhite · January 25, 2017, 11:29pm

You can usually get exact Hessians using the ForwardDiff library and calling the hessian function. That’s what I do to estimate observed Fisher information in my work.
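
As a rough illustration (not the code from my own work), the sketch below computes the Hessian of a small Gaussian negative log-likelihood at its maximum-likelihood estimate and turns it into standard errors. The data `y`, the function `nll`, and the parameterization `θ = [μ, log σ]` are all made up for the example.

```julia
using ForwardDiff, LinearAlgebra, Statistics

# Made-up data and a hypothetical model: y ~ Normal(μ, σ), with θ = [μ, log σ].
y = [1.2, 0.7, 2.3, 1.9, 1.4]
nll(θ) = length(y) * θ[2] + sum(abs2, y .- θ[1]) / (2 * exp(2 * θ[2]))

# The MLE is available in closed form here; in practice it would come
# from the optimizer (e.g. Optim.minimizer(res)).
mu_hat    = mean(y)
sigma_hat = sqrt(mean(abs2, y .- mu_hat))
theta_hat = [mu_hat, log(sigma_hat)]

# Exact Hessian of the negative log-likelihood = observed Fisher information.
H = ForwardDiff.hessian(nll, theta_hat)

# Approximate standard errors from the inverse observed information.
se = sqrt.(diag(inv(H)))
println(H)
println(se)
```

For this toy model the result can be checked by hand: the Hessian at the MLE is diagonal, with entries n/σ̂² and 2n.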

Jason · January 26, 2017, 4:53am

Incidentally, I found a few potential solutions for obtaining nicely
parallelized L-BFGS code from within Julia.

One might be to use Julia’s ability to call Python (which looks really
good) and then have Python run Apache Spark, which has well parallelized
L-BFGS.

Another possibility is to utilize the TAO component of the C++ package
PETSc, which can utilize GPUs, multiple cores, and multiple machines.
Would likely have to write a C wrapper though and hope there are no hiccups
with data transfer.

And, of course, there’s the possibility of adding some multi-core
parallelization to Julia itself.

ChrisRackauckas · January 26, 2017, 5:45am

Julia already has it. See the docs.
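
For the independent random starts mentioned earlier in the thread, a minimal sketch using the `Distributed` standard library and `pmap` might look like the following. The worker count, the `objective` function (a 2-D Rosenbrock stand-in), the helper `run_start`, and the range of the random starting points are all assumptions made for illustration.

```julia
using Distributed, Random
addprocs(2)                     # assumption: two local worker processes

@everywhere using Optim

@everywhere begin
    # Hypothetical stand-in objective; replace with the real negative log-likelihood.
    objective(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

    # Run one L-BFGS optimization from the starting point x0 and
    # return only the pieces that are needed afterwards.
    function run_start(x0)
        res = optimize(objective, x0, LBFGS(); autodiff = :forward)
        return (minimum = Optim.minimum(res), minimizer = Optim.minimizer(res))
    end
end

# Eight random starting points in [-2, 2]^2, with a fixed seed for repeatability.
rng = MersenneTwister(1)
starts = [4 .* rand(rng, 2) .- 2 for _ in 1:8]

# Each start is optimized on whichever worker is free.
results = pmap(run_start, starts)

# Keep the run with the lowest objective value.
best = results[argmin([r.minimum for r in results])]
println(best)

rmprocs(workers())              # shut the workers down again
```

Each worker holds its own copy of the objective and its data, so memory use grows with the number of workers. `addprocs` can also start workers on other machines.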

ChrisRackauckas · January 26, 2017, 5:49am

See PETSc.jl

ChrisRackauckas · January 26, 2017, 5:53am

Most likely, as @johnmyleswhite noted, the speedups will come not from parallelizing L-BFGS itself, but parallelizing either your objective function calculation, or giving it faster (parallel) Jacobian or Hessian functions. Such derivatives can be computed very easily using ForwardDiff, and ForwardDiff has multithreading built in (though not multiprocessing: you’d have to set that up using Julia’s parallelism features). Use ProfileView.jl to find out where the actual bottleneck is first, otherwise it’s hard to know what the actual problem is.
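
To illustrate parallelizing the objective itself: if the total negative log-likelihood is a sum over independent sequences, each term can be evaluated in its own task. This is only a sketch. `seq_nll` is a hypothetical per-sequence term (a simple quadratic here, not an HMM forward pass) and `seqs` is made-up data. Julia must be started with several threads (e.g. `julia -t 4`) to see any speedup.

```julia
using Base.Threads

# Hypothetical per-sequence negative log-likelihood; a real HMM would run
# its forward algorithm on `seq` here.
seq_nll(θ, seq) = sum(abs2, seq .- θ[1]) / (2 * exp(2 * θ[2])) + length(seq) * θ[2]

# Serial reference implementation.
total_nll_serial(θ, seqs) = sum(seq_nll(θ, s) for s in seqs)

# Threaded version: one task per sequence, results summed at the end.
function total_nll_threaded(θ, seqs)
    tasks = [Threads.@spawn seq_nll(θ, s) for s in seqs]
    return sum(fetch, tasks)
end

seqs = [randn(1_000) for _ in 1:16]    # made-up data
θ = [0.1, 0.0]

println(total_nll_serial(θ, seqs))
println(total_nll_threaded(θ, seqs))
```

With many short sequences, grouping them into a few chunks per thread would reduce task overhead. Profiling first, as suggested above, shows whether the per-sequence work is really where the time goes.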

Jason · January 26, 2017, 6:34am

And apparently there’s a Spark.jl as well–perfect.

Jason · January 27, 2017, 1:11am

Thanks Chris, this is all very helpful!

bashonubuntu · October 30, 2019, 1:12am

Hello,

I was searching for a thread like this and thought I might ask this here, as it seems most relevant to the above discussion. Is it possible to pass a parallel function, i.e. a function being evaluated using pmap or @distributed, to ForwardDiff.hessian()? For example, would it make sense to implement

ForwardDiff.hessian(x -> my_parallel_function(x), evaluate_hessian_at, cfg, Val{false}())

where cfg = ForwardDiff.HessianConfig(x -> my_parallel_function(x), evaluate_hessian_at, Chunk{n}()); n <= dim(x)?
