It’s long been on my list of “problems with simple bounds” methods to implement, so I will take you up on this comment and ask for a review soon. NLSolvers.jl already has something from that family of algorithms (DF-SANE), which depends on the Grippo-Lampariello-Lucidi-style line search and spectral gradient “updates”, so it’ll be a very welcome addition.
NLopt is certainly an option; I will try it and report back. My options were very limited when I started with the Python implementation of my model. But today I discovered that NLopt also has a Python API.
I just upgraded my Julia from 1.5.4 to 1.6.1, after “julia: init julia_10-bin and julia_16-bin by ninjin · Pull Request #123188 · NixOS/nixpkgs · GitHub” was merged. To my surprise, I observed a ~2x speedup in general:
- Vanilla L-BFGS using finite differences: 6.744 s → 3.188 s
- Single function evaluation: 27.658 μs → 13.088 μs
- L-BFGS using finite differences, with the same params as scipy.optimize except that it uses the Hager-Zhang line search: 700.513 ms → 367.378 ms
- L-BFGS using `autodiff = :forward`, with the same params as scipy.optimize except that it uses the Hager-Zhang line search: 485.330 ms → 309.725 ms
- @PharmCat’s Newton method with sigmoid: 144.142 ms (on 1.6.1)
- SPGBox: 868.441 μs (on 1.6.1)
- scipy.optimize using PyCall: 14.093 ms → 10.520 ms (on 1.6.1)
As you noticed, I also benchmarked SPGBox (thank you @leandromartinez98 for building this package): it did only 4 function evaluations and reached the desired result. The code is at climate_stress_test_benchmark/pharmcat_v2_g.jl at main · rht/climate_stress_test_benchmark · GitHub.
SPGBOX RESULT:
Convergence achieved.
Final objective function value = 6.263329329419884
Best solution found = [ 0.0, 1.0, 1.0, ..., 1.0]
Projected gradient norm = 0.0
Number of iterations = 3
Number of function evaluations = 4
I still need to understand why SPGBox is so effective and efficient, and I think I should generate more test cases to check that it is robust across parameter variations.
I also want to understand why 1.6.1 is 2x faster than 1.5.4. I thought the main focus of the improvements was on precompilation time?
Additionally, I also made a C++ version, and observed that its single function evaluation (75 μs) is slower than the Julia version’s (13.088 μs). I’m not sure why. The C++ version can be found at climate_stress_test_benchmark/climate_stress_test_simplified.cpp at main · rht/climate_stress_test_benchmark · GitHub.
I would not expect any miraculous behaviour there, except that I can guarantee that calls to spgbox are light on memory and can be completely allocation-free if you preallocate the auxiliary vectors. Otherwise the method is good, but there is no fundamental reason for it to be that much better than all the other alternatives.
@rht and others: since SPGBox raised some interest, I have updated it to follow the conventions concerning the solver call (functions come first), and the upper and lower bounds are now set by `lower` and `upper` arrays.
This will be released in a few minutes as version 0.2.0 of the package, which is a breaking change in the API; thus, the tests must be updated. The example I posted above already follows the new interface.
I think this is indeed because you seem to have many variables at the boundary (let me actually ask, are they ever interior?).
For the simplified model I posted on my GitHub repo, the values are most of the time either 0 or 1, but this is a 1-firm model. When the number of firms is 3 or higher, the transition from 0 to 1 becomes more gradual. Additionally, in a more intricate version, it is not always the case that the variable starts at 0 and ends at 1. We intend to do at least 50 firms, but can’t, as it took us 7 hours (parallelized on a 64-core AWS instance) just to run 4 firms with a DeltaT of 30 years, using Python’s scipy.optimize.
One of the original SPG developers just told me: “You could have told him that in principle the behavior of SPG and L-BFGS-B should be very similar, something we have demonstrated in this paper. Of course, for particular problems that has to be tested.”
Just reinforcing that we should not expect such a huge difference between the methods, in general.
I think the efficacy of an optimization algorithm can be very different depending on whether the function behaves like a binary predictor in its arguments or varies smoothly. As you can see, when you use a sigmoid link the line search takes fewer iterations (maybe because the function is more predictable in this case), so maybe it would make more sense to run the comparison on data where the behavior is near the robust condition.
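For readers unfamiliar with the trick: here is a minimal sketch of the sigmoid reparameterization idea (my own illustration; @PharmCat’s actual implementation may differ), which turns a box constraint into an unconstrained problem:

```julia
# Sigmoid reparameterization: optimize over an unconstrained y ∈ ℝ,
# mapped into the box (0, 1) through the logistic function.
sigmoid(y) = 1 / (1 + exp(-y))
logit(x) = log(x / (1 - x))   # inverse map, for picking a start inside the box

# Toy box-constrained objective on x ∈ (0, 1):
f(x) = (x - 0.25)^2
# The unconstrained surrogate that an inner solver (e.g. Newton) would see:
g(y) = f(sigmoid(y))

y0 = logit(0.5)   # start the unconstrained solver at x = 0.5
```

The smooth, bounded shape of the sigmoid is plausibly why the line search behaves more predictably than on the raw, near-binary objective.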
I will soon prepare a version of the model that doesn’t have a binary optimum condition.
Update: I have migrated my model’s entire code to Julia, using PyCall for scipy.optimize(method='L-BFGS-B'). However, I ran into a problem using `Threads.@threads`, because calling Python from Julia is not thread-safe.
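One common workaround (my own sketch, not something from the thread or from PyCall itself) is to serialize all calls into the non-thread-safe library behind a single lock; this only pays off when the Julia side of the loop dominates the runtime. `expensive_py_call` here is a hypothetical stand-in for the real PyCall invocation:

```julia
# Guard every call into a non-thread-safe library with one ReentrantLock,
# so that at most one task is inside the library at a time.
const PY_LOCK = ReentrantLock()

expensive_py_call(x) = x^2   # stand-in for the real PyCall invocation

function locked_call(x)
    lock(PY_LOCK) do
        expensive_py_call(x)
    end
end

results = zeros(8)
Threads.@threads for i in 1:8
    results[i] = locked_call(Float64(i))
end
```

The obvious downside is that the Python calls no longer run in parallel, so this helps only if each thread spends most of its time in pure-Julia work between calls.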
I have also discovered GitHub - Gnimuc/LBFGSB.jl: Julia wrapper for L-BFGS-B Nonlinear Optimization Code, a Julia wrapper around the L-BFGS-B Fortran code. So finally we can do apple-to-apple comparisons with scipy.optimize and SPGBox. In my new benchmark, I have also provided a grad function that uses the finite differences provided by Optim.jl, i.e.
promote_objtype(x, f) = Optim.OnceDifferentiable(f, x, real(zero(eltype(x))))
d = promote_objtype(AA.xs0, fn)
function fd_g!(z, x)
    z .= Optim.gradient!(d, x)
end
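For comparison, the same kind of finite-difference gradient can be written by hand with central differences (a generic sketch, independent of Optim.jl’s internals; Optim uses its own step-size heuristics):

```julia
# Central differences: z[i] ≈ (f(x + h·eᵢ) - f(x - h·eᵢ)) / (2h).
function central_fd_g!(z, x, f; h = cbrt(eps(eltype(x))))
    xp = copy(x)
    for i in eachindex(x)
        xi = x[i]
        xp[i] = xi + h; fp = f(xp)
        xp[i] = xi - h; fm = f(xp)
        xp[i] = xi                  # restore the coordinate before moving on
        z[i] = (fp - fm) / (2h)
    end
    return z
end
```

Note the step size: `cbrt(eps())` is the usual choice for central differences, whereas forward differences typically use `sqrt(eps())`.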
EDIT: this result is FLAWED. SPGBox mutated `x0` into its optimized version, so the subsequent `@btime` runs of the various methods finished prematurely because `x0` was already optimized. The updated result is at Optim.jl vs scipy.optimize once again - #52 by rht.
New result summary:
- CG with Fminbox is the slowest.
- SPGBox and LBFGSB.jl are almost the same, for both ForwardDiff and finite-difference gradients. This is consistent with the paper linked in Optim.jl vs scipy.optimize once again - #46 by leandromartinez98.
- Finite differences are faster than ForwardDiff – how??
- SPGBox’s and LBFGSB.jl’s times with finite differences are almost the same as a single function evaluation – how?? I have made sure to run them in isolation, and they still return the intended result.
- scipy.optimize(method='L-BFGS-B') (in Julia using PyCall) is 122x slower than LBFGSB.jl – I think I need a separate benchmark using a Rosenbrock function just to be sure.
Log (note: the 6.2633… values are just the minima, shown to confirm that the methods agree):
Benchmarking single function execution
13.615 μs (1 allocation: 160 bytes)
Benchmarking PharmCat v2 (CG) using grad specified by ForwardDiff
407.140 ms (18888 allocations: 3.41 MiB)
6.263329327214707
Benchmarking SPGBox using grad specified by ForwardDiff
880.418 μs (44 allocations: 9.30 KiB)
6.263329329419884
Benchmarking PharmCat v2 (CG) using finite difference grad
398.354 ms (34072 allocations: 4.52 MiB)
6.263329318909087
Benchmarking SPGBox using finite difference grad
15.633 μs (14 allocations: 2.22 KiB)
6.263329329419884
Benchmarking LBFGSB.jl using grad specified by ForwardDiff
884.683 μs (35 allocations: 7.80 KiB)
6.263329329419884
Benchmarking LBFGSB.jl using finite difference grad
18.015 μs (5 allocations: 784 bytes)
6.263329329419884
Benchmarking scipy.optimize(method='L-BFGS-B') which uses finite difference grad
2.203 ms (1757 allocations: 84.33 KiB)
6.263329329419884
Code to reproduce: climate_stress_test_benchmark/pharmcat_v2_g.jl at 0c4b266b111218f826bd75fdd1938ab13eeb9806 · rht/climate_stress_test_benchmark · GitHub.
I forgot to mention that I have made sure that the parameters I specified for LBFGSB.jl match the default parameters used in scipy.optimize(method='L-BFGS-B').
Be careful in these benchmarks not to provide the solution as the initial point. At least in the case of SPGBox, I know that it mutates the initial point into the solution. Thus, to benchmark it properly, you need to reset it, with something like:
x0 = something
@btime spgbox!($fn, $g!, $lower, $upper, x) setup=(x=copy(x0)) evals=1
edit: now (v0.3.1) there is also a non-mutating version of the solver, which can be run with:
@btime spgbox($fn, $g!, $lower, $upper, $x0)
(without the `!`), but for that it (deep)copies `x` on input; thus, for a non-allocating alternative, use the first one with auxiliary array preallocation.
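This failure mode is easy to reproduce without SPGBox at all. Here is a toy mutating solver (`solve!` is entirely hypothetical) showing why a fresh copy per run matters:

```julia
# A toy in-place "solver" that walks x[1] toward `target`, mutating the
# caller's array, and returns the number of iterations it took.
function solve!(x; target = 1.0, step = 0.5, tol = 1e-8)
    iters = 0
    while abs(x[1] - target) > tol
        x[1] += step * (target - x[1])
        iters += 1
    end
    return iters
end

x_init = [0.0]
a = solve!(copy(x_init))   # fresh copy per run, like setup=(x=copy(x0))
b = solve!(copy(x_init))   # same cost again: a == b
c = solve!(x_init)         # mutates x_init into the solution
d = solve!(x_init)         # finishes immediately — the benchmark lie
```

Without the copy, every run after the first starts at the solution and measures almost nothing.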
This is the benchmark result for the Rosenbrock function, based on the README.md in GitHub - Gnimuc/LBFGSB.jl: Julia wrapper for L-BFGS-B Nonlinear Optimization Code.
I have made sure to run LBFGSB.jl first, and to ensure that `x0` is a `const`.
The result is that LBFGSB.jl is 67x faster.
Benchmarking single function execution
34.225 ns (0 allocations: 0 bytes)
Benchmarking LBFGSB.jl
444.786 μs (133 allocations: 8.28 KiB)
6.689900791974536e10
Benchmarking scipy.optimize(method='L-BFGS-B')
29.816 ms (30116 allocations: 1.33 MiB)
6.677651771226814e10
Code to reproduce: climate_stress_test_benchmark/rosenbrock.jl at 6920133a8f387bc3378c43b6f7d880f28552435a · rht/climate_stress_test_benchmark · GitHub.
Indeed, SPGBox mutated `AA.xs0`! I have checked that LBFGSB.jl doesn’t mutate `x0`.
- ForwardDiff and finite differences take almost the same time, consistently across optimization methods.
- SPGBox is 40% faster than LBFGSB.jl, even with `copy(xs0)` inside the `@btime` loop (I use `copy(xs0)` so that the code is simpler).
- LBFGSB.jl is about 2x faster than scipy.optimize(method='L-BFGS-B').
Log:
Benchmarking single function execution
12.977 μs (1 allocation: 160 bytes)
Benchmarking PharmCat v2 (CG) using grad specified by ForwardDiff
383.874 ms (18888 allocations: 3.41 MiB)
6.263329327214707
Benchmarking SPGBox using grad specified by ForwardDiff
3.363 ms (138 allocations: 31.19 KiB)
6.263329329419884
Benchmarking PharmCat v2 (CG) using finite difference grad
388.767 ms (34072 allocations: 4.52 MiB)
6.26332930693809
Benchmarking SPGBox using finite difference grad
3.330 ms (270 allocations: 40.83 KiB)
6.263329329419884
Benchmarking LBFGSB.jl using grad specified by ForwardDiff
5.198 ms (203 allocations: 44.95 KiB)
6.263329329419884
Benchmarking LBFGSB.jl using finite difference grad
5.105 ms (401 allocations: 59.48 KiB)
6.263329329419884
Benchmarking scipy.optimize(method='L-BFGS-B') which uses finite difference grad
10.605 ms (9197 allocations: 447.61 KiB)
6.263329329419884
Fixed in [BUGFIX] Prevent SPGBox from mutating xs0 · rht/climate_stress_test_benchmark@caea4dd · GitHub.
`const` does not mean what you think it means. It only means that the binding (and its type) can’t change; it does not prevent mutating the contents of mutable containers (it is a “true” const for immutable structs, though that again doesn’t extend to mutable containers within the struct). You will still need the `setup` part.
help?> const
search: const isconst MathConstants contains codeunits ncodeunits countlines count_ones
[...]
Note that "constantness" does not extend into mutable containers; only the association
between a variable and its value is constant. If x is an array or dictionary (for example)
you can still modify, add, or remove elements.
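A two-line demonstration of that help entry (my own toy example, not from the thread):

```julia
# `const` pins the binding, not the contents: mutating the array is legal,
# which is exactly why a benchmark can still see an already-optimized x0.
const XS0 = [1.0, 2.0]
XS0[1] = 10.0        # allowed: contents of a mutable container may change
# XS0 = [3.0, 4.0]   # NOT allowed: rebinding a const is an error
```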
I see. I have additionally checked `x0` after a single optimization for both LBFGSB.jl and scipy.optimize, and in both cases `x0` remains unchanged.
I was reading the thread and realized how small the academic world is. He was my first programming teacher in university; his class was so good.