Latest recommendations for global optimization

What’s the run length measured in? Function evaluations or wall clock time?

1 Like

Number of function evaluations. In the original benchmark they go up to 1e7, but that already took something like 20 minutes to run.

That’s very interesting. Could you please share the benchmarking code?

Did you use the fastpowers branch of IntervalArithmetic.jl? That should give a huge speedup.

My script is here (it needs master, my PR of IntervalOptimisation, and a couple of other packages, like BlackBoxOptim.jl).

I just tried the fastpowers branch but I’m getting this:

julia> minf, minx = minimise( x->sum(abs2.(x)), IntervalBox(-5.5..5.5, 3) )
ERROR: MethodError: no method matching +(::Interval{Float64}, ::RoundingMode{:Down})
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:502
  +(::PyObject, ::Any) at /Users/jbieler/.julia/packages/PyCall/ttONZ/src/pyoperators.jl:13
  +(::Interval) at /Users/jbieler/.julia/packages/IntervalArithmetic/19YeJ/src/intervals/arithmetic.jl:65
  ...
Stacktrace:
 [1] +(::Interval{Float64}, ::Interval{Float64}, ::RoundingMode{:Down}) at ./operators.jl:502
 [2] +(::Interval{Interval{Float64}}, ::Interval{Interval{Float64}}) at /Users/jbieler/.julia/packages/IntervalArithmetic/19YeJ/src/intervals/arithmetic.jl:85
 [3] macro expansion at /Users/jbieler/.julia/packages/StaticArrays/1g9bq/src/mapreduce.jl:103 [inlined]
 [4] _mapreduce at /Users/jbieler/.julia/packages/StaticArrays/1g9bq/src/mapreduce.jl:95 [inlined]
 [5] _reduce at /Users/jbieler/.julia/packages/StaticArrays/1g9bq/src/mapreduce.jl:204 [inlined] (repeats 2 times)
 [6] #sum#275 at /Users/jbieler/.julia/packages/StaticArrays/1g9bq/src/mapreduce.jl:229 [inlined]
 [7] sum(::StaticArrays.SArray{Tuple{3},Interval{Interval{Float64}},1,3}) at /Users/jbieler/.julia/packages/StaticArrays/1g9bq/src/mapreduce.jl:229
 [8] (::getfield(Main, Symbol("##21#22")))(::StaticArrays.SArray{Tuple{3},Interval{Float64},1,3}) at ./REPL[9]:1
 [9] #minimise#1(::Type, ::Float64, ::Float64, ::Function, ::getfield(Main, Symbol("##21#22")), ::IntervalBox{3,Float64}) at /Users/jbieler/.julia/dev/IntervalOptimisation/src/optimise.jl:41
 [10] minimise(::Function, ::IntervalBox{3,Float64}) at /Users/jbieler/.julia/dev/IntervalOptimisation/src/optimise.jl:22
 [11] top-level scope at none:0
1 Like

Sorry, make that the nonrecursive_powers branch then.

That branch is almost 10x faster indeed. I went a bit further with the number of iterations and IntervalOptimisation does very well.

[Plots: mean_succ_3 and mean_succ_6 — benchmark results for D = 3 and D = 6]

1 Like

Wall-clock time would arguably be more interesting than number of function evaluations for the end user?

1 Like

Wall-clock time depends on the error function (here they are very fast compared to real-world cases), so it’s better to count the number of function evaluations and to measure the overhead of the algorithm separately.

For the methodology I’m mostly following: https://coco.gforge.inria.fr/COCOdoc/bbo_experiment.html
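To make the f-call counting concrete, it can be done with a small wrapper around the objective; this is just a sketch with made-up names, not the code the benchmark actually uses:

# Wrapper that counts how many times the objective is evaluated.
mutable struct CountedFunction{F}
    f::F
    calls::Int
end
CountedFunction(f) = CountedFunction(f, 0)

# Calling the wrapper increments the counter and forwards to the wrapped objective.
(cf::CountedFunction)(x) = (cf.calls += 1; cf.f(x))

rosenbrock(x) = sum(100 .* (x[2:end] .- x[1:end-1].^2).^2 .+ (1 .- x[1:end-1]).^2)

cf = CountedFunction(rosenbrock)
cf(rand(3)); cf(rand(3))
cf.calls   # == 2, independent of how long each call took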

3 Likes

Yeah, but isn’t this case a bit different, since the function itself takes longer to evaluate due to the intervals? For most other optimization algorithms a function evaluation takes a fixed amount of time, but when called with intervals this is not true.

To be fair, the number of arithmetic operations required to propagate intervals through the function, compared to a simple floating-point value, should be reflected in the results somehow.

4 Likes

I agree; the reason you usually report f-calls is that the cost per call is assumed to be the same. Here it’s not exactly the same, so it would be like comparing the accuracy of a Float64 and a BigFloat run and only counting the number of calls. I’m still impressed by the quality of the interval method, but it could potentially be much more expensive (especially in higher dimensions, I think?)

2 Likes

It’s higher dimensions, and the expense scales with the conditioning of the problem. The more ill-conditioned f is, the more subdividing it needs to do, until it has to start using BigFloats because of how fast things blow up. I think it’s safe to say, though, that if intervals work on someone’s problem, they should be preferred.

I agree it’s an awesome technique, but if you look at that plot, then time per f-call does matter. The methods appear to be treated as “equally expensive”, but it’s not clear that that’s true. Look at D: 6, 1:14 for example. They seem equally good at large run lengths, but it’s not clear to me whether 1e4 for BBO is equivalent to 1e3 for IO, or to 1e2. If it’s the latter, then BBO seems to take the lead. That’s why wall time would be a nice supplement.

Also, is it possible that IO is better without NelderMead?

It does pretty badly if I don’t chain it into NelderMead; it gets close to the global minimum but the final convergence doesn’t seem very good. That said, that depends on the metric you use to measure performance.

For the runtime, I noticed that it increases non-linearly with the number of iterations, which means that it will indeed become an issue at some point:

[Plot: runtime vs. number of function calls]

julia> f = BBOB.F15
Rastrigin Function

julia> plot(
           y = [@elapsed minimise(f, IntervalBox(-5.5..5.5, 6), f_calls_limit = calls) for calls in 10:1000:10_000],
           Geom.line, Guide.ylabel("Runtime [s]")
       )

What do you mean by “chain it into NelderMead”?

I agree that interval evaluations are more expensive - maybe 10 times more expensive than standard floating point for a given f (though that of course depends on what f looks like, and it would be good to benchmark it carefully).
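For what it’s worth, a rough way to measure that factor for a specific f could look like this (a sketch using BenchmarkTools; the ratio depends a lot on the function and the dimension):

using BenchmarkTools, IntervalArithmetic

g(x) = sum(abs2, x) + sum(cos, x)      # stand-in objective

x_float = fill(0.5, 6)                 # plain Float64 point
x_box   = [(-5.5..5.5) for _ in 1:6]   # same dimension, but intervals

@btime g($x_float)
@btime g($x_box)    # typically several times slower than the Float64 call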

At the end of the optimization I run a NelderMead search starting from the minimum found, using a percentage of the function-evaluation budget. That allows algorithms that have good global properties but poor final convergence to still perform decently in the benchmark.
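As a sketch (here written with Optim.jl’s NelderMead and a made-up budget split, not my actual benchmark code):

using Optim

h(x) = sum(abs2, x .- 1)     # stand-in objective
x0 = [0.9, 1.2, 0.8]         # pretend this is the minimum found by the global stage

# Spend the remaining part of the f-call budget on a local Nelder-Mead run.
res = optimize(h, x0, NelderMead(), Optim.Options(f_calls_limit = 1_000))

Optim.minimum(res)     # polished function value
Optim.minimizer(res)   # polished minimizer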

3 Likes

That should hopefully be fixable using the “mean-value form”.
Basically, interval optimization suffers from a “cluster effect”, where you start producing more and more small boxes near the minimum.

If the function is differentiable, then using a first-order Taylor approximation fixes this problem.
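For reference, a minimal one-dimensional sketch of the mean-value form, with the derivative enclosure written by hand here (in practice it would come from automatic differentiation):

using IntervalArithmetic

# Mean-value form: the range of f over X is contained in f(m) + f′(X) * (X - m),
# where m is the midpoint of X treated as a thin interval.
mean_value_form(f, df, X) = f(Interval(mid(X))) + df(X) * (X - Interval(mid(X)))

p(x)  = x^2 - 2x      # example objective
dp(x) = 2x - 2        # its derivative, evaluated on intervals

X = 0.9..1.1
p(X)                        # naive interval extension: wide
mean_value_form(p, dp, X)   # much tighter enclosure on a narrow box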

Which test function did you use?

1 Like

In the plots above I’ve used 14 test functions from the BBOB set. To check the final convergence, one should plot the function value as a function of the number of iterations; some algorithms do well initially but then slow down and reach a plateau, while others continue to decrease at a faster pace.
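A quick sketch of what that check could look like (made-up helper, Gadfly for the plot): wrap the objective so it records the best value seen so far, then plot that history against the evaluation count.

using Gadfly

# Returns a wrapped objective and a vector holding the best value seen so far.
function track_best(f)
    history = Float64[]
    g(x) = (y = f(x); push!(history, isempty(history) ? y : min(y, last(history))); y)
    return g, history
end

obj(x) = sum(abs2, x)            # stand-in objective
g, history = track_best(obj)
for _ in 1:1_000                 # stand-in for the optimizer calling g
    g(randn(3))
end

plot(x = eachindex(history), y = history, Geom.line,
     Guide.xlabel("f-calls"), Guide.ylabel("best f value so far"))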

This might be somewhat informative. For example, is SHGO available in Julia yet?

You can always call the SciPy version via PyCall. Unless you are optimizing an extremely cheap function, the inter-language call overhead is probably irrelevant.
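Something along these lines should work, assuming SciPy is installed in the Python environment PyCall uses:

using PyCall

so = pyimport("scipy.optimize")

# Julia objective; PyCall converts it into a Python callable for SciPy.
rosen(x) = sum(100 .* (x[2:end] .- x[1:end-1].^2).^2 .+ (1 .- x[1:end-1]).^2)

bounds = [(-5.0, 5.0), (-5.0, 5.0), (-5.0, 5.0)]
res = so.shgo(rosen, bounds)

res["x"], res["fun"]   # minimizer and minimum reported by SHGO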

1 Like