Choosing an optimizer

ufechner7 · January 26, 2024, 5:40pm

I have the following optimization problem:

Tuning of a filter, that has two (real value) parameters, subject to one constraint (the time delay). My simulation returns two parameters, the standard deviation of the difference between the estimated and the correct value which shall be minimized, and the time delay which must match the constraint.

Each simulation (function evaluation) needs about 15s, so it is expensive.

I think the problem is convex.

Which optimization package/ algorithm would you suggest?

PeterSimon · January 26, 2024, 5:48pm

If no derivatives are available, I suggest looking at Nomad. It is designed for expensive black box problems that have a limited evaluation budget, and it supports constraints.

gabo-di · January 27, 2024, 1:56pm

If you want you can try this code. It uses Artificial Bee Colony algorithm, I coded it some years ago and for non differentiable problems usually gave me good results

using StaticArrays
using Random

SEED = 1234
rng = Xoshiro(SEED)

abstract type ABC_SolInfo end

"""
This code is for Artificial Bee Colony optimization, ABC, it tries to find the minimum of a function
it is presented in the paper:
 [1]   M. Mernik, SH. Liu, D. Karaboga, M. Crepinsek, 
       On clarifying misconceptions when comparing variants of the
       Artificial Bee Colony Algorithm by offering a new implementation

it is an implementation of ABC_imp1 of the appendix        
"""


Base.@kwdef struct ABC_imp1{T,N,SN,info<:ABC_SolInfo}

    id::Int                                                 # id number
    fitness::Function                                       # Fitness function 
    MFE::Int                                                # Maximum number of fitness evaluations
    limit::Int                                              # Maximum number of trial for abandoning a source
    nf_eval::MVector{1,Int}                                 # number of evaluations of fitness function
    debug::Bool

    position_min::MVector{N,T}
    position_max::MVector{N,T}    
    
    positions::SizedVector{SN,MVector{N,T}}                 # Positions of bees
    fit_pop::MVector{SN,Float64}                            # fitness of population
    add_fit_pop_info::SizedVector{SN,info}                      # additional info to fit_pop, as time of flight to get the value

    best_position::MVector{N,T} = zeros(MVector{N,T})       # best position ever recorded
    fitness_best::MVector{1,Float64} = [Inf]                # fitness of best posittion ever recorded 
    add_fitness_best_info::SizedVector{1,info} =  [zero(info)]  # additional info to fitness_best, as time of flight to get the value 
                     
    trials::MVector{SN,Int} = zeros(MVector{SN,Int})        # number of trials of each employed bee 

    history_fit::Array{Tuple{Int64, Float64}, 1} = Tuple{Int,Float64}[]

    early_end::Bool = false
    val_early_end::Float64 = 0
    tol_f::Float64 = 1e-2
end    

length_pop(abc::ABC_imp1{T,N,SN,info}) where {T,N,SN,info} = SN

length_par(abc::ABC_imp1{T,N,SN,info}) where {T,N,SN,info} = N


function init_ABC!(info::Type{<:ABC_SolInfo}, min_limits::MVector{N,T}, max_limits::MVector{N,T}, fitness::Function; SN::Int=100, MFE::Int=1000, id::Int=1,
     debug::Bool=false, limit_=nothing, val_early_end_=nothing, tol_f_=nothing, other_sol::Union{Nothing, Array{MVector{N,T},1}}=nothing ) where {T,N}
    """
    min_limits     is the minimum values for postion (elementwise)
    max_limits     is the maximum values for position (elementwise)
    SN             is the Number of Foods
    MFE            Maximum number of fitness evaluations
    limit          Maximum number of trial for abandoning a source
        max_limits[i] ≥ position[i] ≥ min_limits[i]; i = length(arr)
    """

    positions = zeros(SizedVector{SN,MVector{N,T}})
    fit_pop = zeros(MVector{SN,Float64})
    add_fit_pop_info = zeros(SizedVector{SN,info})  
    nf_eval = zeros(MVector{1,Int})

    if !(isnothing(other_sol))
        l = length(other_sol)
        for s in 1:l
            vec_setting!(positions[s], other_sol[s])
            
            nf_eval[1] += 1
            fit_pop_i, add_fit_pop_info_i = fitness(positions[s])
            fit_pop[s] = fit_pop_i
            add_fit_pop_info[s] = add_fit_pop_info_i
        end
        l += 1
    else
        l = 1
    end    

    for s in l:SN
        rand_setting!(positions[s], max_limits, min_limits)

        nf_eval[1] += 1
        fit_pop_i, add_fit_pop_info_i = fitness(positions[s])
        fit_pop[s] = fit_pop_i
        add_fit_pop_info[s] = add_fit_pop_info_i
    end    

    if isnothing(val_early_end_)
        early_end = false
        val_early_end = 0
    else
        early_end = true
        val_early_end = val_early_end_
    end
    
    if isnothing(limit_)
        limit = SN*N
    else
        limit = limit_
    end        


    if isnothing(tol_f_)
        return ABC_imp1{T,N,SN,info}(
             id = id,
             fitness = fitness, 
             MFE = MFE,
             limit = limit,
             nf_eval = nf_eval, 
             debug = debug,
             position_min = min_limits, 
             position_max = max_limits,
             positions = positions,
             fit_pop = fit_pop, 
             add_fit_pop_info = add_fit_pop_info,
             early_end = early_end,
             val_early_end = val_early_end             
             )
    end    

    return ABC_imp1{T,N,SN,info}(
             id = id,
             fitness = fitness, 
             MFE = MFE,
             limit = limit,
             nf_eval = nf_eval, 
             debug = debug,
             position_min = min_limits, 
             position_max = max_limits,
             positions = positions,
             fit_pop = fit_pop, 
             add_fit_pop_info = add_fit_pop_info,
             early_end = early_end,
             val_early_end = val_early_end,
             tol_f = tol_f_  )
end


function init_array_ABC!(info::Type{<:ABC_SolInfo}, M::Int, min_limits::MVector{N,T}, max_limits::MVector{N,T}, fitness::Function; SN::Int=100, MFE::Int=1000,
    debug::Bool=false, limit_=nothing, val_early_end_=nothing, tol_f_=nothing, other_sol::Union{Nothing, Array{MVector{N,T},1}}=nothing  ) where {T,N}
   """
   M              number of similar problems to evolve
   min_limits     is the minimum values for postion (elementwise)
   max_limits     is the maximum values for position (elementwise)
   SN             is the Number of Foods
   MFE            Maximum number of fitness evaluations
   limit          Maximum number of trial for abandoning a source
       max_limits[i] ≥ position[i] ≥ min_limits[i]; i = length(arr)
   """
  
    abc = ABC_imp1{T,N,SN,info}[]
    for m in 1:M
        abc_m = init_ABC!(info, min_limits, max_limits, fitness; SN, MFE, debug, limit_, val_early_end_, tol_f_, other_sol)
        push!(abc, abc_m)
    end

    return SizedVector{M}(abc)        
end


function vec_setting!(position::MVector{N,T}, vec::MVector{N,T}) where{T,N}
    for jj in 1:N
        position[jj] = vec[jj]
    end
    return nothing    
end  


function rand_setting!(position::MVector{N,T}, max_limits::MVector{N,T}, min_limits::MVector{N,T}) where{T,N}
    for jj in 1:N
        position[jj] = rand(rng)*(max_limits[jj] - min_limits[jj]) + min_limits[jj]
    end
    return nothing    
end    


function employed_bee!( v_position::MVector{N,T}, position_i::MVector{N,T}, position_k::MVector{N,T}, position_min::MVector{N,T}, position_max::MVector{N,T}; λ::Float64=1.0 ) where {T,N}
    j = rand(rng, 1:N)   # parameter to be actualized
    
    for jj  in 1:N
        if jj != j     
            v_position[jj] = position_i[jj]
        else    
            v_position[jj] = position_i[jj] + (-1+2*rand(rng))*(position_i[jj] - position_k[jj])*λ
            if !(position_min[jj] <= v_position[jj] <= position_max[jj] )
                v_position[jj] = rand(rng)*(position_max[jj] - position_min[jj]) + position_min[jj]
            end    
        end
    end
    return nothing
end    


function getK(i::Int, SN::Int)
    return rand(rng, [j for j in 1:SN if j!=i ])
end    


function actualize_probabilities!(prob::MVector{SN,Float64}, fit_pop::MVector{SN,Float64}) where {SN}
    for ii in 1:SN
        if fit_pop[ii] > 0            
            prob[ii] = 1 / (1 + fit_pop[ii])
        else
            prob[ii] = 1 + abs( fit_pop[ii] )
        end         
    end    
    aux = sum(prob)
    prob .= prob ./ aux
    return nothing
end    


function evolve_ABC!(abc::ABC_imp1{T,N,SN,info}) where {T,N,SN,info}
    """
    we evolve the population according to ABC algorithm
    """

    v_position = zeros(MVector{N,T})
    prob = zeros(MVector{SN,Float64})

    while true
        # employed bees phase
        for s in 1:SN
            k = getK(s, SN)
            employed_bee!( v_position, abc.positions[s], abc.positions[k], abc.position_min, abc.position_max)
        
            abc.nf_eval[1] += 1
            fit_pop_i, add_fit_pop_info_i = abc.fitness(v_position)
            
            if fit_pop_i < abc.fit_pop[s]
                abc.positions[s] = deepcopy(v_position)
                abc.fit_pop[s] = fit_pop_i
                abc.add_fit_pop_info[s] = add_fit_pop_info_i
                abc.trials[s] = 0
            else
                abc.trials[s] += 1    
            end

            if end_condition!(abc)
                return nothing
            end        
        end  
        
        # actualize probabilities
        actualize_probabilities!(prob, abc.fit_pop)
        
        # Onlooker bee phase
        for t in 1:SN
            r = rand(rng)
            rs = 0.0
            for s in 1:SN
                rs += prob[s]                
                if r <= rs
                    k = getK(s, SN)
                    employed_bee!( v_position, abc.positions[s], abc.positions[k], abc.position_min, abc.position_max)
    
                    abc.nf_eval[1] += 1
                    fit_pop_i, add_fit_pop_info_i = abc.fitness(v_position)
        
                    if fit_pop_i < abc.fit_pop[s]
                        abc.positions[s] = deepcopy(v_position)
                        abc.fit_pop[s] = fit_pop_i
                        abc.add_fit_pop_info[s] = add_fit_pop_info_i
                        abc.trials[s] = 0
                    else
                        abc.trials[s] += 1    
                    end
                    if end_condition!(abc)
                        return nothing
                    end 
                    break
                end
            end
        end                       

        
        # Scout bee phase
        for s in 1:SN
            if abc.trials[s] >= abc.limit
                rand_setting!(abc.positions[s], abc.position_max, abc.position_min)

                abc.nf_eval[1] += 1
                fit_pop_i, add_fit_pop_info_i = abc.fitness(abc.positions[s])
                abc.fit_pop[s] = fit_pop_i
                abc.add_fit_pop_info[s] = add_fit_pop_info_i
                abc.trials[s] = 0

                if end_condition!(abc)
                    return nothing
                end
            end    
        end 
        
        if end_condition!(abc)
            return nothing
        end
    end
end


function evolve_array_ABC!(abc::SizedVector{M,ABC_imp1{T,N,SN,info}}) where {T,N,SN,M,info}
    """
    we evolve the population according to ABC algorithm
    """
    Threads.@threads for m in 1:M
        evolve_ABC!(abc[m])
    end
end


function end_condition!(abc::ABC_imp1{T,N,SN}) where {T,N,SN}

    idx = argmin(abc.fit_pop)

    if abc.fit_pop[idx] < abc.fitness_best[1]
        abc.best_position .= deepcopy(abc.positions[idx])
        abc.add_fitness_best_info[1] = abc.add_fit_pop_info[idx]
        abc.fitness_best[1] = abc.fit_pop[idx]
        if abc.debug
            println(abc.nf_eval[1], "   ", abc.fitness_best[1],"   ",abc.id)                    
        end   
        push!(abc.history_fit, (abc.nf_eval[1], abc.fitness_best[1])) 
    end

    if abc.early_end
        if abs(abc.val_early_end - abc.fitness_best[1] ) < abc.tol_f
            return true
        end
    end        

    if abc.nf_eval[1] >= abc.MFE
        return true
    else
        return false
    end        
end    

"""
Example of use
"""
struct someinfo <:ABC_SolInfo
    x::Float64
end

import Base.zero
Base.zero(::Type{someinfo}) = someinfo(0.0)

function main_ABC()
    function fittt(par::MVector{N,T}) where {N,T}
        return sum(cos.(par)), someinfo(π)
    end

    N = 2
    min_arr = MVector{N,Float64}([ -10 for ii in 1:N ])
    max_arr = MVector{N,Float64}([ 10 for ii in 1:N ])
    abc = init_ABC!(someinfo, min_arr, max_arr, fittt, debug=false, SN=5, MFE = 100)
    evolve_ABC!(abc)
    return abc
end


function main_array_ABC()
    function fittt(par::MVector{N,T}) where {N,T}
        return sum(cos.(par)), someinfo(π)
    end

    N = 2
    M = 3
    min_arr = MVector{N,Float64}([ -10 for ii in 1:N ])
    max_arr = MVector{N,Float64}([ 10 for ii in 1:N ])
    abc = init_array_ABC!(someinfo, M, min_arr, max_arr, fittt, debug=false, SN=5, MFE = 100)
    evolve_array_ABC!(abc)
    return abc
end

The idea is that the function you want to optimize can give you two results. The value that you want to optimize, the standard deviation. And some additional info, the time delay.

I think with some changes in the code you can put more weight on the “bees” that are near the time delay constraint.

ufechner7 · January 27, 2024, 7:05pm

I tried:

# optimize the standard deviation by changing the parameter f_c 
# using the NOMAD optimizer
function optimize(estimator::Symbol=:integrator)
    # std_u(:integrator; f_c=7)
    function f(x)
        return std_u(:integrator; f_c = x[1])
    end

    function c(x)
        return 0.001 - x[1]
    end

    function eval_fct(x)
        bb_outputs = [f(x), c(x)]
        success = true
        count_eval = true
        return (success, count_eval, bb_outputs)
    end

    pb = NOMAD.NomadProblem(1, # number of inputs of the blackbox
                      1, # number of outputs of the blackbox
                      ["OBJ"], # type of outputs of the blackbox
                      eval_fct;
                      lower_bound=[0.001],
                      upper_bound=[10.0])
    result = NOMAD.solve(pb, [0.2])
end

I get the error message:

Standard deviation of Uest: 0.0405 m/s
Warning: EvaluatorControl: Point #0 (   0.2              )      Evaluation OK    Undefined f     h =   0                     : Eval ok but f not defined. Setting eval status to EVAL_FAILED.
X0 evaluation failed for X0 = ( 0.2 )
A termination criterion is reached: No termination (all). Problem with starting point evaluation (Algo) No more points to evaluate

Best feasible solution:     Undefined.

Best infeasible solution:   Undefined.

Blackbox evaluations: 1
(x_best_feas = nothing, bbo_best_feas = nothing, x_best_inf = nothing, bbo_best_inf = nothing)

Any idea?

ufechner7 · January 27, 2024, 7:08pm

I tried this version, but it also fails:

function optimize(estimator::Symbol=:integrator)
    # std_u(:integrator; f_c=7)
    function f(x)
        return std_u(:integrator; f_c = x[1], plt=false)
    end

    function c(x)
        return 0.001 - x[1]
    end

    function eval_fct(x)
        bb_outputs = [f(x), c(x)]
        success = true
        count_eval = true
        return (success, count_eval, bb_outputs)
    end

    pb = NOMAD.NomadProblem(1, # number of inputs of the blackbox
                      2, # number of outputs of the blackbox
                      ["OBJ", "EB"], # type of outputs of the blackbox
                      eval_fct;
                      lower_bound=[0.001],
                      upper_bound=[10.0])
    result = NOMAD.solve(pb, [0.2])
end

PeterSimon · January 27, 2024, 7:22pm

Your second version looks correct to me. Does it fail in the same way?
The first thing that occurs to me is manually calling eval_fct with a few different input values (some obeying the constraint and some outside the constraint), and checking the returned value of bb_outputs.

ufechner7 · January 27, 2024, 7:41pm

julia> fct(0.2)
---> 0.01
---> 0.01
  9.237756 seconds (82.70 M allocations: 2.254 GiB, 9.34% gc time, 42.14% compilation time)
(true, true, [0.04053340469146625, -0.199])

julia> fct(0.001)
---> 0.01
 11.134402 seconds (75.03 M allocations: 1.776 GiB, 4.46% gc time)
(true, true, [0.10561139734773166, 0.0])

julia> fct(10)
---> 0.01
 11.648667 seconds (75.03 M allocations: 1.778 GiB, 4.72% gc time)
(true, true, [0.03409998897827958, -9.999])

julia> fct(0.0001)
---> 0.01
 11.967222 seconds (75.03 M allocations: 1.773 GiB, 2.77% gc time)
(true, true, [0.10444898673448678, 0.0009])

No issue…

It is working now, but might take some time to give a result. Thank you!

ufechner7 · January 27, 2024, 9:45pm

Works quite nice:

# optimize the standard deviation by changing the parameters f_c and ku
# using the NOMAD optimizer
function optimize(estimator::Symbol=:integrator)
    f(x) = std_u(:integrator; f_c = x[1], ku = x[2])
    function eval_fct(x)
        bb_outputs = [f(x)]
        success = true
        count_eval = true
        success, count_eval, bb_outputs
    end

    pb = NOMAD.NomadProblem(2, # number of inputs of the blackbox
                            1, # number of outputs of the blackbox
                            ["OBJ"], # type of outputs of the blackbox
                            eval_fct;
                            lower_bound=[0.001, 1e-7],
                            upper_bound=[10.0,  1e-5])
    pb.options.max_time = 60*5 
    NOMAD.solve(pb, [0.2, 4.5e-6])
end

I get a reasonable result in 5 min for two variables, even though each function call takes 15s…

Question: Is it possible to use multiple cores and execute the function calls in parallel?

PeterSimon · January 27, 2024, 11:37pm

No, you can multi-thread within your objective function but AFAIK NOMAD does not support multiple simultaneous objective function calls via threading.

CMAEvolutionStrategy.jl is a powerful global optimizer that does support multi-threaded parallel evaluation of the objective, and there are other optimizers in Metaheuristics that do as well, but these population-based global optimizers typically require far more function evaluations because they search the variable space very thoroughly. For example, CMAEvolutionStrategy would default to a population size of 6 for your 2-variable case, and would by default allow up to 20,000 objective function evaluations. Even if you reduced this to 2,000, assuming 8 threads, perfect scaling, and 15 seconds per evaluation, you’re looking at about an hour. Of course, you might be satisfied with a less than fully optimized result. I guess you’ll have to experiment and see what the trade-offs are for your specific case. Maybe a wrapper like Optimization.jl or NonConvex.jl would be useful for doing this experimentation with the least modifications to your calling code.

PeterSimon · January 28, 2024, 3:39am

I looked over the original C++ version of NOMAD. It does indeed support parallel evaluation of the objective function at multiple points, as documented here. It also has an optional text file interface, so you could launch it from Julia, communicate with it via text files, and run multiple objective evaluations simultaneously. It might be worth the effort for such an expensive objective function, if you have to solve it many times.

ufechner7 · January 28, 2024, 12:45pm

Well, for now I am happy with the serial version. If I run many simulations for getting the Pareto front of a multi-objective optimization I can still run different NOMAD simulations in parallel (I hope), with different objective functions…

Small improvement of my code:

        f2(x) = std_kalman(se; c_ω=x[1], sigma_p=x[2], plt=false)
        function eval_fct2(x)
            local bb_outputs, success
            try
                bb_outputs = [f2(x)]
                success = true
            catch ex
                bb_outputs = [1000]
                success = false
                println("Failure for x=$x !")
                if isa(ex, InterruptException)
                    rethrow(ex)
                end
            end
            count_eval = true
            success, count_eval, bb_outputs
        end
        pb = NOMAD.NomadProblem(2, # number of inputs of the blackbox
                                1, # number of outputs of the blackbox
                                ["OBJ"], # type of outputs of the blackbox
                                eval_fct2;
                                lower_bound=[0.2, 30000],
                                upper_bound=[1.7, 120000])
        pb.options.max_time = 60*3
        pb.options.max_bb_eval = 20 
        pb.options.display_stats = ["TIME", "BBE", "SOL", "OBJ"]
        NOMAD.solve(pb, [0.55, 97000])

Solving the objective function can fail with an exception if unsuitable parameters are provided by NOMAD. These exceptions need to be caught, but I still want to be able to hit <crtl><c> to terminate the optimization…

Perhaps such an example could be added to the documentation?

Topic		Replies	Views
Latest recommendations for global optimization Optimization (Mathematical) package	92	13032	August 23, 2022
Need a combinatorial optimizer for black box function Optimization (Mathematical) combinatorics	6	1435	September 27, 2021
Nonlinear optimization using MathOptInterface - examples/learning Optimization (Mathematical)	8	1877	April 19, 2019
A performant optimizing algorithm for a case with an expensive cost function and parallel execution Optimization (Mathematical) question	4	133	May 11, 2025
Help to use AD with Optimization.jl Optimization (Mathematical) optimization , ad	5	116	May 9, 2025

Choosing an optimizer

Related topics