How to make a user-defined objective function with multiple vector variable and parameter arguments?

I would like to optimize a set of 3 variable vectors to minimize the total variance in a dataset. I think my problem is in passing variable and parameter vectors into my user-defined objective.

I found this question on StackOverflow, but I don’t know how to use it with several variable and parameter vectors.

Here’s the commented code with sample data; maybe it’s self-explanatory?

# User-defined objective:
# 👇 This is where I don't know what I'm doing 👇

# I don't know how to pass the variables and parameters,
# and I don't know how to access them from within the function,
# and I need γ to have an integer constraint; I plan to simply use floor() once I can access the variable within this objective.
function xy_objective(α..., β..., γ..., Y_0..., nan_pad_fraction)

# 👆 This is where I don't know what I'm doing 👆
	
	n_p, n_r, n_t = size(Y_0)
	n_nans = convert(Int,round(nan_pad_fraction*n_p))

	# 1. circshift each column by its γ (by same amount for all targets)
	nan_pad = replace!(zeros(n_nans,n_r,n_t), 0=>NaN)
	Y_0_padded = vcat(nan_pad, Y_0, nan_pad)
	Y_0_shift = similar(Y_0_padded) # preallocate at the padded size so the shifted columns fit
	for i_r = 1:n_r
		Y_0_shift[:,i_r,:] = circshift(Y_0_padded[:,i_r,:], γ[i_r])
	end

	# 2. take nanmean of each target to reduce variance against target mean
	ȳ = reshape(nanmean(Y_0_shift,dims=2),n_p + 2*n_nans, n_t)
	
	# 3. calculate cost
	χ²_t = zeros(n_t)
	for i_t = 1:n_t
		χ²_t = 
		sum(Y_0_shift[:,:,i_t] - 
			(reshape(α,(1,n_r)) .+ ȳ[:,i_t]*reshape(β,(1,n_r))).^2)
	end		

	χ² = sum(χ²_t)
	return χ²
end
#################################################################################
# Optimization:
	
# Sample data
n_pts = 1000 # number of data sample points per experiment
n_reps = 10 # number of experimental replicates
n_targ = 2 # number of experimental targets (simultaneously acquired in each rep)
	
Y_0 = rand(n_pts, n_reps, n_targ) # data matrix

# Goal: reduce total variance by:
# 	adding α to each column
# 	scaling by β for each column
# 	shifting by γ for each target

# Pad edges with NaN values so data can shift 
np_frac = 0.1 
# e.g. a vector with 100 elements along the first dimension will have 100*np_frac NaNs added at the beginning and at the end, becoming 100*(1+2*np_frac) elements long in the first dimension
	
# Build model
m = Model(Ipopt.Optimizer)
	
register(m, 
	:xy_objective, 
	n_reps*(2*n_targ + 1), 
	xy_objective; autodiff=true)

α_test, β_test, γ_test = nothing, nothing, nothing
nan_pad_fraction, Y_0_prob = nothing, nothing
		
# α: additive variable, 1 for each replicate for each target
@variable(m, 
	α_test[i=1:n_reps,j=1:n_targ] == 0)
# β: scaling variable, 1 for each replicate for each target
@variable(m, 
	β_test[i=1:n_reps,j=1:n_targ] == 1)
# γ: shifting variable, 1 for each replicate (shift all targets by same amount)
@variable(m, 
	-convert(Int,floor(np_frac*n_pts)) <= 
	γ_test[i=1:n_reps] <= 
	convert(Int,floor(np_frac*n_pts)) ) # also need integer constraint
	
# Fraction of data length to allow for shifting
@NLparameter(m, 
	nan_pad_fraction == np_frac)
# Starting point for experimental data
@NLparameter(m, 
	Y_0_prob[i=1:n_pts,j=1:n_reps,k=1:n_targ] == Y_0[i,j,k])

# 👇 This is where I don't know what I'm doing 👇

# I don't know how to pass the variables and parameters,
# and I don't know how to access them from within the function.
@NLobjective(m, 
	Min, 
	xy_objective(α_test..., β_test..., γ_test..., Y_0_prob..., nan_pad_fraction))
	
# 👆 This is where I don't know what I'm doing 👆

optimize!(m)

A few things:

  • JuMP is not really designed for nonlinear optimization with vector inputs. That’s not to say it’s impossible, but it really isn’t JuMP’s goal. Something like NLopt, Optim, or the wrapper GalacticOptim is a little more convenient; GalacticOptim in particular has lots of auto-differentiation options, which can help speed things up.
  • If you do that, there are fancy ways to split up your underlying vectors (e.g. ComponentArrays.jl: https://github.com/jonniedie/ComponentArrays.jl), but you probably just want to start by passing a single large vector into the function being optimized and slicing it manually inside; see the sketch below.
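A minimal sketch of that “one flat vector” approach, assuming Optim.jl (the unpack helper and the placeholder cost are illustrative, not from the original code):

using Optim

n_reps, n_targ = 10, 2

# Illustrative helper: slice the flat vector back into α, β, γ blocks.
function unpack(x, n_reps, n_targ)
    nαβ = n_reps * n_targ
    α = reshape(view(x, 1:nαβ), n_reps, n_targ)
    β = reshape(view(x, nαβ+1:2nαβ), n_reps, n_targ)
    γ = view(x, 2nαβ+1:2nαβ+n_reps)
    return α, β, γ
end

function objective(x)
    α, β, γ = unpack(x, n_reps, n_targ)
    # Placeholder cost; swap in the real χ² computation here.
    return sum(abs2, α) + sum(abs2, β .- 1) + sum(abs2, γ)
end

# Stack the starting values (α = 0, β = 1, γ = 0) into one flat vector.
x0 = vcat(zeros(n_reps * n_targ), ones(n_reps * n_targ), zeros(n_reps))
result = optimize(objective, x0, NelderMead())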

You need to write xy_objective as a single function xy_objective(args...).
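For example, a sketch of that pattern (assuming the block sizes n_reps and n_targ are known constants, as in your setup):

# args arrives as one flat tuple of scalars; slice it by known block sizes.
function xy_objective(args...)
    nαβ = n_reps * n_targ
    α = collect(args[1:nαβ])
    β = collect(args[nαβ+1:2nαβ])
    γ = collect(args[2nαβ+1:2nαβ+n_reps])
    # ... compute the cost from α, β, γ ...
    return sum(abs2, γ)  # placeholder; replace with the real χ²
end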

However even once you do that, you’ll have a few problems.

  • Why have α_test and β_test as variables? You define them to be constant.
  • Same for your parameters: Nonlinear Modeling (Legacy) · JuMP. Parameters are really only useful if you are solving a sequence of related problems.
  • Ipopt expects problems to be twice continuously differentiable. Using floor breaks that. You could declare γ_test as integer with @variable(m, γ_test[1:n_reps], Int), but then you’d need a MINLP solver like Juniper (GitHub - lanl-ansi/Juniper.jl: A JuMP-based Nonlinear Integer Program Solver); see the sketch after this list.
  • Step 3 in your function doesn’t make sense: χ²_t gets overwritten on every iteration of the for loop.
  • Instead of replace!(zeros(n_nans,n_r,n_t), 0=>NaN), use fill(NaN, n_nans, n_r, n_t).
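For the integer route, a sketch of the standard Juniper-over-Ipopt setup (bounds and sizes here are illustrative):

using JuMP, Ipopt, Juniper

ipopt = optimizer_with_attributes(Ipopt.Optimizer, "print_level" => 0)
model = Model(optimizer_with_attributes(Juniper.Optimizer, "nl_solver" => ipopt))
# Integer-constrained shift variables, bounded by the NaN pad size.
@variable(model, -100 <= γ_test[1:10] <= 100, Int)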

I don’t really understand your code. But based on

# Goal: reduce total variance by:
# 	adding α to each column
# 	scaling by β for each column
# 	shifting by γ for each target

it seems like you want something like

using JuMP, Ipopt

n_points = 1_000
n_replications = 10
n_targets = 2
data = rand(n_points, n_replications, n_targets)

model = Model(Ipopt.Optimizer)
@variable(model, α[1:n_targets])
@variable(model, β[1:n_targets])
@expression(
    model, 
    prediction[i = 1:n_points, j = 1:n_replications, k = 1:n_targets], 
    α[k] + β[k] * data[i, j, k],
)
@expression(model, μ, sum(prediction) / length(prediction))
@objective(model, Min, sum((prediction .- μ).^2) / length(prediction))
optimize!(model)

Edit: I’ll note that this obviously isn’t exactly what you’d want, because β would just go to 0.


The caveat I should have added is that if your model fits into the standard JuMP framework without all sorts of complicated registered nonlinear functions of vectors, then use JuMP.

I think part of what @odow is saying is that it may fit more into a modeling language like JuMP than you realize. If so, then JuMP is tough to beat and ignore my previous comments.


Thanks, Oscar!

Why have α_test and β_test as variables? You define them to be constant.

Typo: I thought I was initializing them to values of 0 and 1. Is there a way to use set_start_value() with variable vectors?

Same for your parameters: Nonlinear Modeling · JuMP. Parameters are really only useful if you are solving multiple problems in a sequence.

I just want to be able to pass the data Y_0 and nan_pad_fraction into the objective from the outside. Maybe a fixed variable would be appropriate? Otherwise, I’m unclear on why a fixed variable would be used.
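For reference, the distinction in standard JuMP syntax between a fixed variable and a start value:

@variable(m, a == 1.0)        # fixed variable: constrained to equal 1.0
@variable(m, b, start = 1.0)  # free variable with an initial guess of 1.0
fix(b, 2.0; force = true)     # fix an existing variable afterwards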

Ipopt expects problems to be convex and twice-differentiable. Using floor breaks that.

I just need γ_test to contain integers that shift only within the bounds of the size of the NaN pads added to the data. I’ll look into Juniper, and I see Alpine.jl might be applicable too.

Step 3 in your function doesn’t make sense

Yep, that’s another typo. I meant to put χ²_t[i_t] = sum(...) inside the loop; this gets the SSE for each target, which is then summed.
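A sketch of the intended loop, using the names from the function above (and squaring the residual so it really is an SSE):

χ²_t = zeros(n_t)
for i_t in 1:n_t
    χ²_t[i_t] = sum((Y_0_shift[:, :, i_t] .-
        (reshape(α, (1, n_r)) .+ ȳ[:, i_t] * reshape(β, (1, n_r)))).^2)
end
χ² = sum(χ²_t)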

use fill(NaN, n_nans, n_r, n_t)

Thanks!

The suggested code you provided is actually pretty close to the non-shifting version that I have working. I just need the γ_test terms to shift the columns of data relative to one another. The start and end points for each column (for a pair of targets in this case) are chosen manually, so the shifting is to reduce misalignment. Sorry if it’s still unclear. It seems like Juniper or Alpine will help with this.

By the way, is there somewhere I’m missing that lists the solvers available in JuMP and their options?

I don’t know much about the syntax of JuMP, but that kind of problem is usually solved with closures.

  • Use start = when declaring the variable. Documentation: Variables · JuMP
  • As others have said, no need to pass them in. Use a closure:
const np = 0.1

function foo(x)
    # no need to pass `np` in as an argument
    return np * x
end
  • So the gamma shift is about aligning the replicates? And each replicate can be shifted a different amount? The standard way to do this is to make N copies of the data, each shifted by a preset amount, and then add some binary variables to make sure you select exactly one as the “true” data. You could probably also just enumerate all possible shifts and solve an optimization problem for each shift. Ipopt (or Juniper) is going to really struggle with the formulation as you have written it above. A sketch of the binary-selection idea follows below.
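A hedged sketch of that binary-selection idea (names like shifts and z are illustrative; the quadratic objective with binaries would need a MIQP/MINLP-capable solver such as Gurobi or Juniper):

using JuMP

n_pts, n_reps = 100, 4
Y = rand(n_pts, n_reps)
shifts = -5:5  # preset shift amounts to enumerate

# Precompute each shifted copy of the data (constants, not variables).
Y_shifted = Dict(s => circshift(Y, (s, 0)) for s in shifts)

model = Model()
# z[j, s] = 1 means replicate j uses shift s; exactly one shift per replicate.
@variable(model, z[1:n_reps, shifts], Bin)
@constraint(model, [j = 1:n_reps], sum(z[j, s] for s in shifts) == 1)

# The selected data is linear in z because each Y_shifted[s] is constant data.
@expression(model, Y_sel[i = 1:n_pts, j = 1:n_reps],
    sum(z[j, s] * Y_shifted[s][i, j] for s in shifts))
@expression(model, μ[i = 1:n_pts], sum(Y_sel[i, :]) / n_reps)
@objective(model, Min, sum((Y_sel[i, j] - μ[i])^2 for i in 1:n_pts, j in 1:n_reps))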

is there somewhere I’m missing that lists the solvers available in JuMP and their options?

The list of solvers is here: Installation Guide · JuMP. Options for each solver are solver-specific; you’d have to consult each solver’s documentation for more information.