Hello, I need to use some individual elements (x_n, in my application agricultural farms of which I know every details) as building blocks to reconstruct an observed aggregate V_d >=0 for the d dimensions (in my application the agricultural system as we know from census data… hectars of lands, number of farms, cows and so on).
I want to find the minimum number of K non-zero weights so that my distance between the reconstructed aggregated and the observed one is below a threshold.
In mathematical terms the problem can be expressed as:
min_{x_n} K
s.t.
x_n \geq 0
\frac{\sqrt{\sum_d((\sum_n x_n * a_{n,d}) - V_d)^2}}{\sum_d V_d} \leq \epsilon
\sum_n \mathbf{1}(X_n > 0) = K
where the last line in an indicator function.
I can simplify the problem by just minimising the distance given K and then simply check which K is within my tolerance:
min_{x_n} \sum_d((\sum_n x_n * a_{n,d}) - V_d)
s.t.
x_n \geq 0
\sum_n \mathbf{1}(X_n > 0) \leq K
When I try to implement it in Jump however I have a problem with the Hessian (AssertionError: length(hess_I) > 0
) in both cases:
using DelimitedFiles
coeff = readdlm(download("https://lef-nancy.org/files/index.php/s/gkxxYxqWcGCYAEM/download"),',') # Few KB
POP = readdlm(download("https://lef-nancy.org/files/index.php/s/4dwiDtDfyoC5mbH/download"),',') # Few Bytes
using JuMP, Ipopt
(N,D) = size(coeff)
ϵ = 0.1
# Rescaling POP to 1 on all dimensions...
scCoeff = 1 ./ POP
POP = ones(D)
coeff = coeff .* scCoeff'
sumPOP = sum(POP[d] for d in 1:D)
upscaling = Model(optimizer_with_attributes(Ipopt.Optimizer)) # would require "using Ipopt" instead of GLPK
@variables upscaling begin
x[1:N] >= 0;
K >= 0
end
set_start_value.(x, 1)
set_start_value.(K, 100)
@NLobjective upscaling Min begin
K
end
@NLconstraints upscaling begin
sqrt(sum( (sum(coeff[n,d]*x[n] for n in 1:N) - POP[d])^2 for d in 1:D))/sumPOP <= ϵ
end
@NLconstraints upscaling begin
sum( (x[n] > 0.0) for n in 1:N) == K
end
optimize!(upscaling)
upscaling2 = Model(optimizer_with_attributes(Ipopt.Optimizer)) # would require "using Ipopt" instead of GLPK
@variables upscaling2 begin
x[1:N] >= 0;
end
set_start_value.(x, 1)
@objective upscaling2 Min begin
sum( (sum(coeff[n,d]*x[n] for n in 1:N) - POP[d])^2 for d in 1:D)
end
@NLconstraints upscaling2 begin
sum( (x[n] > 0.0) for n in 1:N) <= 5
end
optimize!(upscaling2)
Note that if I remove the constraint on K the model find a (local) minima without problems…
Is there a more specific way to solve these kind of problems ?