Hi there, I wasn’t sure whether I should have posted this in Statistics or in Optimization, but here it goes.
I have a peculiar situation. I want to use Hyperopt.jl to optimize a function `quality(x::Vector{Int})`. The function takes a vector of integers that are indices into the columns of a large matrix; it uses these columns and outputs a real number `q`. I want to find the maximum `q` possible over the various `x`. However, I want to keep `length(x)` within reasonable bounds, probably at most 4. Said differently, the problem I am trying to optimize is: which columns of the matrix, when combined, give the best “quality”? (For context, here quality is the ability to separate the data into as many clusters as possible.)
Here’s my difficulty. My `quality` function takes `length(x)` into account and uses it as a weight: the higher `length(x)`, the larger the penalty to the quality. Let’s say I have an input matrix of 20 columns or so. Then the numbers of possible input vectors `x` derived from this matrix are:
```julia
julia> [binomial(20, i) for i in 1:4]
4-element Vector{Int64}:
   20
  190
 1140
 4845

julia> sum(ans)
6195
```
Initially I thought of collecting all these possible combinations into one very long 6195-element `Vector{Vector{Int}}`. But, as is well known, sampling uniformly from such a vector puts massively more weight on combinations with `length(x) == 4`, due to the scaling of the binomial coefficient.

Is there a way to perform such an optimization loop while putting more weight on the smaller `x`s? My `quality` function itself already favors smaller `x`s, because the penalty it applies grows with `length(x)`. But my fear is that the small `x`s may not be sampled at all by the optimization algorithm, because they are so few in comparison to the larger `x`s.
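One idea, sketched below in Base Julia only (the function name and keyword names are my own, not from any package), is stratified sampling: draw `length(x)` uniformly from `1:4` first, and only then draw that many distinct column indices. Each length is then equally likely, no matter how many combinations exist for it.

```julia
using Random

# Draw one candidate x: a sorted vector of `len` distinct column indices,
# where `len` itself is drawn uniformly from 1:maxlen (not proportionally
# to the binomial counts).
function sample_x(ncols::Integer; maxlen::Integer = 4)
    len = rand(1:maxlen)                     # uniform over lengths 1..maxlen
    sort!(shuffle(collect(1:ncols))[1:len])  # len distinct column indices
end

# Sanity check: over many draws the four lengths come up roughly equally.
counts = zeros(Int, 4)
for _ in 1:10_000
    counts[length(sample_x(20))] += 1
end
```

This decouples “how often each length is tried” from “how many combinations each length has”, which seems to be exactly the imbalance described above.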
I would appreciate advice on how to tackle this, either with Hyperopt.jl or some other optimization package. I could also write my own brute-force loop that picks at random at most e.g. 100 possible `x`s for each of the combination lengths 1, 2, 3 and 4, but I already have code that I could immediately wrap in the `@hyperopt`
macro.
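For what it’s worth, the “at most 100 per length” list is straightforward to build in Base Julia. A sketch (the function name and `per_len` keyword are my own placeholders):

```julia
using Random

# Build a balanced candidate list: at most `per_len` distinct random
# combinations of column indices for each length 1..maxlen, so no length
# dominates by sheer combinatorial count.
function balanced_candidates(ncols::Integer; maxlen::Integer = 4, per_len::Integer = 100)
    xs = Vector{Vector{Int}}()
    for len in 1:maxlen
        seen = Set{Vector{Int}}()                    # deduplicate draws
        target = min(per_len, binomial(ncols, len))  # cannot exceed the total
        while length(seen) < target
            push!(seen, sort!(shuffle(collect(1:ncols))[1:len]))
        end
        append!(xs, seen)
    end
    xs
end

# For 20 columns: all 20 singletons, plus 100 each of lengths 2, 3, 4.
candidates = balanced_candidates(20)
```

If memory serves, such a list can then be handed to Hyperopt.jl as the candidate set of a single parameter, something like `@hyperopt for i = 100, x = candidates; quality(x); end`, where the default `RandomSampler()` draws each candidate uniformly, so the balanced list restores roughly equal weight per length. Please double-check that against Hyperopt.jl’s README before relying on it.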
cc @baggepinnen: as the author of Hyperopt.jl, maybe you have already encountered a scenario where different input parameters need to be weighted differently.