I have some specific bounds on my neural network weights and biases that cannot be expressed as simple parameter inequality constraints. That's why I would like to try something like IPNewton for the optimization, even though it is slower and may not converge.
The problem is that IPNewton works on plain arrays, whereas Lux.jl expects its parameters as ComponentVectors with a specific nested structure. Converting the flat array of weights and biases inside the optimization routine back into such a ComponentVector turned out to be difficult. I wrote my own function for this, but unfortunately I am running into problems with AutoDiff and mutation.
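For reference, this is roughly the mismatch I mean, shown with a toy model (the network and its sizes are just an example):

using Lux, Random, ComponentArrays

rng = Random.default_rng()
model = Chain(Dense(2 => 8, tanh), Dense(8 => 1))
ps, st = Lux.setup(rng, model)    # nested NamedTuple: (layer_1 = (weight, bias), ...)
ps_cv = ComponentVector(ps)       # the structured form that Lux-based losses work with
p_flat = collect(ps_cv)           # the flat vector of weights and biases IPNewton sees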
Is there any easy, feasible way to convert arrays or Vectors to a structure that a Lux NN can accept as parameters? This is my current function:
using ComponentArrays

function convert_params_to_tuple_no_ode(p::Vector{Float64}, n_in::Int64, n_out::Int64,
                                        hidden_layers::Tuple{Int64, Int64})
    # Layer widths: input layer, hidden_layers[1] hidden layers of width
    # hidden_layers[2], output layer
    layers = (n_in, fill(hidden_layers[2], hidden_layers[1])..., n_out)
    final_idx = 1
    nn_subtuple = NamedTuple()
    for layer_nr in 1:(length(layers) - 1)
        weight_size = layers[layer_nr] * layers[layer_nr + 1]
        bias_size = layers[layer_nr + 1]
        # Extract this layer's weights from the flat parameter vector and
        # reshape to Lux's (out, in) convention
        weights = p[final_idx:(final_idx + weight_size - 1)]
        final_idx += weight_size
        weights = reshape(weights, layers[layer_nr + 1], layers[layer_nr])
        # The bias stays a plain vector, not a matrix
        biases = p[final_idx:(final_idx + bias_size - 1)]
        final_idx += bias_size
        # NamedTuple for the current layer, merged into the accumulator under
        # the key layer_<nr> (matching the naming of Lux's Chain)
        subtuple_layer = (weight = weights, bias = biases)
        layer_symbol = Symbol("layer_", layer_nr)
        nn_subtuple = merge(nn_subtuple, NamedTuple{(layer_symbol,)}((subtuple_layer,)))
    end
    return ComponentArrays.ComponentVector((ps_lux = nn_subtuple,))
end
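And this is how I currently call it, e.g. for 2 inputs, 1 output, and 2 hidden layers with 8 neurons each (the sizes are again just illustrative):

using Lux, Random

model = Chain(Dense(2 => 8, tanh), Dense(8 => 8, tanh), Dense(8 => 1))
_, st = Lux.setup(Random.default_rng(), model)
n_p = Lux.parameterlength(model)            # 24 + 72 + 9 = 105 parameters in total
p = randn(n_p)                              # stands in for IPNewton's current iterate
ps_cv = convert_params_to_tuple_no_ode(p, 2, 1, (2, 8))
y, _ = model(rand(2, 5), ps_cv.ps_lux, st)  # forward pass with the rebuilt parameters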