Does Optim not support function optimization with matrix variables?

I want to optimize my function with Optim, but something went wrong. I have written a minimal example. Does this example show that Optim.jl does not support functions whose variables are matrices? If not, what should I do about it?

using Zygote
using Optim
using LinearAlgebra
using RecursiveArrayTools

function get_f(A)
    M = reshape(1:16, 4, 4)
    U = exp(A)
    T = U * M * U'
    return tr(T)
end

function my_function()
    A0 = zeros(4, 4)
    results = optimize(get_f, VectorOfArray([A0]))
end
my_function()

And the error message reads:

MethodError: no method matching +(::Array{Float64,2}, ::Float64)
For element-wise addition, use broadcasting with dot syntax: array .+ scalar
Closest candidates are:
+(::Any, ::Any, !Matched::Any, !Matched::Any…) at operators.jl:538
+(!Matched::Bool, ::T) where T<:AbstractFloat at bool.jl:103
+(!Matched::ChainRulesCore.DoesNotExist, ::Any) at C:\Users\38606\.julia\packages\ChainRulesCore\trzfY\src\differential_arithmetic.jl:23


Stacktrace:
[1] simplexer(::Optim.AffineSimplexer, ::VectorOfArray{Float64,3,Array{Array{Float64,2},1}}) at C:\Users\38606\.julia\packages\Optim\TNmSw\src\multivariate\solvers\zeroth_order\nelder_mead.jl:13
[2] initial_state(::NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Nothing}, ::NonDifferentiable{Float64,VectorOfArray{Float64,3,Array{Array{Float64,2},1}}}, ::VectorOfArray{Float64,3,Array{Array{Float64,2},1}}) at C:\Users\38606\.julia\packages\Optim\TNmSw\src\multivariate\solvers\zeroth_order\nelder_mead.jl:155
[3] optimize(::NonDifferentiable{Float64,VectorOfArray{Float64,3,Array{Array{Float64,2},1}}}, ::VectorOfArray{Float64,3,Array{Array{Float64,2},1}}, ::NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Nothing}) at C:\Users\38606\.julia\packages\Optim\TNmSw\src\multivariate\optimize\optimize.jl:33
[4] optimize(::Function, ::VectorOfArray{Float64,3,Array{Array{Float64,2},1}}; inplace::Bool, autodiff::Symbol, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\38606\.julia\packages\Optim\TNmSw\src\multivariate\optimize\interface.jl:64
[5] optimize at C:\Users\38606\.julia\packages\Optim\TNmSw\src\multivariate\optimize\interface.jl:58 [inlined]
[6] my_function() at .\In[7]:4
[7] top-level scope at In[8]:1
[8] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091

Why not just use a vector argument, and reshape it inside get_f? The objective itself could be a callable struct carrying information about the dimensions. Roughly something like this (untested code follows):

struct MyObjective
    d::Int
end

function (obj::MyObjective)(a)
    A = reshape(a, obj.d, obj.d)
    return get_f(A)  # evaluate the matrix objective on the reshaped argument
end

optimize(MyObjective(4), zeros(16))

optimize(a -> get_f(reshape(a, 4, 4)), vec(A0)), maybe?
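
Spelled out, that would be something like this (untested; it assumes the get_f defined above):

using Optim

A0 = zeros(4, 4)
# optimize sees a plain vector; the objective reshapes it back into a matrix
results = optimize(a -> get_f(reshape(a, 4, 4)), vec(A0))
A_min = reshape(Optim.minimizer(results), 4, 4)  # recover the matrix form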

Thank you for your useful reply! But I still have two questions. Why do we have to reshape the variable A? And when my objective function takes multiple matrices, vectorizing the variables with vec causes problems. Is there an alternative that can replace vec?

optimize performs an initialization step on the variable (eg calling zero on it), which fails for a vector of matrices:

zero(vec([[1 2;3 4],[1 2 3;4 5 6]]))

MethodError: no method matching zero(::Type{Array{Int64,2}})
Closest candidates are:
zero(!Matched::Type{Dates.DateTime}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Dates\src\types.jl:404
zero(!Matched::Type{Pkg.Resolve.VersionWeight}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Pkg\src\Resolve\versionweights.jl:15
zero(!Matched::Type{Dates.Time}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Dates\src\types.jl:406

Stacktrace:
[1] zero(::Array{Array{Int64,2},1}) at .\abstractarray.jl:975
[2] top-level scope at In[9]:1
[3] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091

You can split an input vector into two vectors, and reshape those. Eg something like

julia> a = 1:10
1:10

julia> reshape(@view(a[1:6]), 2, 3), reshape(@view(a[7:end]), 2, 2)
([1 3 5; 2 4 6], [7 9; 8 10])

Most (all?) multivariate optimization routines operate on \mathbb{R}^n, so you have to map back and forth. This is a normal part of numerical optimization.

This is the same as using vec directly, and optimize will not support this type of input. I mean that my function's variables are multiple matrices, not one vector. Since optimize requires the input variables to be stored in a single array, I would organize the individual matrices into an array of inputs. The straightforward approach results in the zero error I showed above.

So I found VectorOfArray. It supports operations like the following:

julia> zero(VectorOfArray([[1 2;3 4],[1 2 3;4 5 6]]))
VectorOfArray{Int64,3}:
2-element Array{Array{Int64,2},1}:
 [0 0; 0 0]
 [0 0 0; 0 0 0]

And then I am back to my original question.

On the contrary, a vector is pretty much the only format most multivariate optimizers support.

Again, you just need a small wrapper function to convert the vector into the input your objective needs.
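
Eg something like this, for a problem with two matrix variables (the objective f and the sizes here are made up for illustration):

using Optim

# hypothetical objective taking two matrices
f(A, B) = sum(abs2, A) + sum(abs2, B .- 1)

# helpers to move between the pair of matrices and one flat vector
pack(A, B) = vcat(vec(A), vec(B))
unpack(x) = (reshape(@view(x[1:6]), 2, 3), reshape(@view(x[7:10]), 2, 2))

x0 = pack(zeros(2, 3), zeros(2, 2))
res = optimize(x -> f(unpack(x)...), x0)     # Nelder-Mead on R^10
A_opt, B_opt = unpack(Optim.minimizer(res))  # back to matrix form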

In practice, yes. In principle, all of these could broaden their arguments to accept AbstractArray (or even to duck-type, I guess).

You mean have them do the mapping themselves? Internally all the algorithms are defined on \mathbb{R}^n anyway, so I expect that some conversion to <: AbstractVector would be needed for all the algorithms to work out (eg think about a dogleg step in trust region methods).

OptimKit.jl supports variables of arbitrary type without the need for reshaping and all that, but is otherwise much more bare-bones and limited. You have to specify the gradient explicitly (though you could still use AD yourself to compute it), and only a small number of optimization methods are available. But it might be sufficient for your problem.
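
Roughly, usage looks like this (a minimal sketch with a made-up quadratic objective; fg returns the value and the gradient, and the variable keeps its matrix shape throughout):

using OptimKit

B = [1.0 2.0; 3.0 4.0]

function fg(A)
    f = 0.5 * sum(abs2, A - B)
    g = A - B  # gradient with the same matrix shape as A
    return f, g
end

A, fA, gA, numfg, history = optimize(fg, zeros(2, 2), GradientDescent())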

Even algorithms that construct linear operators, e.g. BFGS, could I guess be formulated in terms of multi-linear algebra, but I agree that such algorithms would probably want to call vec internally for convenience. There are other algorithms, however, such as nonlinear conjugate-gradient, that only require a vector space with an inner product. (Dog-leg steps fall into the latter camp, no?)

The variables of a smooth optimization problem can be elements of arbitrary smooth manifolds, they don’t even need to be vectors. Gradient based optimization then requires a metric, to map the partial derivatives of a scalar function on that manifold (a vector in the cotangent space) to an element of the tangent space (a direction in which to move on the manifold), together with a retraction to actually take a finite step in such a given direction. I think that’s the most general formulation and that’s the one I was aiming for with OptimKit.jl.
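
To make the metric and retraction concrete, here is a hand-rolled sketch (not OptimKit code) of gradient descent on the unit sphere: projecting the gradient onto the tangent space uses the Euclidean metric, and renormalizing after the step is the retraction.

using LinearAlgebra

function sphere_descent(Q; steps = 1000, η = 0.01)
    x = normalize(randn(size(Q, 1)))
    for _ in 1:steps
        g = 2 * (Q * x)            # Euclidean gradient of x' * Q * x
        d = g - dot(g, x) * x      # project onto the tangent space at x
        x = normalize(x - η * d)   # retraction: step, then pull back onto the sphere
    end
    return x
end

Q = Symmetric([4.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 2.0])
x = sphere_descent(Q)  # approximates the eigenvector of the smallest eigenvalue of Q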

Yes, I think you are right. But I am not sure that simply extending to arrays (with a Euclidean metric, I guess) is worth it.

I think that if one wants to generalize, OptimKit.jl has the right level of generality.

That said, for the problem in this topic, I would still consider split & reshape, as it is very easy to do.

I'm pretty sure Optim supports general AbstractArrays. See eg this test, which optimizes over complex n x m matrices: https://github.com/JuliaNLSolvers/Optim.jl/blob/master/test/multivariate/manifolds.jl. If something does not work please open an issue.
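
For instance, a quick check along these lines (untested here) passes a Matrix directly to optimize:

using Optim

# minimize a simple quadratic in the entries of a 2x2 matrix variable
res = optimize(A -> sum(abs2, A .- 1), zeros(2, 2))
Optim.minimizer(res)  # should be close to a matrix of ones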

Can you read this discussion? I don't know if Optim doesn't support general AbstractArrays very well, or if the problem is in my function itself.
https://discourse.julialang.org/t/no-method-matching-isless-array-float64-2-array-float64-2/48086