Problem using Optim.optimize

Ujku_KU · January 24, 2020, 6:51am

Hi guys, I am trying to translate a piece of code from Matlab to Julia, please see following first the original Matlab code:

options = optimset('Largescale', 'off', 'MaxFunEvals', 500, 'GradObj','on', 'Display','notify'); 
xp1=fminunc(@stage2_rbf, x, options, X, K, cp, alpha, rho);

and now the Julia version:

xp1 = optimize(x -> stage2_rbf(x, X, K, cp, alpha, rho; nargout=2), x, BFGS())

where x is a vector, X is a matrix, K is a kernel matrix and so on.

Am I doing something wrong here guys?

Thank you very much in advance for your time and support.

Cheers.

Ergnoor

baggepinnen · January 24, 2020, 8:33am

You have not told us if you get an error when running your code. We can not try it out either since you did not provide a minimal working example.

Ujku_KU · January 28, 2020, 12:29am

@baggepinnen Hi Fredrik and thank you for the interest shown to my question.

The code in general is a little bit long and hence I try to be sure every bit of it is tested before being placed in it. That means I try to replicate each bit in isolation and test if it runs in REPL. I tried to replicate this part of the code in isolation (using a simple function of two vector variables) and run it, but it always came up with some quite difficult errors for me to understand (I do not know very well Julia). And hence I decided to ask the question: Is that line of Julia code conceptually wright or not.

Hence reformulating my question: How you would have translated those two lines of Matlab code in some Julia code?

Thank you very much again for your time and support Fredrik.

Cheers.

Ergnoor

aaowens · January 28, 2020, 2:26am

Does your stage2_rbf function actually work? Hard to say without seeing any code.

If this works

fun(x) = stage2_rbf(x, X, K, cp, alpha, rho; nargout=2)
fun(x)

Then you should be able to do

xp1 = optimize(fun, x, BFGS())

Ujku_KU · January 28, 2020, 5:34am

@aaowens Hi Andrew and thank you very much for your help.

I see that you are proposing the alternative solution that I found online when I was searching to solve my problem and this confirms, I believe, that even the solution that I propose should be working. If not I will try with your proposed solution and see how that goes.

Anyway very soon I will return hopefully to confirm the solution and not with some error to debug when I will run the whole code.

Thank you very much once again for your support I appreciate it very much.

Cheers.

Ergnoor

Ujku_KU · January 30, 2020, 2:43am

Hi guys, unfortunately after removing all the other bugs I arrived at this bug that goes inside ‘gradients.jl’:

ERROR: LoadError: MethodError: no method matching -(::Tuple{Array{Float64,2},Array{Float64,2}}, ::Tuple{Array{Float64,2},Array{Float64,2}})
Stacktrace:
 [1] #finite_difference_gradient!#21(::Float64, ::Float64, ::Bool, ::Function, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::getfield(PreImageProblems, Symbol("#f_stage2_rbf#1")){Array
{Float64,2},Array{Float64,2}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::DiffEqDiffTools.GradientCache{Nothing,LinearAlgebra.Transpose{Float64,Array{Float64,1}},Array{Float64,2},Va
l{:central},Float64,Val{true}}) at /home/ujku_ku/.julia/packages/DiffEqDiffTools/uD0fb/src/gradients.jl:234
 [2] finite_difference_gradient! at /home/ujku_ku/.julia/packages/DiffEqDiffTools/uD0fb/src/gradients.jl:188 [inlined]
 [3] (::getfield(NLSolversBase, Symbol("#g!#15")){getfield(PreImageProblems, Symbol("#f_stage2_rbf#1")){Array{Float64,2},Array{Float64,2}},DiffEqDiffTools.GradientCache{Nothing,LinearAlgebra.Tr
anspose{Float64,Array{Float64,1}},Array{Float64,2},Val{:central},Float64,Val{true}}})(::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) a
t /home/ujku_ku/.julia/packages/NLSolversBase/NsXIC/src/objective_types/oncedifferentiable.jl:56
 [4] (::getfield(NLSolversBase, Symbol("#fg!#16")){getfield(PreImageProblems, Symbol("#f_stage2_rbf#1")){Array{Float64,2},Array{Float64,2}}})(::LinearAlgebra.Transpose{Float64,Array{Float64,1}}
, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/ujku_ku/.julia/packages/NLSolversBase/NsXIC/src/objective_types/oncedifferentiable.jl:60
 [5] value_gradient!!(::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/ujku_ku/.julia/packages/NLSolversBase/NsXIC/src/interface.jl:82
 [6] initial_state(::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/ujku_ku/.julia/packages/Optim/EhyUl/src/multivariate/solvers/first_order/bfgs.jl:66
 [7] #optimize#93 at /home/ujku_ku/.julia/packages/Optim/EhyUl/src/multivariate/optimize/optimize.jl:33 [inlined]
 [8] optimize(::Function, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}) at /home/ujku_ku/.julia/packages/Optim/EhyUl/src/multivariate/optimize/interface.jl:115 (repeats 2 times)
 [9] myKPCA(::Array{Float64,2}, ::Int64, ::Array{Int64,1}, ::UnitRange{Int64}) at /home/ujku_ku/.julia/dev/PreImageProblems/src/pre_image_problem.jl:99
 [10] top-level scope at none:0
in expression starting at /home/xxxxxx_xx/.julia/dev/PreImageProblems/src/polyDataDenoisExp.jl:239

I have a minimal working example, if you are willing to have a look I can post it.

Thank you in advance for your time and support.

Cheers.

Ergnoor

longemen3000 · January 30, 2020, 4:11am

Let me look at it, maybe I can give it a shot, at first glance, the gradient is not working, maybe providing a OnceDifferentiable Struct would be better

Ujku_KU · January 30, 2020, 4:22am

@longemen3000 Hi Andres and thank you for helping me with my problem. Following is the code:


using MultivariateStats, ForwardDiff, StatsBase, MLKernels, MAT, Random, SparseArrays, Plots, Optim, LinearAlgebra, Distributions, Random

file8 = matopen("digit8.mat") # loading the dataset
FA8 = read(file8)
close(file8)
FA8_Data = FA8["true"]
rng = MersenneTwister(1234) # fixing the seed for the random generation
variance = 1 # the variance of the Normal Distribution used to generate the noise
X = FA8_Data + rand(Normal(0, sqrt(variance)), size(FA8_Data)) # adding noise to images
rho_m = 1 # the constant for the exp(-norm2(x,y)/rho_m) RBF kernel
L = [4] # the vector that contains the number of principal eigenvectors for the subspace used in each run/experiment (in our case is just one experiment with 4 principal eigenvectors)
Xp1 = myKPCA(X, rho_m, L, 1:size(X, 1))












function myKPCA(X::Array{Float64,2},
                rho_m::Int64,
                nL_vector::Array{Int64,1},
                whidx::UnitRange{Int64})
    # myKPCA KPCA with Gaussian RBF kernel
    #
    # 2 methods: (1) Scholkopf, and (2) Countour gradient


    N, n = size(X) # N: sample size; n: dimension
    oneOverN = 1/N
    ones1n = ones(1, n)
    Ntest = length(whidx) # whidx = 1:size(X, 1)

    # inter-distances
    norm2 = zeros(N, N)
    for i in 1:N
        for j in (i+1):N
            norm2[i, j] = transpose(X[i, :] - X[j, :]) * (X[i, :] - X[j, :])
            norm2[j, i] = norm2[i, j]
        end
    end

    # rho = mean_component_variance * rho_m
    mean_comp_var = mean(var(X, dims=1))
    rho = mean_comp_var * size(X, 2) * rho_m # the same with rho = sum(var(X, dims=1)) * rho_m


    # compute the kernel matrix
    K = zeros(N, N)
    for i in 1:N
        for j in i:N
            K[i, j] = exp(-norm2[i, j]/rho)
            K[j, i] = K[i, j]
        end
    end

    # centering the kernel matrix
    unit = ones(N, 1)
    unit2 = ones(N, N) # buffer
    Kone = K * unit
    oneKone = transpose(unit) * Kone # buffer
    oneKoneUnit2 = oneOverN^2 .* oneKone .* unit

    K_tilde = K - unit2 * K / N - K * unit2 / N + unit * oneKone * transpose(unit) / N^2
    K_tilde = min.(K_tilde, transpose(K_tilde))
    # K_tilde = min.(K_tilde, transpose(K_tilde), dims=1)

    # find eigenvalues and eigenvectors of the centered kernel matrix
    F = LinearAlgebra.eigen(K_tilde)
    evecs = F.vectors
    evals = real(F.values)



    # denoising by 2 different methods
    # loop over different choices of the number of rettained eigenvectors
    for vv in 1:length(nL_vector)
        @show vv
        nL = nL_vector[vv] # 'nL_vector' is the vector that contains the number of the principal vectors used in every run

        large_idx = N - nL .+ (1:nL) # grab the last nL index to be used later to grab the last nL eigenvalues in 'evals'
        alphaA = evecs[:, large_idx] # grab the principal eigenvectors
        levals = evals[large_idx] # grab the eigenvalues corresponding to the primcipal eigenvectors
        alphaA = alphaA * Diagonal(levals .^ (-0.5)) # normalization of principal eigenvectors (e-vectors in feature space)

        Xp1 = zeros(Ntest, n)

        for ii in 1:Ntest
            i = whidx[ii]

            # x = X[i, :]
            x = transpose(X[i, :])
            Kx = zeros(N, 1)
            for j in 1:N
                Kx[j, i] = exp(-(x - transpose(X[j, :])) * transpose(x - transpose(X[j, :]))/rho)
            end
            Kx_tilde = Kx - 1/N .* K * unit - 1/N .* unit * transpose(unit) * Kx + 1/(N^2) .* (transpose(unit) * K * unit) .* unit
            cpK = (transpose(Kx_tilde) * alphaA)

            # optimization

            # xp1 = optimize(x -> stage2_rbf(x, X, K, cp, alphaA, rho; nargout=2), x, BFGS()) # 
            f_stage2_rbf(x) = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)
            xp1 = optimize(f_stage2_rbf, x, BFGS())
            Xp1[ii, :] = transpose(xp1)
        end

    return Xp1
end



function stage2_rbf(x::Transpose{Float64,Array{Float64,1}},
                    X::Array{Float64,2},
                    K::Array{Float64,2},
                    cpK::Array{Float64,2},
                    alphaA::Array{Float64,2},
                    rho::Float64;
                    nargout=1)
    N = size(X, 1) # the number of samples
    n = size(X, 2) # the number of features for each sample
    Kx = zeros(N, 1)
    for i in 1:N
        Kx[i, 1] = exp(-(x - transpose(X[i, :])) * transpose(x - transpose(X[i, :]))/rho)
    end
    unit = ones(N, 1)
    JKx = -2/rho * (Kx * ones(1, n)) .* (unit * x - X)

    Kx_tilde = Kx - 1/N * K * unit - 1/N * unit * transpose(unit) * Kx + 1/(N^2) * (transpose(unit) * K * unit) .* unit
    JKx_tilde = JKx - 1/N * unit * transpose(unit) * JKx

    f = 1 .- 2/N * transpose(unit) * Kx + 1/(N^2) * transpose(unit) * K * unit - 2 * (transpose(Kx_tilde) * alphaA) * transpose(cpK)
    if nargout > 1 
        gradf = -2/N * transpose(transpose(unit) * JKx) - 2 * (transpose(JKx_tilde) * alphaA) * transpose(cpK)
    end

    return f, gradf
end


function stage2_rbf(x::Array{Float64,1},
                    X::Array{Float64,2},
                    K::Array{Float64,2},
                    cpK::Array{Float64,2},
                    alphaA::Array{Float64,2},
                    rho::Float64;
                    nargout=1)
    N = size(X, 1) # the number of samples
    n = size(X, 2) # the number of features for each sample
    Kx = zeros(N, 1)
    for i in 1:N
        Kx[i, 1] = exp(-transpose(x - transpose(X[i, :])) * (x - transpose(X[i, :]))/rho)
    end
    unit = ones(N, 1)
    JKx = -2/rho * (Kx * ones(1, n)) .* (unit * transpose(x) - X)

    Kx_tilde = Kx - 1/N * K * unit - 1/N * unit * transpose(unit) * Kx + 1/(N^2) * (transpose(unit) * K * unit) * unit
    JKx_tilde = JKx - 1/N * unit * transpose(unit) * JKx

    f = 1 - 2/N * transpose(unit) * Kx + 1/(N^2) * transpose(unit) * K * unit - 2 * (transpose(Kx_tilde) * alphaA) * cpK
    if nargout > 1 
        gradf = -2/N * transpose(transpose(unit) * JKx) - 2 * (transpose(JKx_tilde) * alphaA) * cpK
    end

    return f, gradf
end

I do not know how to upload the dataset ’ digit8.mat’, but it is a set of 1000 samples of handwritten number eight vectorized images (784 pixels vectorized). Hence the dimensions of the dataset is 1000x784 pixels. If you know any way how to upload please let me know so I upload them for you.

Once more thank you very much for your support.

Cheers.

Ergnoor

longemen3000 · January 30, 2020, 4:41am

What’s the return of stage2_rbf?
Seems like they return an array.
also, change the nargout block by:

if nargout > 1
  gradf =.... 
  return f, gradf
else
  return f
end

Ujku_KU · January 30, 2020, 4:50am

@longemen3000 I did those changes in my code, but still I get the same error message.

ERROR: LoadError: MethodError: no method matching -(::Tuple{Array{Float64,2},Array{Float64,2}}, ::Tuple{Array{Float64,2},Array{Float64,2}})
Stacktrace:
 [1] #finite_difference_gradient!#21(::Float64, ::Float64, ::Bool, ::Function, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::getfield(PreImageProblems, Symbol("#f_stage2_rbf#1")){Array
{Float64,2},Array{Float64,2}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::DiffEqDiffTools.GradientCache{Nothing,LinearAlgebra.Transpose{Float64,Array{Float64,1}},Array{Float64,2},Va
l{:central},Float64,Val{true}}) at /home/ujku_ku/.julia/packages/DiffEqDiffTools/uD0fb/src/gradients.jl:234
 [2] finite_difference_gradient! at /home/ujku_ku/.julia/packages/DiffEqDiffTools/uD0fb/src/gradients.jl:188 [inlined]
 [3] (::getfield(NLSolversBase, Symbol("#g!#15")){getfield(PreImageProblems, Symbol("#f_stage2_rbf#1")){Array{Float64,2},Array{Float64,2}},DiffEqDiffTools.GradientCache{Nothing,LinearAlgebra.Tr
anspose{Float64,Array{Float64,1}},Array{Float64,2},Val{:central},Float64,Val{true}}})(::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) a
t /home/ujku_ku/.julia/packages/NLSolversBase/NsXIC/src/objective_types/oncedifferentiable.jl:56
 [4] (::getfield(NLSolversBase, Symbol("#fg!#16")){getfield(PreImageProblems, Symbol("#f_stage2_rbf#1")){Array{Float64,2},Array{Float64,2}}})(::LinearAlgebra.Transpose{Float64,Array{Float64,1}}
, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/ujku_ku/.julia/packages/NLSolversBase/NsXIC/src/objective_types/oncedifferentiable.jl:60
 [5] value_gradient!!(::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/ujku_ku/.julia/packages/NLSolversBase/NsXIC/src/interface.jl:82
 [6] initial_state(::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/ujku_ku/.julia/packages/Optim/EhyUl/src/multivariate/solvers/first_order/bfgs.jl:66
 [7] #optimize#93 at /home/ujku_ku/.julia/packages/Optim/EhyUl/src/multivariate/optimize/optimize.jl:33 [inlined]
 [8] optimize(::Function, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}) at /home/ujku_ku/.julia/packages/Optim/EhyUl/src/multivariate/optimize/interface.jl:115 (repeats 2 times)
 [9] myKPCA(::Array{Float64,2}, ::Int64, ::Array{Int64,1}, ::UnitRange{Int64}) at /home/ujku_ku/.julia/dev/PreImageProblems/src/pre_image_problem.jl:99
 [10] top-level scope at none:0
in expression starting at /home/xxxx_xx/.julia/dev/PreImageProblems/src/polyDataDenoisExp.jl:239

longemen3000 · January 30, 2020, 4:52am

Of what type is, f? Is a number or an array?. The problem seems to be that the optim backend is incapable of sustracting two values of the form (matrixA1, matrixB1).

Ujku_KU · January 30, 2020, 4:54am

@longemen3000 as i remember it was a number in a vector, or a vector with just one element.

longemen3000 · January 30, 2020, 4:56am

Try returning just the number, first(f), f[1]

Ujku_KU · January 30, 2020, 4:58am

@longemen3000 should I apply this rule even to ‘gradf’ also

longemen3000 · January 30, 2020, 5:01am

Before following that approach, can you add this after f_stage2_rbf(x) = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)?

println(typeof(f_stage2_rbf(x)))

To really know what the function is returning

Ujku_KU · January 30, 2020, 5:06am

@longemen3000 it printed out this:

Tuple{Array{Float64,2},Array{Float64,2}}

longemen3000 · January 30, 2020, 5:07am

That’s the main problem, your function returns a tuple of two matrices, and optim accepts an scalar function as objective, now, knowing that, lets check the sizes, change that line for:

x1a, x2a=f_stage2_rbf(x) 
println((size(x1a), size(x2a)))

Ujku_KU · January 30, 2020, 5:09am

@longemen3000 so I should try with the procedure you proposed before is it?

longemen3000 · January 30, 2020, 5:14am

Maybe, but the sizes are important, if f is a, 1x1 matrix, the return value should be f[1,1], but if f has more elements, there is a problem in the underlying algorithm. Also, try not returning gradf, as optim is not using it

Ujku_KU · January 30, 2020, 5:16am

@longemen3000 I understand what you saying but I am almost sure that ‘f’ is a 1x1 matrix, and also I am almost sure that ‘gradf’ is a 784x1 matrix. I can find for sure the exact sizes in 10or 20 minutes.

Topic		Replies	Views
Performance comparison - Flux.jl's Adam vs Jax's Adam Performance question , package , performance , flux	33	3894	November 3, 2022
Optim.jl vs scipy.optimize once again Performance optim	54	8929	June 10, 2021
Why is this optimization problem so slow in Julia? In R it took 60s Optimization (Mathematical)	31	1980	May 22, 2021
Using optim.jl ( optimization ) - like fminsearch in matlab Optimization (Mathematical) optim	48	5891	November 2, 2017
Maximum Likelihood: Normal Linear Model Optimization (Mathematical)	31	4264	December 25, 2020

Problem using Optim.optimize

Related topics