Problem using Optim.optimize

Well, in that case, let’s give Optim a proper function that uses your gradient. For this case I recommend something like this:

function fg!(F, G, x)
    f, gradf = stage2_rbf(x)
    if !(G == nothing)
        for i = 1:length(x)
            G[i] = gradf[i, 1]
        end
    end
    if !(F == nothing)
        return f[1, 1]
    end
end
objective = OnceDifferentiable(only_fg!(fg!), x)
optimize(objective, x, BFGS()) 
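
If you want to check the fg! pattern on its own first, here is a minimal self-contained sketch on a toy quadratic (the names toy_fg! and x0 are made up for illustration, assuming a reasonably recent Optim.jl):

using Optim

# toy objective f(x) = sum(x.^2) with gradient 2x, only to exercise the fg! interface
function toy_fg!(F, G, x)
    if !(G == nothing)
        G .= 2 .* x          # write the gradient in place
    end
    if !(F == nothing)
        return sum(abs2, x)  # return the objective value
    end
end

x0 = [1.0, 2.0, 3.0]
res = optimize(Optim.only_fg!(toy_fg!), x0, BFGS())
Optim.minimizer(res) # approximately [0.0, 0.0, 0.0]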

Also, there are a lot of opportunities for improvement in that code, at least 100x at a glance. You could post the code here (in another thread) with the title “help me optimize my code”.
Edit: yeah, one end was missing

@longemen3000 I think you are missing one ‘end’


@longemen3000 Thank you very much for everything, Andres. I did not know that I could ask for help to optimize my code. I will make that request once I get the code running.


Check this for example

@longemen3000 Just to confirm: the dimensions of ‘f’ are 1x1 and the dimensions of ‘gradf’ are 784x1.


@longemen3000 I did run the code using only your last suggestion (NOT using the suggestion to return f[1, 1]) and I got this error:

ERROR: LoadError: UndefVarError: only_fg! not defined
Stacktrace:
 [1] myKPCA(::Array{Float64,2}, ::Int64, ::Array{Int64,1}, ::UnitRange{Int64}) at /home/xxxx_xx/.julia/dev/PreImageProblems/src/pre_image_problem.jl:102
 [2] top-level scope at none:0
in expression starting at /home/xxxx_xx/.julia/dev/PreImageProblems/src/polyDataDenoisExp.jl:239

I also suspected that the function only_fg! wasn’t defined, but I thought maybe it was part of OnceDifferentiable.

Change only_fg! to Optim.only_fg!, or NLSolversBase.only_fg!; maybe it is a name that is not exported.
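
If you want to check quickly whether a name is exported or just needs to be qualified, something like this in the REPL (a rough sketch):

using Optim
isdefined(Optim, :only_fg!) # true if the name is defined in the module
:only_fg! in names(Optim)   # true only if it is also exported
Optim.only_fg!              # the qualified form works whenever the name is defined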

@longemen3000 I did Optim.only_fg!(fg!) and it made the last error go away, but now I have another one:

ERROR: LoadError: MethodError: no method matching stage2_rbf(::LinearAlgebra.Transpose{Float64,Array{Float64,1}})
Closest candidates are:
  stage2_rbf(::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}, ::Float64; nargout) at /home/xxxx_xx/.julia/dev
/PreImageProblems/src/pre_image_problem.jl:313
  stage2_rbf(::Array{Float64,1}, ::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}, ::Float64; nargout) at /home/xxxx_xx/.julia/dev/PreImageProblems/src/pre_image_p
roblem.jl:208
Stacktrace:
 [1] fg!(::Float64, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/dev/PreImageProblems/src/pre_image_problem.
jl:189
 [2] (::getfield(NLSolversBase, Symbol("##61#62")){NLSolversBase.InplaceObjective{Nothing,typeof(PreImageProblems.fg!),Nothing,Nothing,Nothing},Float64})(::LinearAlgebra.Transpose{Float64,Array
{Float64,1}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/packages/NLSolversBase/NsXIC/src/objective_types/incomplete.jl:45
 [3] value_gradient!!(::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/packages/NLSolversBase/NsXIC/src/interface.jl:82
 [4] initial_state(::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/packages/Optim/EhyUl/src/multivariate/solvers/first_order/bfgs.jl:66
 [5] optimize(::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}) at /home/xxxx_xx/.julia/packages/Optim/EhyUl/src/multivariate/optimize/optimize.jl:33 (repeats 2 times)
 [6] myKPCA(::Array{Float64,2}, ::Int64, ::Array{Int64,1}, ::UnitRange{Int64}) at /home/xxxx_xx/.julia/dev/PreImageProblems/src/pre_image_problem.jl:103
 [7] top-level scope at none:0
in expression starting at /home/xxxx_xx/.julia/dev/PreImageProblems/src/polyDataDenoisExp.jl:239

@longemen3000 in your code you use:

function fg!(F, G, x)
    f, gradf = stage2_rbf(x)
...
end

but I do not have any function defined like that:

function stage2_rbf(x)
....
end

@longemen3000 should I replace:

function fg!(F, G, x)
    f, gradf = stage2_rbf(x)
...
end

with:

function fg!(F, G, x)
    f, gradf = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)
...
end

and if I have to do this, should I also change the signature of fg! like this:

function fg!(F, G, x, X, K, cpK, alphaA, rho; nargout=2)
    f, gradf = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)
...
end

It is not necessary to add all the arguments; in Julia, closures work very well. But yes, that was an error on my part:

function fg!(F, G, _x)
    f, gradf = stage2_rbf(_x, X, K, cpK, alphaA, rho; nargout=2) #uses the already defined variables
...
end
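
If it helps, this is roughly how a closure works in Julia (a tiny made-up example; the names make_affine, a, b and g are just for illustration):

function make_affine(a, b)
    g(x) = a * x + b   # g closes over a and b from the enclosing scope
    return g
end

h = make_affine(2.0, 3.0)
h(1.0) # 5.0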

Also, to speed up your code a little:

    for j in (i+1):N
        @views norm2[i, j] = transpose(X[i, :] - X[j, :]) * (X[i, :] - X[j, :])
        norm2[j, i] = norm2[i, j]
    end

When you write X[i, :], Julia creates a copy of that slice. The @view macro (for a single indexing expression) and @views (for a whole expression) avoid those extra allocations.
Also, your loops are big and the sizes are known in advance, so try adding the @inbounds macro before each for loop to skip the bounds checking.
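
For example, something like this (just a rough sketch with a made-up 100x10 matrix, not your actual data):

using LinearAlgebra

X = rand(100, 10)
norm2 = zeros(100, 100)

d = X[1, :] - X[2, :]        # both slices are copied before subtracting
@views d = X[1, :] - X[2, :] # with @views the slices are views, so only the result allocates

# @inbounds skips the bounds checks in a loop whose indices are known to be valid
@inbounds for i in 1:100, j in (i+1):100
    @views norm2[i, j] = dot(X[i, :] - X[j, :], X[i, :] - X[j, :])
    norm2[j, i] = norm2[i, j]
end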

@longemen3000 Hi Andres and thank you very much for everything.

I did as you suggested for stage2_rbf, but another bug came up:

ERROR: LoadError: UndefVarError: X not defined
Stacktrace:
 [1] fg!(::Float64, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/dev/PreImageProblems/src/pre_image_problem.
jl:195
 [2] (::getfield(NLSolversBase, Symbol("##61#62")){NLSolversBase.InplaceObjective{Nothing,typeof(PreImageProblems.fg!),Nothing,Nothing,Nothing},Float64})(::LinearAlgebra.Transpose{Float64,Array
{Float64,1}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/packages/NLSolversBase/NsXIC/src/objective_types/incomplete.jl:45
 [3] value_gradient!!(::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{
Float64,1}}) at /home/xxxx_xx/.julia/packages/NLSolversBase/NsXIC/src/interface.jl:82
 [4] initial_state(::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{
Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}) at /home/xxxx_xx/.julia/packag
es/Optim/EhyUl/src/multivariate/solvers/first_order/bfgs.jl:66
 [5] optimize(::OnceDifferentiable{Float64,LinearAlgebra.Transpose{Float64,Array{Float64,1}},LinearAlgebra.Transpose{Float64,Array{Float64,1}}}, ::LinearAlgebra.Transpose{Float64,Array{Float64,1}}, ::BFGS{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Nothing,Nothing,Flat}, ::Optim.Options{Float64,Nothing}) at /home/xxxx_xx/.julia/packages/Optim/EhyUl/src/multivariate/optimize/optimize.jl:33 (repeats 2 times)
 [6] myKPCA(::Array{Float64,2}, ::Int64, ::Array{Int64,1}, ::UnitRange{Int64}) at /home/ujku_ku/.julia/dev/PreImageProblems/src/pre_image_problem.jl:103
 [7] top-level scope at none:0
in expression starting at /home/xxxx_xx/.julia/dev/PreImageProblems/src/polyDataDenoisExp.jl:239

and line 195 in pre_image_problem.jl is the f, gradf = stage2_rbf(_x, X, K, cpK, alphaA, rho; nargout=2) call inside the function:

function fg!(F, G, _x)
    f, gradf = stage2_rbf(_x, X, K, cpK, alphaA, rho; nargout=2) #uses the already defined variables
...
end

Maybe I should explain how I included your code in mine:

function myKPCA(X::Array{Float64,2},
                rho_m::Int64,
                nL_vector::Array{Int64,1},
                whidx::UnitRange{Int64})
    # myKPCA KPCA with Gaussian RBF kernel
    #
    # 2 methods: (1) Scholkopf, and (2) Contour gradient


    N, n = size(X) # N: sample size; n: dimension
    oneOverN = 1/N
    ones1n = ones(1, n)
    Ntest = length(whidx) # whidx = 1:size(X, 1)

    # inter-distances
    norm2 = zeros(N, N)
    for i in 1:N
        for j in (i+1):N
            norm2[i, j] = transpose(X[i, :] - X[j, :]) * (X[i, :] - X[j, :])
            norm2[j, i] = norm2[i, j]
        end
    end

    # rho = mean_component_variance * rho_m
    mean_comp_var = mean(var(X, dims=1))
    rho = mean_comp_var * size(X, 2) * rho_m # the same with rho = sum(var(X, dims=1)) * rho_m


    # compute the kernel matrix
    K = zeros(N, N)
    for i in 1:N
        for j in i:N
            K[i, j] = exp(-norm2[i, j]/rho)
            K[j, i] = K[i, j]
        end
    end

    # centering the kernel matrix
    unit = ones(N, 1)
    unit2 = ones(N, N) # buffer
    Kone = K * unit
    oneKone = transpose(unit) * Kone # buffer
    oneKoneUnit2 = oneOverN^2 .* oneKone .* unit

    K_tilde = K - unit2 * K / N - K * unit2 / N + unit * oneKone * transpose(unit) / N^2
    K_tilde = min.(K_tilde, transpose(K_tilde))
    # K_tilde = min.(K_tilde, transpose(K_tilde), dims=1)

    # find eigenvalues and eigenvectors of the centered kernel matrix
    F = LinearAlgebra.eigen(K_tilde)
    evecs = F.vectors
    evals = real(F.values)



    # denoising by 2 different methods
    # loop over different choices of the number of retained eigenvectors
    for vv in 1:length(nL_vector)
        @show vv
        nL = nL_vector[vv] # 'nL_vector' is the vector that contains the number of the principal vectors used in every run

        large_idx = N - nL .+ (1:nL) # grab the last nL index to be used later to grab the last nL eigenvalues in 'evals'
        alphaA = evecs[:, large_idx] # grab the principal eigenvectors
        levals = evals[large_idx] # grab the eigenvalues corresponding to the principal eigenvectors
        alphaA = alphaA * Diagonal(levals .^ (-0.5)) # normalization of principal eigenvectors (e-vectors in feature space)

        Xp1 = zeros(Ntest, n)

        for ii in 1:Ntest
            i = whidx[ii]

            # x = X[i, :]
            x = transpose(X[i, :])
            Kx = zeros(N, 1)
            for j in 1:N
                Kx[j, i] = exp(-(x - transpose(X[j, :])) * transpose(x - transpose(X[j, :]))/rho)
            end
            Kx_tilde = Kx - 1/N .* K * unit - 1/N .* unit * transpose(unit) * Kx + 1/(N^2) .* (transpose(unit) * K * unit) .* unit
            cpK = (transpose(Kx_tilde) * alphaA)

            # optimization

            # xp1 = optimize(x -> stage2_rbf(x, X, K, cp, alphaA, rho; nargout=2), x, BFGS()) # 
            f_stage2_rbf(x) = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)
      
            objective = OnceDifferentiable(Optim.only_fg!(fg!), x)
            xp1 = optimize(objective, x, BFGS())


            # xp1 = optimize(f_stage2_rbf, x, BFGS())
            Xp1[ii, :] = transpose(xp1)
        end

    end # end of the for vv loop
    return Xp1
end

function fg!(F::Any,
             # F::Float64,
             # F::Array{Float64,2},
             G::Any,
             # G::Array{Float64,1},
             # _x::Array{Float64,1}
             _x::Any)
    f, gradf = stage2_rbf(_x, X, K, cpK, alphaA, rho; nargout=2)
    if !(G == nothing)
        for i=1:length(x)
            G[i] = gradf[i, 1]
        end
    end
    if !(F == nothing)
        return f[1,1]
    end
end

Do you think it is better to use the function

f_stage2_rbf(x) = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)

instead of

stage2_rbf(x, X, K, cp, alphaA, rho; nargout=2)

in function fg! like this:

function fg!(F::Any,
             # F::Float64,
             # F::Array{Float64,2},
             G::Any,
             # G::Array{Float64,1},
             # _x::Array{Float64,1}
             _x::Any)
    f, gradf = f_stage2_rbf(_x)
    if !(G == nothing)
        for i=1:length(x)
            G[i] = gradf[i, 1]
        end
    end
    if !(F == nothing)
        return f[1,1]
    end
end

Once more thank you very much for your patience and support.

Cheers.

Ergnoor

Give me a moment, I will modify your code with the correction (and some other things).

Try this code:

using LinearAlgebra #to use the dot function
using Statistics #for mean and var
using Optim #for OnceDifferentiable, only_fg!, optimize and BFGS
function myKPCA(X::Matrix{Float64}, #Matrix{Float64} is equivalent to Array{Float64,2}
    rho_m::Int, #Int is equivalent to Int64
    nL_vector::Vector{Int}, #Vector{T} is equivalent to Array{T,1}
    whidx::UnitRange{Int64})
# myKPCA KPCA with Gaussian RBF kernel
#
# 2 methods: (1) Scholkopf, and (2) Contour gradient


N, n = size(X) # N: sample size; n: dimension
oneOverN = 1/N
ones1n = ones(1, n)
Ntest = length(whidx) # whidx = 1:size(X, 1)

# inter-distances
norm2 = zeros(N, N)
cache_x = zeros(N) # a one-dimensional Float64 vector, to cache the operations
for i in 1:N
    view_xi = @view X[i, :] #a view of the vector X in those indices, it allocates less
    for j in (i+1):N  
        view_xj = @view X[j, :] #a view of the vector X in those indices, it allocates less
        cache_x .= view_xi .- view_xj #the dots are element-wise operations. the .= is inplace assignment
        norm2[i, j] = dot(cache_x,cache_x) #better notation, also cache_x⋅cache_x
        norm2[j, i] = norm2[i, j]
    end
end

# rho = mean_component_variance * rho_m
mean_comp_var = mean(var(X, dims=1))
rho = mean_comp_var * size(X, 2) * rho_m # the same with rho = sum(var(X, dims=1)) * rho_m


# compute the kernel matrix
K = zeros(N, N)
for i in 1:N, j in i:N #multiple loops in one line are possible
    K[i, j] = exp(-norm2[i, j]/rho)
    K[j, i] = K[i, j]
end

# centering the kernel matrix
unit = ones(N, 1)
unit2 = ones(N, N) # buffer
Kone = K * unit
oneKone = transpose(unit) * Kone 
oneKoneUnit2 = oneOverN^2 .* oneKone .* unit

K_tilde = K - unit2 * K / N - K * unit2 / N + unit * oneKone * transpose(unit) / N^2
K_tilde = min.(K_tilde, transpose(K_tilde))
# K_tilde = min.(K_tilde, transpose(K_tilde), dims=1)

# find eigenvalues and eigenvectors of the centered kernel matrix
F = LinearAlgebra.eigen(K_tilde)
evecs = F.vectors
evals = real(F.values)

# denoising by 2 different methods
# loop over different choices of the number of retained eigenvectors
for vv in 1:length(nL_vector)
    @show vv
    nL = nL_vector[vv] # 'nL_vector' is the vector that contains the number of the principal vectors used in every run
    large_idx = N - nL .+ (1:nL) # grab the last nL index to be used later to grab the last nL eigenvalues in 'evals'
    alphaA = evecs[:, large_idx] # grab the principal eigenvectors
    levals = evals[large_idx] # grab the eigenvalues corresponding to the principal eigenvectors
    alphaA = alphaA * Diagonal(levals .^ (-0.5)) # normalization of principal eigenvectors (e-vectors in feature space)
    Xp1 = zeros(Ntest, n)
    for ii in 1:Ntest
        i = whidx[ii]
        # x = X[i, :]
        x = transpose(X[i, :])
        Kx = zeros(N, 1)
        for j in 1:N
            Kx[j, i] = exp(-(x - transpose(X[j, :])) * transpose(x - transpose(X[j, :]))/rho)
        end
        Kx_tilde = Kx - 1/N .* K * unit - 1/N .* unit * transpose(unit) * Kx + 1/(N^2) .* (transpose(unit) * K * unit) .* unit
        cpK = (transpose(Kx_tilde) * alphaA)

        # optimization

        # xp1 = optimize(x -> stage2_rbf(x, X, K, cp, alphaA, rho; nargout=2), x, BFGS()) # 
        f_stage2_rbf(x) = stage2_rbf(x, X, K, cpK, alphaA, rho; nargout=2)
        
#== In programming, this is called a closure, because it captures external variables.
f_stage2_rbf is a closure that captures X, K, cpK, alphaA and rho, and fg! below is in
turn a closure over f_stage2_rbf and x. This is important because these functions have
to be defined here to properly capture the variables inside the loop. ==#
        function fg!(F, G, _x) 
          f, gradf = f_stage2_rbf(_x)
            if !(G == nothing)
                for i=1:length(x)
                    G[i] = gradf[i, 1]
                end
            end
            if !(F == nothing)
                return f[1,1]
            end
        end
        objective = OnceDifferentiable(Optim.only_fg!(fg!), x)
        res= optimize(objective, x, BFGS()) #optim returns a compound result, storing the minimizer and minimum

        #do you want the x vector that minimizes the objective?
        xp1 = Optim.minimizer(res)
        # xp1 = optimize(f_stage2_rbf, x, BFGS())
        Xp1[ii, :] = transpose(xp1)
    end

end #end of the for vv loop

return Xp1
end

I added some modifications, with comments explaining them.

@longemen3000 Hi Andres and thank you very much. One thing though: when I implemented your last solution (only the Optim part), I made a small change and it worked:

function fg!(F, G, _x)
    f, gradf = f_stage2_rbf(_x)
    if !(G == nothing)
        for i = 1:length(x)
            G[i] = gradf[i, 1]
        end
    end
    if !(F == nothing)
        return f[1, 1]
    end
end

instead of f, gradf = f_stage2_rbf(x) I put f, gradf = f_stage2_rbf(_x), to match function fg!(F, G, _x). Please edit that part so that I can mark it as the solution.

One more question here: what is the output of optimize(objective, x, BFGS())?

I expect it to be a vector, but it seems to be a Tuple{Array{Float64,2},Array{Float64,2}}. How can I remedy this? Is the first Array the initial position and the second Array the position after the optimization?

On the other hand, when I implemented your suggestion to speed up the code, I got the following error:

ERROR: LoadError: DimensionMismatch("array could not be broadcast to match destination")
Stacktrace:
 [1] check_broadcast_shape at ./broadcast.jl:456 [inlined]
 [2] check_broadcast_axes at ./broadcast.jl:459 [inlined]
 [3] check_broadcast_axes at ./broadcast.jl:462 [inlined]
 [4] instantiate at ./broadcast.jl:258 [inlined]
 [5] materialize! at ./broadcast.jl:756 [inlined]
 [6] myKPCA(::Array{Float64,2}, ::Int64, ::Array{Int64,1}, ::UnitRange{Int64}) at /home/xxxx_xx/.julia/dev/PreImageProblems/src/pre_image_problem.jl:29
 [7] top-level scope at none:0
in expression starting at /home/xxxx_xx/.julia/dev/PreImageProblems/src/polyDataDenoisExp.jl:239

which is related to this line:

cache_x .= view_xi .- view_xj #the dots are element-wise operations. the .= is inplace assignment

Thank you very much for everything.

Cheers

Ergnoor

Check this: https://julianlsolvers.github.io/Optim.jl/stable/#user/minimization/#obtaining-results
In short, if you are looking for the vector that minimizes the function, use Optim.minimizer(res).
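
For example (a small sketch, assuming res is the result object returned by optimize):

res = optimize(objective, x, BFGS())
Optim.minimizer(res) # the x that minimizes the objective
Optim.minimum(res)   # the objective value at that point
Optim.converged(res) # whether the run converged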