Hello,
I am trying to fit a Conditional Random Field with Optim.jl by maximizing its average log-likelihood, which I compute with message passing. Since the parameters I am optimizing over are matrices, I have to supply the gradient functions myself. Each of the individual functions works when called with test data, but when I call optimize I get the following error:
MethodError: no method matching zero(::Type{Matrix{Float64}})
I’ve tried every suggestion I could find here and on StackExchange, to no avail.
Any pointers would be much appreciated!
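For what it's worth, I think I can reproduce the error outside of Optim.jl: Julia defines zero for a concrete matrix, but not for the Matrix{Float64} type itself, so calling zero on my vector-of-matrices starting point fails. My guess is that something inside optimize does exactly this, but I don't know how to work around it:

zero(zeros(321, 10))                   # fine: zero of a concrete matrix
zero([zeros(321, 10), zeros(10, 10)])  # MethodError: no method matching zero(::Type{Matrix{Float64}})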
using CSV, DataFrames, Optim, LogExpFunctions, BenchmarkTools
# Initial parameter matrices: 321×10 feature weights and 10×10 transition weights
original_feature_table = CSV.read("./model/feature-params.txt", CSV.Tables.matrix; header=false, transpose=true);
original_transition_table = CSV.read("./model/transition-params.txt", CSV.Tables.matrix; header=false, transpose=true);
train_count = 50
# get_actuals and get_inputs are defined elsewhere; they return vectors of
# label sequences and feature-vector sequences, respectively
train_actuals = get_actuals(train_count)
train_inputs = get_inputs(train_count)
function mp_calc_feature_difference(input, actual, feature_table, current_difference=zeros(321, 10))
	if isempty(input)
		return current_difference
	else
		in_sequence = copy(input)
		actual_sequence = copy(actual)
		actual_character = popfirst!(actual_sequence)

		# Per-position feature potentials and the model probability of the true character
		feature = popfirst!(in_sequence)
		feature_potentials = feature .* feature_table
		feature_Z = logsumexp(sum(feature_potentials, dims=1))
		unnormalized_feature_energy = sum(feature_potentials, dims=1)[actual_character]
		probability = exp(unnormalized_feature_energy - feature_Z)

		# Accumulate the gradient contribution in the true character's column
		current_difference[:, actual_character] += ones(321) - probability * feature

		mp_calc_feature_difference(in_sequence, actual_sequence, feature_table, current_difference)
	end
end
# Average the per-sequence feature differences over the training set
function feature_gradient(inputs, actuals, feature_table)
	count = length(inputs)
	grad_matrix = zeros(321, 10)
	for i in 1:count
		grad_matrix += mp_calc_feature_difference(inputs[i], actuals[i], feature_table)
	end
	return grad_matrix / count
end
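For reference, this is the kind of standalone call that works for me; the dummy data below is just to show the expected shapes (sequences of 321-element feature vectors and label sequences with values in 1:10):

dummy_inputs = [[rand(321) for _ in 1:5] for _ in 1:3]  # 3 sequences, 5 positions each
dummy_actuals = [rand(1:10, 5) for _ in 1:3]            # matching label sequences
feature_gradient(dummy_inputs, dummy_actuals, zeros(321, 10))  # returns a 321×10 matrix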
function mp_calc_transition_difference(actual, transition_table, previous_character=0, current_difference=zeros(10, 10))
	if isempty(actual)
		return current_difference
	else
		actual_sequence = copy(actual)
		current_character = popfirst!(actual_sequence)
		if previous_character == 0
			# First position has no incoming transition; just advance
			mp_calc_transition_difference(actual_sequence, transition_table, current_character, current_difference)
		else
			trans_potentials = transition_table[previous_character, :]
			unnormalized_trans_energy = trans_potentials[current_character]
			trans_Z = sum(trans_potentials)

			probability = exp(unnormalized_trans_energy - trans_Z)
			current_difference[previous_character, current_character] += 1 - probability

			mp_calc_transition_difference(actual_sequence, transition_table, current_character, current_difference)
		end
	end
end
# Average the per-sequence transition differences over the training set
function transition_gradient(actuals, transition_table)
	count = length(actuals)
	grad_matrix = zeros(10, 10)
	for i in 1:count
		grad_matrix += mp_calc_transition_difference(actuals[i], transition_table)
	end
	return grad_matrix / count
end
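The transition side checks out the same way on dummy labels:

dummy_actuals = [rand(1:10, 5) for _ in 1:3]
transition_gradient(dummy_actuals, randn(10, 10))  # returns a 10×10 matrix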
function g!(G, x, p)
	# Negate the gradients since f is the negative average log-likelihood
	G[1] = -feature_gradient(p[1], p[2], x[1])
	G[2] = -transition_gradient(p[2], x[2])
end
# mp_avg_loglikelihood is defined elsewhere and also works on test data
f(x, p) = -mp_avg_loglikelihood(p[1], p[2], x[1], x[2])
u0 = [original_feature_table, original_transition_table]
params = [train_inputs, train_actuals]
optimize(x -> f(x, params), (G, x) -> g!(G, x, params), u0)
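One workaround I have been wondering about, in case Optim.jl simply cannot optimize over a vector of matrices: flatten both parameter matrices into a single Float64 vector and unpack them inside the objective and gradient. A rough, untested sketch (pack, unpack, f_flat, and g_flat! are just names I made up):

pack(F, T) = vcat(vec(F), vec(T))  # 321*10 feature weights, then 10*10 transition weights
function unpack(x)
	F = reshape(x[1:3210], 321, 10)
	T = reshape(x[3211:3310], 10, 10)
	return F, T
end

function f_flat(x)
	F, T = unpack(x)
	return -mp_avg_loglikelihood(train_inputs, train_actuals, F, T)
end

function g_flat!(G, x)
	F, T = unpack(x)
	G[1:3210] .= vec(-feature_gradient(train_inputs, train_actuals, F))
	G[3211:3310] .= vec(-transition_gradient(train_actuals, T))
end

optimize(f_flat, g_flat!, pack(original_feature_table, original_transition_table))

Would something like this be the recommended approach, or is there a way to keep the matrices separate?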