LSTM Method Error - Time Series

I’m trying to train an LSTM model to predict the number of real roots of polynomials. x_train and x_test contain arrays of coefficient arrays such as [[-204, 20, 13, 1, 0]], where each inner array holds the coefficients of one polynomial. y_train and y_test contain the number of real roots of each polynomial, such as 1, 2, 5, … I’ve been stuck on this error for days. I’m trying to feed the input to the LSTM properly, but it seems I cannot do it. I’m trying to implement an LSTM model for forecasting like the one on this site: https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/

Here is my code:

using Flux: @epochs, throttle
using Flux

function input()
	## x_train: each line of the file is a Julia array literal of coefficients, e.g. [-204, 20, 13, 1, 0]
	lines = Tuple(readlines("/home/user/Desktop/train_x_data.txt"))
	x_train = []

	for i in lines
		push!(x_train, convert(Vector{Float32},eval(Meta.parse(i))))
	end

	x_train = Vector{Vector{Float32}}(x_train)

	## y_train: each line is the number of real roots of the corresponding polynomial
	lines = Tuple(readlines("/home/user/Desktop/train_y_data.txt"))
	y_train = []

	for i in lines
		push!(y_train, eval(Meta.parse(i)))
	end
	
	y_train = convert(Vector{Float32}, y_train)

	## x_test
	lines = Tuple(readlines("/home/user/Desktop/test_x_data.txt"))
	x_test = []

	for i in lines
		push!(x_test, convert(Vector{Float32},eval(Meta.parse(i))))
	end

	x_test = Vector{Vector{Float32}}(x_test)

	## y_test
	lines = Tuple(readlines("/home/user/Desktop/test_y_data.txt"))
	y_test = []

	for i in lines
		push!(y_test, eval(Meta.parse(i)))
	end
	
	y_test = convert(Vector{Float32}, y_test)

	return x_train, x_test, y_train, y_test
end

function LSTM_model(N,num_of_classes)
	model = Chain(LSTM(N,200),
		        Dropout(0.2),
		        LSTM(200,200),
		        Dropout(0.1),
		        Dense(200,101),
		        Dropout(0.1),
		        Dense(101,num_of_classes))
	return model
end

function eval_model(x, model)
	# run the model over the whole sequence, keep only the last output, then reset the hidden state
	output = [model(x) for x in x][end]
	reset!(model)
	output
end


function main()
	num_of_classes = 101
	num_epochs = 50

	x_train, x_test, y_train, y_test = input()

	N = size(x_train)[1]
	model = LSTM_model(N,num_of_classes)
	loss(x, y) = sum(Flux.Losses.binarycrossentropy(eval_model(x,model), y))
	ps = Flux.params(model)

	# use the ADAM optimizer. It's a pretty good one!
	opt = Flux.ADAM(0.001)

	@info("Training...")

 	# callback function during training
	evalcb() = @show(sum(loss.(x_test, y_test)))
	# I get the error at the line below
	@epochs num_epochs Flux.train!(loss, ps, zip(x_train, y_train), opt, cb = Flux.throttle(evalcb, 1))

	# after training, evaluate the loss
	println("Test loss after = ", sum(loss.(x_test, y_test)))
	
end

main()

Error Message I Get

ERROR: LoadError: MethodError: no method matching (::Flux.LSTMCell{Matrix{Float32}, Vector{Float32}, Tuple{Matrix{Float32}, Matrix{Float32}}})(::Tuple{Matrix{Float32}, Matrix{Float32}}, ::Float32)
Closest candidates are:
  (::Flux.LSTMCell{A, V, <:Tuple{AbstractMatrix{T}, AbstractMatrix{T}}})(::Any, ::Union{AbstractVector{T}, AbstractMatrix{T}, Flux.OneHotArray}) where {A, V, T} at ~/.julia/packages/Flux/BPPNj/src/layers/recurrent.jl:157

Thanks in advance!


All of the type params are obscuring the important part of the MethodError. Here’s what it says with them removed:

ERROR: LoadError: MethodError: no method matching (::Flux.LSTMCell)(_, ::Float32)

So instead of being passed an array, the LSTM is getting individual numbers.
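You can see where the stray Float32 comes from by iterating over one flat coefficient vector yourself (a hypothetical snippet, using an example row from the question):

v = Float32[-204, 20, 13, 1, 0]    # one x_train entry after conversion
first(v)                           # -204.0f0, a Float32 scalar, not a vector
[typeof(xi) for xi in v]           # every element is a Float32 scalar

So the comprehension [model(x) for x in x] in eval_model walks over scalars, and the LSTM cell has no method that accepts a single number.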

With this, you can work backwards. First, make sure x in eval_model is an array of arrays as expected by Flux. I’d also rename one of the xs in [model(x) for x in x] to avoid confusion.
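For example, something like this (just a sketch, assuming each polynomial is treated as a length-5 sequence with one feature per step, so the first LSTM layer would take input size 1; the names are only illustrative):

using Flux

# One polynomial: a flat vector of 5 coefficients.
coeffs = Float32[-204, 20, 13, 1, 0]

# Flux recurrent layers expect a sequence of feature vectors (or matrices),
# so wrap each coefficient in its own 1-element vector: 5 time steps, 1 feature each.
seq = [[c] for c in coeffs]           # Vector{Vector{Float32}}, 5 steps of length 1

m = Chain(LSTM(1, 200), Dense(200, 101))

# Feed the steps one at a time, keep the last output, then reset the hidden state.
out = [m(xi) for xi in seq][end]
Flux.reset!(m)

That way each coefficient is one time step, rather than one whole polynomial being iterated into single numbers.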

If everything looks good in eval_model, then move on to loss. Here you’ll want to make sure x and y are in the correct shape. It may be that train! is dividing your input data up in an undesirable way; if so, I’d recommend writing a custom training loop: Training · Flux.
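Roughly like this (only a sketch, reusing model, loss, x_train, y_train and num_epochs from your code above, so it is not a drop-in fix; the point is that you control exactly what each (x, y) pair looks like before the gradient call):

ps  = Flux.params(model)
opt = Flux.ADAM(0.001)

for epoch in 1:num_epochs
    for (x, y) in zip(x_train, y_train)
        # Inspect the shapes here, e.g. @show typeof(x) size(x), before taking the gradient.
        gs = Flux.gradient(() -> loss(x, y), ps)
        Flux.Optimise.update!(opt, ps, gs)
    end
    @info "Finished epoch $epoch"
end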


Thank you very much for your answer, I’ll apply your advice now.


Hi, @sherlock_holmes

I read your code and I found the blend and order of the layers in your model intriguing. I assume these were not chosen at random; there must be a reason for the Dropout() and Dense() layers to be placed where they are.

Would you mind explaining your reasoning a bit? I’m also using an LSTM for time-series prediction, but I’m still quite new to all of this. If there is a resource you used and found helpful, I’d appreciate it if you could share it.