I’m trying to write a model similar to the one in "Image-to-Markup Generation with Coarse-to-Fine Attention", where a fully convolutional network is first applied to an image (producing a smaller feature map with more channels, e.g. shape (80, 80, 256)), and a bidirectional LSTM is then applied to each row of that feature map.
I don’t know how to handle batches correctly, since this seems to require writing an explicit loop that accesses the rows in a shape-dependent way:
```julia
function (m::RowEncoder)(x)
    out = zeros(m.rows, m.columns, m.features)
    for i in 1:m.rows
        # Encode one row, then reset the recurrent state before the next row.
        out[i, :, :] = m.bidir_lstm(x[i, :, :])
        Flux.reset!(m.bidir_lstm)
    end
    return out
end
```
but this errors when called on a batch of shape (rows, columns, features, examples). Do I need to write a separate method for each input shape? Or is there a way to write a shape-generic function that also handles the BiLSTM resets?
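For concreteness, here is a sketch of the kind of shape-generic row iteration I have in mind, using `selectdim` so the same code works for both 3-D and 4-D inputs. It uses a plain `double` function as a stand-in for the BiLSTM (the reset call would go inside the encoder), since my question is really about the indexing, not Flux specifics:

```julia
# Sketch: shape-generic row iteration with selectdim, so the same code handles
# (rows, cols, features) and (rows, cols, features, batch) arrays.
# `encoder` stands in for the per-row BiLSTM application (plus reset).
function encode_rows(encoder, x::AbstractArray)
    slices = [encoder(selectdim(x, 1, i)) for i in axes(x, 1)]
    # Re-stack the per-row outputs along the first (row) dimension.
    return cat((reshape(s, 1, size(s)...) for s in slices)...; dims=1)
end

# Dummy "encoder" that just doubles its input, for illustration only.
double(x) = 2 .* x

y3 = encode_rows(double, ones(80, 80, 256))     # 3-D: size (80, 80, 256)
y4 = encode_rows(double, ones(80, 80, 256, 4))  # 4-D: size (80, 80, 256, 4)
```

But I’m not sure whether something like this is idiomatic, or whether it plays well with Flux’s state resets and gradients.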