This is my first significant Flux program, so please be gentle.
I have a model `m`. It takes two vectors as input (via a `Parallel(vcat...)` layer): the first is a vector of floats and the second is a one-hot vector. I can get it to work if I loop through the features one by one, but I’m trying to do a batch evaluation. So I generate two matrices of features: `m1 = [f1 f2 f3 ...]`, where `f1` etc. are the individual vectors of reals, and `m2 = [b1 b2 b3 ...]`, where `b1` etc. are the one-hot vectors.
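A minimal sketch of how two such batch matrices could be assembled so that each column is one sample. The sizes, the label set `classes`, and the names `fs` and `labels` are hypothetical and chosen only for illustration:

```julia
using Flux  # provides Flux.onehotbatch

# Hypothetical data: 5 samples, each with 3 real-valued features
fs = [rand(Float32, 3) for _ in 1:5]

# Hypothetical label set and per-sample labels for the one-hot part
classes = ["a", "b", "c", "d"]
labels  = ["a", "c", "b", "d", "a"]

# Each column of m1 is one feature vector, i.e. [f1 f2 f3 ...]
m1 = reduce(hcat, fs)

# Each column of m2 is one one-hot vector, i.e. [b1 b2 b3 ...]
m2 = Flux.onehotbatch(labels, classes)

size(m1)  # (3, 5)
size(m2)  # (4, 5)
```

Both matrices have the same number of columns, so column `i` of `m1` and column `i` of `m2` together describe sample `i`.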
When I do the following:
I get what I expect, a vector output of the right length.
When I do:
I get… a vector output, of the same length. Which is to say, I don’t get individual vectors corresponding to each of the features. I just get a single vector. And it doesn’t correspond to any of the vectors I would expect for the individual inputs.
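One way to check whether a model treats the columns of a batch as independent samples is to compare the batched output against the per-sample outputs. The sketch below uses a hypothetical stand-in for `m` (the branches and layer sizes are assumptions, not the actual model). With one sample per column, `m((m1, m2))` would normally return a matrix with one column per sample:

```julia
using Flux

# Hypothetical stand-in for m: 3 real features and 4 one-hot classes in, 2 outputs
m = Chain(
    Parallel(vcat, Dense(3 => 4), Dense(4 => 4)),  # branch outputs are stacked: 4 + 4 rows
    Dense(8 => 2),
)

m1 = rand(Float32, 3, 5)                   # 5 samples as columns
m2 = Flux.onehotbatch(rand(1:4, 5), 1:4)   # 5 one-hot columns

Y = m((m1, m2))
size(Y)  # (2, 5): one output column per sample

# Column i of the batched output should match the single-sample output
for i in 1:5
    yi = m((m1[:, i], m2[:, i]))
    @assert yi ≈ Y[:, i]
end
```

If a check like this fails, or the batched output is a plain vector, that suggests either the inputs are not laid out with one sample per column or some layer in the model is collapsing the batch dimension.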
I’m doing something wildly wrong, yes? Because in my training loop, the loss function gets called with `m2`, which propagates to the `m((m1, m2))` call, and then calculates a loss with the result, which is nonsensical. And, indeed, it doesn’t converge, unless I train with exactly one example in the training set.
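For comparison, a minimal sketch of a batched loss and a single training step, assuming a recent Flux version with the explicit-gradient API (`Flux.setup`, `Flux.update!`). The model, the data, the targets `y`, and the `loss` function are all hypothetical stand-ins. The point is only that the model output and the targets should have the same shape, with one column per sample:

```julia
using Flux

# Hypothetical stand-in model and data; sizes are illustrative only
m  = Chain(Parallel(vcat, Dense(3 => 4), Dense(4 => 4)), Dense(8 => 2))
m1 = rand(Float32, 3, 5)                   # real-valued features, 5 samples
m2 = Flux.onehotbatch(rand(1:4, 5), 1:4)   # one-hot features, 5 samples
y  = rand(Float32, 2, 5)                   # one target column per sample

# Hypothetical loss: mean squared error over the whole batch
loss(model, x1, x2, y) = Flux.mse(model((x1, x2)), y)

# Shape check: a mismatch here would make the loss meaningless
@assert size(m((m1, m2))) == size(y)

l = loss(m, m1, m2, y)                     # a single scalar for the batch

# One gradient step with Adam
opt_state = Flux.setup(Adam(), m)
grads = Flux.gradient(model -> loss(model, m1, m2, y), m)
Flux.update!(opt_state, m, grads[1])
```

Asserting the output shape before computing the loss is a cheap way to catch a batch that has been silently collapsed into a single vector.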
But I don’t have enough experience to see what I’m doing wrong; this appears to be what is recommended in the Flux documentation.