[DiffEqFlux.jl] DimensionMismatch("array could not be broadcast to match destination")

I’m trying to train a NeuralODE model on the MNIST dataset. When I take the gradient of loss function I get the DimensionMismatch error.

using DiffEqFlux, OrdinaryDiffEq, Flux, NNlib, MLDataUtils, Printf
using Flux: logitcrossentropy
using Flux.Data: DataLoader
using MLDatasets
using CUDA
using Random: seed!

function loadmnist(batchsize = bs, train_split = 0.9)
    # Use MLDataUtils LabelEnc for natural onehot conversion
    onehot(labels_raw) = convertlabel(LabelEnc.OneOfK, labels_raw,
    # Load MNIST
    imgs, labels_raw = MNIST.traindata();
    # Process images into (H,W,C,BS) batches
    x_data = Float32.(reshape(imgs, size(imgs,1), size(imgs,2), 1, size(imgs,3)))
    y_data = onehot(labels_raw)
    (x_train, y_train), (x_test, y_test) = stratifiedobs((x_data, y_data),
                                                         p = train_split)
    return (
        # Use Flux's DataLoader to automatically minibatch and shuffle the data
        DataLoader(gpu.(collect.((x_train, y_train))); batchsize = batchsize,
                   shuffle = true),
        # Don't shuffle the test data
        DataLoader(gpu.(collect.((x_test, y_test))); batchsize = batchsize,
                   shuffle = false)

# Main
const bs = 128
const train_split = 0.9
train_dataloader, test_dataloader = loadmnist(bs, train_split);

function DiffEqArray_to_Array(x)
    xarr = gpu(x)
    return reshape(xarr, size(xarr)[1:end-1])

down = Chain(Conv((3,3), 1=>64, pad=(0,0), relu),
        Conv((4,4), 64=>64, stride=2, pad=1, relu),
        Conv((4,4), 64=>64, stride=2, pad=1, relu)) |> gpu

convode_base= Chain(Conv((3,3), 64=>64, stride=1, pad=1, relu),
                Conv((3,3), 64=>64, stride=1, pad=1, relu),
                BatchNorm(64)) |> gpu

convode = NeuralODE(convode_base, (0.f0, 1.f0), Tsit5(),
           save_everystep = false,
           reltol = 1e-3, abstol = 1e-3,
           save_start = false) |> gpu;

fc = Chain(Dense(2304,10)) |> gpu;

model = Chain(down,
        x -> reshape(x, :, size(x, 4)),
        softmax) |> gpu;

loss(x, y) = logitcrossentropy(model(x), y)

img, lab = train_dataloader.data[1][:, :, :, 1:1], train_dataloader.data[2][:, 1:1]

When I run the following I get the DimensionMismatch error.

g = gradient(() -> loss(img, lab), params(model))

I couldn’t find what was causing this error. The forward pass model(img) works fine. The same error occurs when I try different training algorithms such as ADAM etc so I think the gradient is causing the error.

Full stack trace: ```DimensionMismatch("array could not be broadcast to match destination")S - Pastebin.com