Hi there,

I ran into an error while setting up a 3D CNN that I don't understand. The problem boils down to this:

While in 2D

```
using Flux
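# Input and target: 5×10 spatial, 2 channels, batch of 10
a, b = rand(5,10,2,10), rand(5,10,2,10)
# 3×3 kernels with padding 1 keep the spatial size unchanged
m = Chain(Conv((3,3), 2=>4, pad=(1,1)), Conv((3,3), 4=>2, pad=(1,1)))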
loss(x,y) = Flux.mse(m(x),y)
@time Flux.train!(loss, params(m), [[a,b]], ADAM(0.01))
```

works without problems, in 3D

```
# Same setup with one extra spatial dimension: 5×10×10 spatial, 2 channels, batch of 10
a2, b2 = rand(5,10,10,2,10), rand(5,10,10,2,10)
m2 = Chain(Conv((3,3,3), 2=>4, pad=(1,1,1)), Conv((3,3,3), 4=>2, pad=(1,1,1)))
loss2(x,y) = Flux.mse(m2(x),y)
@time Flux.train!(loss2, params(m2), [[a2,b2]], ADAM(0.01))
```

will result in a dimension mismatch during training (in the backward/gradient pass), even though the forward pass `loss2(a2,b2)` works.
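As far as I can tell, the shapes themselves should line up: with a kernel size of 3 and padding of 1 (stride 1), each convolution keeps the spatial size unchanged, so the model output should have the same size as the target in both cases. Here is a minimal sketch of that arithmetic in plain Julia, using the standard output-size formula. The helpers `conv_out` and `spatial_out` are names I made up for illustration; they are not part of Flux.

```julia
# Output length along one axis of a convolution (hypothetical helper, not a Flux function).
conv_out(n, k; pad=0, stride=1) = (n + 2pad - k) ÷ stride + 1

# Apply the same kernel size and padding to every spatial axis.
spatial_out(dims, k; pad=0) = map(n -> conv_out(n, k; pad=pad), dims)

@show spatial_out((5, 10), 3; pad=1)      # spatial dims of the 2D example
@show spatial_out((5, 10, 10), 3; pad=1)  # spatial dims of the 3D example
```

Both calls return the input sizes unchanged, which matches the fact that the forward pass runs, so the mismatch does not seem to come from my layer definitions.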

Am I doing something wrong here, or is this a bug?

Best,

Max