I am trying to use cuDNN.jl for GPU-accelerated convolution.

I used the function `cuDNN.cudnnConvolutionForward`. It has a keyword argument `format` that specifies the order of dimensions (`format=cuDNN.CUDNN_TENSOR_NHWC` or `format=cuDNN.CUDNN_TENSOR_NCHW`). However, since Julia arrays are column-major, the Julia dimensions have the opposite order, while my data is in real NCHW order (not in the opposite order).
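To make the mismatch concrete, here is a CPU-only sketch (no GPU required) of how the same NCHW data looks once its dimensions are reversed into the order cuDNN.jl works with:

```
# "Real" NCHW data: N=32 samples, C=16 channels, H=64, W=64
x_nchw = rand(Float32, 32, 16, 64, 64)

# Reversing the dimension order gives (W, H, C, N), which is what
# cuDNN.jl treats as NCHW due to Julia's column-major memory layout
x_whcn = permutedims(x_nchw, (4, 3, 2, 1))

println(size(x_whcn))  # (64, 64, 16, 32)
```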

I used `permutedims` as a workaround:

```
using CUDA, cuDNN

function conv_cudnn(x, w, b; stride=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1)
    # move data to the GPU and reverse the dimension order (NCHW -> WHCN)
    x = permutedims(CuArray(x), (4, 3, 2, 1))
    w = permutedims(CuArray(w), (4, 3, 2, 1))
    # the bias must be a 4-d tensor with its values along the channel dimension
    b = reshape(CuArray(b), (1, 1, length(b), 1))
    y = CUDA.@time cuDNN.cudnnConvolutionForward(w, x; bias=b, padding=padding,
        stride=stride, dilation=dilation, group=groups,
        reorderType=cuDNN.CUDNN_DEFAULT_REORDER, mode=cuDNN.CUDNN_CROSS_CORRELATION)
    # reverse the dimension order back (WHCN -> NCHW)
    return permutedims(y, (4, 3, 2, 1))
end
```

My inputs look like this:

```
# define inputs (real NCHW order)
x = rand(32, 16, 64, 64)  # (N, C, H, W)
w = rand(32, 8, 5, 5)     # (C_out, C_in ÷ groups, kH, kW)
b = rand(32)
```

Is there a better or faster way to use cuDNN with “real” NCHW order (e.g. without using `permutedims`)?

Best regards and thank you in advance!