I am trying to use cuDNN.jl for GPU-accelerated convolution, specifically the function cuDNN.cudnnConvolutionForward. The function has a keyword argument format that specifies the order of dimensions (format=cuDNN.CUDNN_TENSOR_NHWC or format=cuDNN.CUDNN_TENSOR_NCHW). However, since Julia arrays are column-major, the Julia dimensions are interpreted in the opposite order, while my data is laid out in real NCHW order (not the reversed one).
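To illustrate what I mean by the opposite order (this is my understanding of the convention, using the sizes from my example below): for format=cuDNN.CUDNN_TENSOR_NCHW, cuDNN.jl expects a Julia array of size (W, H, C, N):

x_for_cudnn = rand(Float32, 64, 64, 16, 32)  # Julia dims (W, H, C, N), read by cuDNN as NCHW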
I used permutedims as a work-around:
using CUDA, cuDNN

function conv_cudnn(x, w, b; stride=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1)
    # move to the GPU and reverse the dims: (N, C, H, W) -> (W, H, C, N),
    # which cuDNN.jl interprets as NCHW
    x = permutedims(CuArray(x), (4, 3, 2, 1))
    w = permutedims(CuArray(w), (4, 3, 2, 1))
    # the bias has to be a 4-d tensor with the channels in the third Julia dim
    b = reshape(CuArray(b), (1, 1, length(b), 1))
    y = CUDA.@time cuDNN.cudnnConvolutionForward(w, x; bias=b,
        padding=padding, stride=stride, dilation=dilation, group=groups,
        reorderType=cuDNN.CUDNN_DEFAULT_REORDER, mode=cuDNN.CUDNN_CROSS_CORRELATION)
    # reverse the dims back to the original (N, C, H, W) order
    return permutedims(y, (4, 3, 2, 1))
end
My inputs look like this:
# define inputs (real NCHW order)
x = rand(32, 16, 64, 64)  # (N, C, H, W)
w = rand(32, 8, 5, 5)     # (O, I ÷ groups, kH, kW)
b = rand(32)              # one bias per output channel
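For reference, this is how I call it (my own example; note that the shapes above only match with groups=2, since x has 16 channels and w has 8 input channels per group, and the output size is my own calculation):

y = conv_cudnn(x, w, b; groups=2)
size(y)  # (32, 32, 60, 60) in real NCHW order: 64 - 5 + 1 = 60 with no padding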
Is there a better or faster way to use cuDNN with “real” NCHW order (e.g. without using permutedims)?
Best regards and thank you in advance!