Is there a reason for Convolution (and other CUDNN-related functions) to be restricted to DenseCuArray, as found in the NNlibCUDA.jl repo?
For example, taking a view here results in an error, because the Conv function doesn't accept a view of a CuArray and CUDA falls back to scalar indexing:
using CUDA
using Flux
m = Conv((1,), 3 => 4) |> gpu
x1 = CUDA.rand(1,3,8);
m(x1); # works fine
x2 = CUDA.rand(1,4,8);
x3 = view(x2, :, 1:3, :); # a SubArray wrapping the CuArray
m(x3); # errors
ERROR: TaskFailedException
    nested task error: Scalar indexing is disallowed.
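As far as I can tell (assuming I'm reading the repo right), the dispatch-level reason is that the cuDNN conv! methods are only defined for DenseCuArray, which is just CUDA.jl's alias for a plain CuArray, so the SubArray returned by view never reaches them and falls through to NNlib's generic implementation. A quick check with the arrays from above:

x2 isa CUDA.DenseCuArray # true  -> would hit the cuDNN-backed methods
x3 isa CUDA.DenseCuArray # false -> x3 is a SubArray, so it misses those methods
                         #          and NNlib's generic conv scalar-indexes the GPU array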
Performing a Dense operation on a view works fine, however, since the Dense forward pass accepts AbstractArray input and doesn't rely on CUDNN.
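For instance, a minimal check along these lines runs on the GPU without the scalar-indexing fallback (names and sizes here are just for illustration):

d = Dense(3 => 4) |> gpu
X = CUDA.rand(4, 8)
Xv = view(X, 1:3, :)  # 3×8 SubArray of a CuArray
d(Xv)                 # fine: plain matmul/broadcast handles the strided view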
The actual use case is a custom dataloader in which the full dataset is stored as a CuArray, and I'd like it to provide the data batches as views to avoid allocations. Do CUDNN operators inherently forbid that kind of optimization?
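For context, a stripped-down sketch of the pattern I mean (hypothetical names, no shuffling, reusing the using CUDA / Flux setup and the model m from above): the whole dataset stays on the GPU and each batch is a zero-copy view into it.

struct ViewBatches{A<:CuArray}
    data::A        # (width, channels, nobs)
    batchsize::Int
end

function Base.iterate(it::ViewBatches, start=1)
    start > size(it.data, 3) && return nothing
    stop = min(start + it.batchsize - 1, size(it.data, 3))
    # view instead of getindex: no new GPU allocation, but the result is a SubArray
    return view(it.data, :, :, start:stop), stop + 1
end

# for xb in ViewBatches(CUDA.rand(1, 3, 1024), 8)
#     m(xb)  # currently hits the scalar-indexing error shown above
# end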