Is there a reason for convolution (and other CUDNN-related functions) to be restricted to `DenseCuArray`, as found in the NNlibCUDA.jl repo?
For example, taking a view of a `CuArray` results in an error, as CUDA falls back to scalar (CPU-style) indexing because the `Conv` function doesn't accept a view:

```julia
using CUDA
using Flux

m = Conv((1,), 3 => 4) |> gpu

x1 = CUDA.rand(1, 3, 8);
m(x1);  # works fine

x2 = CUDA.rand(1, 4, 8);
x3 = view(x2, :, 1:3, :);
m(x3);
ERROR: TaskFailedException
    nested task error: Scalar indexing is disallowed.
```
Performing a `Dense` operation on a view works fine, however, since `Dense` accepts `AbstractArray` inputs and doesn't rely on CUDNN.
The actual use case is that I'm using a custom dataloader in which the full dataset is stored as a `CuArray`, and I'd like it to serve the data batches as views to avoid allocations. Do the CUDNN operators inherently forbid such an optimization?
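In case it helps frame the question: a workaround I've considered is materializing each view into a single preallocated batch buffer with `copyto!`, so CUDNN always sees a dense array and only one buffer is ever allocated. A minimal sketch (the `dataset`, `buf`, and `batchsize` names and sizes are hypothetical, and this assumes one device-to-device copy per batch is acceptable):

```julia
using CUDA, Flux

m = Conv((1,), 3 => 4) |> gpu

# Hypothetical full dataset resident on the GPU.
dataset = CUDA.rand(Float32, 1, 3, 1024)

# One preallocated dense batch buffer, reused for every batch.
batchsize = 8
buf = CUDA.zeros(Float32, 1, 3, batchsize)

for i in 1:batchsize:size(dataset, 3)
    # Contiguous view -> dense buffer: a device-side copy,
    # no scalar indexing involved.
    copyto!(buf, view(dataset, :, :, i:i+batchsize-1))
    y = m(buf)  # Conv receives a DenseCuArray, so CUDNN is happy
end
```

But that still costs a copy per batch, which is what I was hoping views would let me avoid.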