`ConvTranspose` generates `CUDNN_STATUS_BAD_PARAM (code 3)`

I’m running into an issue with a 1D `ConvTranspose` model in Flux when I run it on the GPU.

The following runs fine on the CPU:

```julia
using Flux, CUDA

x = rand(Float32, 20, 13, 2)
model = ConvTranspose((7,), 13 => 5, relu; pad=(3,1), stride=2)
ŷ = model(x)
```

But this generates an error:

```julia
x = cu(rand(20, 13, 2))
model = ConvTranspose((7,), 13 => 5, relu; pad=(3,1), stride=2)
model = fmap(cu, model)
ŷ = model(x)
```

Any pointers as to what is going wrong here would be appreciated.

Here’s the error trace:

```
CUDNNError: CUDNN_STATUS_BAD_PARAM (code 3)
  [1] throw_api_error
    @ ~/.julia/packages/CUDA/GyIk9/lib/cudnn/error.jl:22
  [2] macro expansion
    @ ~/.julia/packages/CUDA/GyIk9/lib/cudnn/error.jl:39 [inlined]
  [3] cudnnConvolutionBackwardData
    @ ~/.julia/packages/CUDA/GyIk9/lib/utils/call.jl:26
  [4] #9
    @ ~/.julia/packages/NNlibCUDA/Oc2CZ/src/cudnn/conv.jl:69 [inlined]
  [5] with_workspace
    @ ~/.julia/packages/CUDA/GyIk9/lib/utils/call.jl:77
  [6] with_workspace (repeats 2 times)
    @ ~/.julia/packages/CUDA/GyIk9/lib/utils/call.jl:53 [inlined]
  [7] #∇conv_data!#7
    @ ~/.julia/packages/NNlibCUDA/Oc2CZ/src/cudnn/conv.jl:68
  [8] ∇conv_data!
    @ ~/.julia/packages/NNlibCUDA/Oc2CZ/src/cudnn/conv.jl:58 [inlined]
  [9] #∇conv_data#91
    @ ~/.julia/packages/NNlib/9FXPF/src/conv.jl:104 [inlined]
 [10] ∇conv_data
    @ ~/.julia/packages/NNlib/9FXPF/src/conv.jl:101
 [11] ConvTranspose
    @ ~/.julia/packages/Flux/Zz9RI/src/layers/conv.jl:268
 [12] testconv5_875
    @ ./REPL[38]:118
```

Thanks to @maleadt: I set the environment variable `JULIA_DEBUG=CUDNN` and found a more helpful error message.

The issue here is that the padding isn’t compatible with the convolution applied. I thought through my problem and ended up setting `pad=SamePad()`, which turns out to be `(3, 2, 0, 0)` in the above case.
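To make the shape mismatch concrete, here is the transposed-convolution output-length arithmetic (a sketch in Python for illustration; `convtranspose_out_len` is a hypothetical helper, not Flux or cuDNN API). With dilation 1, the output length is `(L_in - 1) * stride + kernel - (pad_left + pad_right)`:

```python
def convtranspose_out_len(L_in, kernel, stride, pad_left, pad_right):
    """Output length of a 1D transposed convolution (dilation = 1)."""
    return (L_in - 1) * stride + kernel - (pad_left + pad_right)

# The model above: kernel 7, stride 2, pad (3, 1), input length 20.
asym = convtranspose_out_len(20, 7, 2, 3, 1)  # CPU path, asymmetric padding
sym = convtranspose_out_len(20, 7, 2, 3, 3)   # cuDNN's symmetric fallback
print(asym, sym)  # 41 vs 39 -> output buffer mismatch, hence BAD_PARAM

# SamePad() picks total padding kernel - stride = 5, split as (3, 2),
# which recovers L_out = stride * L_in:
same = convtranspose_out_len(20, 7, 2, 3, 2)
print(same)  # 40 == 2 * 20
```

So the CPU path allocates a length-41 output, while cuDNN’s symmetric fallback expects length 39; with `SamePad()` both agree on 40.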

At some point I’d like to find some `@assert` or error statements that might be worth adding in Flux to check on these things.
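As a sketch of what such a check could look like (hypothetical, in Python rather than Julia for illustration): error out whenever the symmetric padding that cuDNN falls back to would change the output length.

```python
def check_cudnn_pad(L_in, kernel, stride, pad_left, pad_right):
    """Raise if cuDNN's symmetric-padding fallback would change the
    transposed-convolution output length (dilation = 1 assumed)."""
    def out_len(pl, pr):
        return (L_in - 1) * stride + kernel - (pl + pr)

    if out_len(pad_left, pad_right) != out_len(pad_left, pad_left):
        raise ValueError(
            f"asymmetric padding ({pad_left}, {pad_right}) is not supported "
            f"by cuDNN; the symmetric fallback changes the output length "
            f"from {out_len(pad_left, pad_right)} to {out_len(pad_left, pad_left)}"
        )
```

Running this with the parameters from the original model (`20, 7, 2, 3, 1`) raises, while any symmetric padding passes silently.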

These are the messages I get before the error without any env vars set; are you perhaps on an older version of Flux?

```
julia> ŷ = model(x)
┌ Warning: cuDNN does not support asymmetric padding; defaulting to symmetric choice
└ @ NNlibCUDA ~/.julia/packages/NNlibCUDA/Oc2CZ/src/cudnn/cudnn.jl:10
┌ Warning: No valid algorithm found, probably bad params for convolution.
└ @ CUDA.CUDNN ~/.julia/packages/CUDA/VGl9W/lib/cudnn/convolution.jl:225
```

P.S. Flux exports a `gpu` function that does `fmap(cu, ...)` under the hood, but gracefully degrades when no GPU is available and is smarter about not converting non-GPU-compatible fields.

Ah, thanks @ToucheSir, I didn’t know `gpu` was smarter about that.

Odd… I have only gotten those errors with the environment variable set. :thinking:

That’s odd, `@warn` should be on by default. Do you mind running a quick `] status`?


```
     Project SleepEvents v0.1.0
      Status `/JuliaProject/Project.toml`
  [fbe9abb3] AWS v1.54.0
  [1c724243] AWSS3 v0.8.6
  [28312eec] Alert v0.2.3
  [19f6540d] AlertPushover v0.1.2
  [cbdf2221] AlgebraOfGraphics v0.4.10
  [69666777] Arrow v1.6.2
  [052768ef] CUDA v3.4.1 `https://github.com/JuliaGPU/CUDA.jl.git#78379e1`
  [13f3f980] CairoMakie v0.6.3
  [717857b8] DSP v0.7.3
  [a93c6f00] DataFrames v1.2.2
  [1313f7d8] DataFramesMeta v0.8.0
  [31c24e10] Distributions v0.25.11
  [48062228] FilePathsBase v0.9.10
  [587475ba] Flux v0.12.6
  [d9f16b24] Functors v0.2.3
  [5903a43b] Infiltrator v1.0.3
  [741b9549] Legolas v0.2.3
  [eb5f792d] LegolasFlux v0.1.1
  [e853f5be] Onda v0.13.8
  [92933f4c] ProgressMeter v1.7.1
  [74087812] Random123 v1.4.2
  [e6cf234a] RandomNumbers v1.5.3
  [295af30f] Revise v3.1.19
  [1277b4bf] ShiftedArrays v1.0.0
  [2913bbd2] StatsBase v0.33.9
  [bd369af6] Tables v1.5.0
  [899adc3e] TensorBoardLogger v0.1.18
  [bb34ddd2] TimeSpans v0.2.3
  [28d57a85] Transducers v0.4.65
  [e88e6eb3] Zygote v0.6.19
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [56ddb016] Logging
  [9a3f8284] Random
  [cf7118a7] UUIDs
```

It’s possible the warnings were somehow lost for other reasons: I am running some of the tests on a cluster, so there are some finicky bits around output across the workers.


I’m not sure how logging messages work in a distributed context, so that may be it. If you can repro it in a fresh/temp env locally though, do file an issue.


A bit tangential, but someone should really open a PR against CUDA.jl to do manual padding instead.

The fallback to symmetric padding always causes a crash, because the caller supplies the output array, which naturally has the shape implied by the asymmetric padding.

Even if they sized it according to the fallback, the model builder might get surprised if they e.g. try to flatten the output into a dense layer and one size works on the CPU while another works on the GPU.
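For a plain (non-transposed) convolution, the manual-padding idea is straightforward: zero-pad the input explicitly so the backend only ever sees a padding-free op. A minimal sketch in Python with plain lists (hypothetical helpers, not CUDA.jl code):

```python
def conv1d_valid(x, w):
    """Plain 'valid' 1D cross-correlation: no padding at all."""
    n = len(x) - len(w) + 1
    return [sum(xi * wi for xi, wi in zip(x[i:i + len(w)], w))
            for i in range(n)]

def conv1d_asym_pad(x, w, left, right):
    """Emulate asymmetric padding by zero-padding the input explicitly,
    then running the padding-free convolution."""
    return conv1d_valid([0] * left + list(x) + [0] * right, w)
```

The output length is then `len(x) + left + right - len(w) + 1` regardless of backend, so CPU and GPU can never disagree on the shape.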
