Multi-tasking and threading are fairly recent additions to the Julia/CUDA stack, so some bugs are to be expected. Importantly, every task gets its own CUDNN handle, so you can’t leak handle-local data between tasks. But the backtrace here seems to point to where that handle gets created; I assume that’s the first CUDNN operation in a newly created task? (If not, something’s up with handle creation.) Maybe there’s a limit on how many handles we can create. 200 MB of free memory also isn’t much, so maybe creation fails because it runs out of memory, in which case we need to retry after running a GC iteration. You can try that with the following patch:
--- a/lib/cudnn/base.jl
+++ b/lib/cudnn/base.jl
@@ -1,6 +1,9 @@
 function cudnnCreate()
     handle = Ref{cudnnHandle_t}()
-    cudnnCreate(handle)
+    res = @retry_reclaim CUDNN_STATUS_INTERNAL_ERROR unsafe_cudnnCreate(handle)
+    if res != CUDNN_STATUS_SUCCESS
+        throw_api_error(res)
+    end
     return handle[]
 end
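
To make the idea behind the patch concrete, here is a minimal, CPU-only sketch of the retry-after-GC pattern. I'm assuming that `@retry_reclaim` roughly re-runs the call after a garbage collection pass whenever it returns the given status; the real macro may do more (for example, releasing cached GPU memory). Everything here (`retry_after_gc`, `fake_create`, the status constants) is hypothetical and only for illustration, not part of CUDA.jl or CUDNN.

```julia
# Hypothetical status codes standing in for a C API's return values.
const STATUS_SUCCESS = 0
const STATUS_INTERNAL_ERROR = 4

# Call `f`; while it keeps returning `retry_status`, run the garbage
# collector and try again, up to `max_attempts` calls in total.
function retry_after_gc(f, retry_status; max_attempts::Int = 3)
    res = f()
    for attempt in 2:max_attempts
        res == retry_status || break
        # Escalate: an incremental collection first, full collections after that.
        GC.gc(attempt > 2)
        res = f()
    end
    return res
end

# Fake "create" call that fails twice before succeeding, to exercise the retry path.
calls = Ref(0)
fake_create() = (calls[] += 1; calls[] < 3 ? STATUS_INTERNAL_ERROR : STATUS_SUCCESS)

res = retry_after_gc(fake_create, STATUS_INTERNAL_ERROR)
res == STATUS_SUCCESS || error("create failed with status $res")
```

If the failure persists even after a full collection, the status is returned unchanged and the caller throws, which mirrors the `throw_api_error(res)` branch in the patch.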