Illegal memory access problem CUDA

Jakub_Mitura · November 20, 2021, 5:28pm

I am creating some dynamic shared-memory Boolean arrays in a kernel, and it consistently gives me:

ERROR: LoadError: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
 [1] throw_api_error(res::CUDA.cudaError_enum)
   @ CUDA C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\error.jl:105
 [2] query
   @ C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\stream.jl:102 [inlined]
 [3] synchronize(stream::CuStream; blocking::Bool)
   @ CUDA C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\stream.jl:130
 [4] synchronize (repeats 2 times)
   @ C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\stream.jl:117 [inlined]
 [5] unsafe_copyto!(dest::Vector{UInt16}, doffs::Int64, src::CuArray{UInt16, 1, CUDA.Mem.DeviceBuffer}, soffs::Int64, n::Int64)
   @ CUDA C:\Users\1\.julia\packages\CUDA\9T5Sq\src\array.jl:389
 [6] copyto!
   @ C:\Users\1\.julia\packages\CUDA\9T5Sq\src\array.jl:349 [inlined]
 [7] getindex(xs::CuArray{UInt16, 1, CUDA.Mem.DeviceBuffer}, I::Int64)
   @ GPUArrays C:\Users\1\.julia\packages\GPUArrays\3sW6s\src\host\indexing.jl:89
 [8] top-level scope
   @ c:\GitHub\GitHub\NuclearMedEval\src\playgrounds\convolutionsPlay.jl:71

I am wondering what is wrong here. My assumption is that the true size of a Boolean array in bytes is the number of its entries divided by 8, so 1 bit per entry. Am I correct?


using CUDA
dataBdim= (32,24,32)
fp = CUDA.zeros(UInt16,1)
sumInBits = (dataBdim[1]+2)+(dataBdim[2]+2)+(dataBdim[3]+2)+dataBdim[1]+dataBdim[2]+dataBdim[3]
shmemSum = cld(sumInBits,8)#in bytes
function testKernelA(dataBdim,fp)
    resShmem = @cuDynamicSharedMem(Bool,((dataBdim[1]+2),(dataBdim[2]+2),(dataBdim[3]+2)))
    sourceShmem = @cuDynamicSharedMem(Bool,(dataBdim[1],dataBdim[2],dataBdim[3]))
    # naive loop just for presentation of problem
    for i in 1:(dataBdim[1]+2),j in 1:(dataBdim[2]+2), n in 1:(dataBdim[3]+2)
        resShmem[i,j,n]=false
    end

    for i in 1:(dataBdim[1]),j in 1:(dataBdim[2]), n in 1:(dataBdim[3])
        sourceShmem[i,j,n]=false
    end
    fp[1]=1
    return
end
@cuda threads=(32,5) blocks=(2) shmem=shmemSum  testKernelA(dataBdim,fp)
fp[1]

maleadt · November 22, 2021, 7:04am

How is this ‘in bits’ if you’re nowhere multiplying by sizeof(UInt16) (or 8*sizeof if you actually want this size to be bits)?

Jakub_Mitura · November 22, 2021, 6:07pm

You are right, but changing it to

sumInBits = (dataBdim[1]+2)*(dataBdim[2]+2)*(dataBdim[3]+2)+dataBdim[1]*dataBdim[2]*dataBdim[3]

does not solve the problem. Is sizeof needed if this is a Boolean array? Is it not just a bit array?

maleadt · November 23, 2021, 6:55am

There’s still no sizeof in that expression? And it is needed: CuArray{Bool} doesn’t have the same bitarray-like optimization implemented.

Also, try out CUDA.jl#master; there, the dynamic memory accesses are bounds-checked, so they will throw a BoundsError instead of crashing CUDA with an illegal memory access.
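
A minimal sketch of the size arithmetic discussed in this thread, runnable on the CPU without a GPU. It assumes one byte per Bool element, as stated above; the names resDims, resBytes, srcBytes, shmemBytes and srcOffset are illustrative and do not appear in the original post. Two further points are assumptions worth checking against your CUDA.jl version and device: @cuDynamicSharedMem takes an optional third argument giving a byte offset (without it, both arrays would start at the same base address), and many GPUs cap dynamic shared memory at 48 KiB per block by default.

```julia
dataBdim = (32, 24, 32)
resDims = dataBdim .+ 2   # the padded array from the kernel: (34, 26, 34)

# What the original post requested: dimensions added together, then treated as bits.
originalBytes = cld(sum(resDims) + sum(dataBdim), 8)

# Bool occupies one byte per element, so bytes = element count * sizeof(Bool).
resBytes = prod(resDims) * sizeof(Bool)
srcBytes = prod(dataBdim) * sizeof(Bool)
shmemBytes = resBytes + srcBytes

# Byte offset at which the second array would start if both share one buffer,
# e.g. (assumed usage) @cuDynamicSharedMem(Bool, dataBdim, srcOffset)
srcOffset = resBytes

println("originally requested: ", originalBytes, " bytes")
println("actually needed:      ", shmemBytes, " bytes")
println("exceeds 48 KiB:       ", shmemBytes > 48 * 1024)
```

With these dimensions the two Bool arrays need more than 48 KiB between them, so correcting shmem alone may not be enough; smaller blocks or a packed layout would likely be required as well.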

Jakub_Mitura · November 23, 2021, 9:23am

OK, so can I use a bit type in shared memory?

And what do you mean by master? I suppose you deduced that I am using some branch, which I did not intend. I found somewhere that the shared memory initialization macro should now be a function. Is this what you mean?

Thanks!

carstenbauer · November 23, 2021, 10:01am

The master branch on GitHub, i.e. ] add CUDA#master

You’re probably on the latest stable release (if you didn’t do anything fancy).

Jakub_Mitura · November 23, 2021, 10:09am

OK, thanks.

Jakub_Mitura · November 24, 2021, 6:00am

So I think I understand it now. Still, is there a way to use a 3-dimensional bit array in shared memory? It would be extremely useful.

maleadt · November 24, 2021, 7:22am

No, the BitArray optimization has not been implemented for CuArray. Just use a regular Bool array. If space is a problem, you’ll need to look into implementing BitArray’s packed layout.
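
One way to approach the packed layout mentioned above is to store bits in UInt32 words and map each 3D index to a word and a bit position. The sketch below is plain CPU Julia and is not CUDA.jl API: nwords, bitpos, getbit and setbit! are hypothetical helper names introduced for illustration. It assumes column-major ordering, like Julia arrays.

```julia
# Number of UInt32 words needed to hold nbits bits.
nwords(nbits) = cld(nbits, 32)

# Map a 1-based 3D index to a 1-based word index and a 0-based bit position.
@inline function bitpos(dims::NTuple{3,Int}, i, j, k)
    lin = (i - 1) + dims[1] * ((j - 1) + dims[2] * (k - 1))  # 0-based, column-major
    return (lin >> 5) + 1, lin & 31
end

# Read the bit stored for element (i, j, k).
@inline function getbit(words, dims, i, j, k)
    w, b = bitpos(dims, i, j, k)
    return ((words[w] >> b) & one(UInt32)) == one(UInt32)
end

# Set or clear the bit stored for element (i, j, k).
@inline function setbit!(words, dims, i, j, k, v::Bool)
    w, b = bitpos(dims, i, j, k)
    mask = one(UInt32) << b
    words[w] = v ? (words[w] | mask) : (words[w] & ~mask)
    return words
end

# Usage on the CPU with the padded dimensions from the thread.
dims = (34, 26, 34)
words = zeros(UInt32, nwords(prod(dims)))
setbit!(words, dims, 3, 5, 7, true)
println(getbit(words, dims, 3, 5, 7))
println(sizeof(words), " bytes instead of ", prod(dims), " for a Bool array")
```

In a kernel, the word buffer would be a dynamic shared array of UInt32 rather than a Vector. Note that setbit! is a plain read-modify-write, so threads updating different bits of the same word at the same time would race; that case would need atomic operations or a scheme where each word is owned by a single thread.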
