General question on GPU Programming and on how to use low level C API

Olivier_Gagnon · May 15, 2020, 2:09pm

Hey everyone,

I just started learning GPU programming (and I am also rather new to Julia). I am currently working through CUDA by Example, reproducing everything in Julia.

I have two questions (everything is executed in a Jupyter Notebook):

First, I have the following code for a simple Hello World example.

using CUDAnative, CUDAdrv

function hello_world()
    @cuprintf("Hello World from the GPU\n")
    return
end

If I run @cuda hello_world() in a different cell, there is no output until I call synchronize().

When I run @cuda hello_world() inside the same cell, however, I do get output, but if I change the string I need to run the cell twice to see the new string. Again, if I add synchronize() this doesn't happen (the new string gets printed the first time I run the cell).
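For reference, the variant that prints right away looks roughly like this (a minimal sketch; it assumes synchronize is the no-argument function exported by CUDAdrv, and it needs a CUDA-capable GPU to run):

```julia
using CUDAnative, CUDAdrv

# Kernel: prints one line from the GPU.
function hello_world()
    @cuprintf("Hello World from the GPU\n")
    return
end

# Launch the kernel on the GPU.
@cuda hello_world()

# Wait for the GPU to finish; with this call in the cell,
# the output shows up the first time the cell is run.
synchronize()
```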

Not sure I understand what is happening here…

Second, I am trying to call the low-level C API function cuDeviceGetProperties. For this, I need arguments (prop, dev) of type (Ptr{CUdevprop}, CUdevice). For the device, I know I can get it with CuDevice(0), but I have no idea how to obtain prop…

I tried defining my own struct with fields similar to those in the book (which would likely not work anyway, since the fields have probably changed with newer versions of CUDA), and it fails.

I did something similar to this:

using CUDAdrv

struct CUdevprop
    # (define fields here)
end

prop = Ref{CUdevprop}()
CUDAdrv.cuDeviceGetProperties(prop,CuDevice(0))

Any help would be greatly appreciated

Thanks

R366Y · October 14, 2020, 6:22pm

Hi Olivier,
Regarding question 2: to get the device properties you can use the CUDA.attribute function, e.g.:

using CUDA

function print_gpu_properties()

    for (i,device) in enumerate(CUDA.devices())
        println("*** General properties for device $i ***")
        name = CUDA.name(device)
        println("Device name: $name")
        major = CUDA.attribute(device, CUDA.CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR)
        minor = CUDA.attribute(device, CUDA.CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR)
        println("Compute capabilities: $major.$minor")
        clock_rate = CUDA.attribute(device, CUDA.CU_DEVICE_ATTRIBUTE_CLOCK_RATE)
        println("Clock rate: $clock_rate")
        device_overlap = CUDA.attribute(device, CUDA.CU_DEVICE_ATTRIBUTE_GPU_OVERLAP)
        print("Device copy overlap: ")
        println(device_overlap > 0 ? "enabled" : "disabled")
        kernel_exec_timeout = CUDA.attribute(device, CUDA.CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT)
        print("Kernel execution timeout: ")
        println(kernel_exec_timeout > 0 ? "enabled" : "disabled")
    end
end
