Hello!
I am doing some GPU programming in Julia using CUDA.jl.
Instead of writing my own kernels, I am following the recommended approach of using the commonly available array functions that CUDA.jl supports, such as broadcasting with “.”, “map”, etc.
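For concreteness, the code I write looks roughly like this (a toy sketch, not my actual program; the sizes and operations are just placeholders):

using CUDA

a = CUDA.rand(Float32, 1_000_000)   # random data already on the GPU
b = CUDA.rand(Float32, 1_000_000)

c = a .+ 2f0 .* b    # fused broadcast, compiled into a GPU kernel behind the scenes
d = map(sin, c)      # map on a CuArray also runs as a GPU kernel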
I noticed that when using custom kernels one would write something akin to:
@cuda threads=x blocks=y
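For example, a small made-up launch of that kind might look like this (the kernel and the numbers are just placeholders, not my actual code):

using CUDA

function axpy_kernel!(y, a, x)   # hypothetical kernel: y[i] += a * x[i], one element per thread
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] += a * x[i]
    end
    return nothing
end

x = CUDA.rand(Float32, 10_000)
y = CUDA.rand(Float32, 10_000)

threads = 256                      # threads per block, picked by hand here
blocks  = cld(length(y), threads)  # enough blocks to cover the whole array
@cuda threads=threads blocks=blocks axpy_kernel!(y, 2f0, x)

Here I just picked 256 threads per block by hand, which brings me to my questions.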
So, first question: how do I know how many threads and blocks I should use?
Second question: for the programming I do above using these built-in array functions, how do I ensure that it makes maximum use of the GPU resources available?
Kind regards