Domains   GPU

CuArray and Optim (10)
GPUArrays, 64-32bit conversions, and Cassette.jl (9)
Value Function Iteration on GPU (4)
LLVM crash when running Flux and CuArray examples in Julia 0.7 (14)
Flux: GPU slower than CPU? (8)
CLBlast, a tuned OpenCL BLAS library (7)
CUDAdrv cannot find __host__ __device__ functions (6)
What is the recommended type <: Integer to use when doing index arithmetic? (4)
Packing structs for OpenCL (2)
Sequence of warp and how to avoid divergence when folding shared memory in a reduction kernel (4)
Constant Memory? (12)
Generic Kernels for CLArrays (2)
Launching Julia via "julia -p 8" fails to load CUDAnative library (4)
Calling CUBLAS GEMM in Julia 0.6 (5)
Optimizing column reduce with CUDAnative (6)
What is the optimal way of updating CuArray? (8)
What is the maximal number of arguments a CUDAnative kernel can take? argc = 16 yields "Error: invalid kernel call; too many arguments" (7)
Store CuArrays on a mutable struct? (6)
Can I change the nvcc location in CUDAnative? (6)
Mapping ThreadIdx().x to a 5D array? (9)
Strange behaviour of @cuprintf? (4)
Problem with CUDAintrinsic pow: pow(y[1,1],2.0)? (3)
How to write device code? (4)
Best way to call an OpenCL kernel with arguments of type CLArray (7)
Multiple GPUs with Julia (3)
LLVM LoadError: Permission Denied (EACCES) (5)
Initializing @cuStaticSharedMem array? (4)
Stack overflow on cuda (10)
Understanding GPU Kernels (5)
Ccall: could not find function ** in library /usr/local/cuda-7.0/lib64/ (2)