Multiprecision arithmetic in CUDA

RE: CAMPARY or something similar
Is this already implemented, or can it be done in Julia and/or Python?

links to 2018 source code for multiprecision arithmetic in CUDA

link to CAMPARY source

The purpose is to get multiprecision arithmetic on the GPU (32 bits is not enough).
I would like to use this if it is available, or help get it going.


This just works:

using CUDAnative, CuArrays, DoubleFloats

T = Double64
# T = Float64

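# Launch the kernel on one block of 32 threads and wait for it to finish.
function sum_plus_mul!(c, a, b)
   @cuda blocks = 1 threads = 32 kernel(c, a, b)
   CUDAnative.synchronize()
end

# Each thread starts at its own index and strides by the block size,
# so the single block covers an array of any length.
function kernel(c, a, b)
   i = threadIdx().x
   while i <= length(c)
       c[i] = a[i] + b[i] + a[i] * b[i]
       i += blockDim().x
   end
   return
end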
a = rand(T, 10) |> CuArrays.CuArray;
b = rand(T, 10) |> CuArrays.CuArray;
c = similar(a);

sum_plus_mul!(c, a, b)
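
For context on why Double64 can go into a GPU array at all, here is a minimal CPU-only sketch, assuming only that the DoubleFloats package is installed. It checks that Double64 is a plain-bits type the size of two Float64 values, and that it keeps a small term that Float64 rounds away. The variable names are illustrative and not part of any API.

```julia
using DoubleFloats

# A plain-bits type has a fixed memory layout and holds no pointers,
# which is what allows it to be stored directly in a device array.
println(isbitstype(Double64))

# Double64 occupies the space of two Float64 values.
println(sizeof(Double64), " bytes vs ", sizeof(Float64), " bytes")

# The spacing between representable numbers near 1 is much finer.
println(eps(Double64) < eps(Float64))

# Float64 cannot represent 1 + 1e-20, so the small term is rounded away.
f64_keeps_term = (1.0 + 1e-20) > 1.0

# Double64 keeps the small term in its low-order part.
d64_keeps_term = (Double64(1) + Double64(1e-20)) > Double64(1)

println("Float64 keeps the term:  ", f64_keeps_term)
println("Double64 keeps the term: ", d64_keeps_term)
```

Note that Double64 is extended precision (roughly twice the significand of Float64), not arbitrary precision, so whether it is enough depends on how many digits the application needs.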