Another question: does Turing support sampling on the GPU (e.g. via CUDA)? I have played around with it but have not been successful (see the example below).
However, posts like this one seem to indicate that it’s possible (and should just work out of the box?).
Attempted CUDA code
@model function model_gaussian2(claims, n)
    μ ~ Normal(0.05, 0.1)
    σ ~ Exponential(0.25)
    claims .~ Binomial.(n, logistic.(μ))
end
where claims and n are CuArrays of integers:
mg = model_gaussian2(CuArray(claims_summary.claims),CuArray(claims_summary.n))
cg = sample(mg, NUTS(), 500)
Results in:
InvalidIRError: compiling kernel #broadcast_kernel#17(CUDA.CuKernelContext, CUDA.CuDeviceVector{Float64, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(StatsAPI.loglikelihood), Tuple{Base.Broadcast.Extruded{CUDA.CuDeviceVector{Distributions.Binomial{Float64}, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Extruded{CUDA.CuDeviceVector{Int64, 1}, Tuple{Bool}, Tuple{Int64}}}}, Int64) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to .text)
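For comparison, here is a minimal CPU-only sketch of the same model, using hypothetical synthetic data in place of claims_summary (which isn't shown above). As far as I can tell this samples fine, which suggests the failure is specific to broadcasting the Binomial loglikelihood over CuArrays rather than to the model itself:

using Turing, Distributions, StatsFuns  # logistic comes from StatsFuns

# Hypothetical synthetic data standing in for claims_summary
n_cpu      = rand(50:200, 100)                          # exposure counts per row
claims_cpu = [rand(Binomial(n, 0.05)) for n in n_cpu]   # observed claim counts

@model function model_gaussian2(claims, n)
    μ ~ Normal(0.05, 0.1)
    σ ~ Exponential(0.25)
    claims .~ Binomial.(n, logistic.(μ))
end

# Plain Vectors instead of CuArrays
mg_cpu = model_gaussian2(claims_cpu, n_cpu)
cg_cpu = sample(mg_cpu, NUTS(), 500)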