Following this discussion I have a question about the possibility of using DEDataArrays on GPU. Currently, I can not convert a DEDataArray A
simply as CuArray(A)
, since the CuArray
function will omit all extra fields of A
apart of A.x
. Also, defining A
as a DEDataArray with A.x
being the CuArray does not help since all broadcast operations with A
will result in slow scalar operations. In the following piece of code I try to illustrate the above considerations:
using CUDA
using DiffEqBase
mutable struct MyArray{T, N} <: DEDataArray{T, N}
x :: Array{T, N}
y :: Array{T, 1}
z :: T
end
a = MyArray(ones((2,2)), ones(2), 1.0)
b = MyArray(zeros((2,2)), zeros(2), 0.0)
@show a.x, a.y, a.z # (a.x, a.y, a.z) = ([1.0 1.0; 1.0 1.0], [1.0, 1.0], 1.0)
@show b.x, b.y, b.z # (b.x, b.y, b.z) = ([0.0 0.0; 0.0 0.0], [0.0, 0.0], 0.0)
# Broadcasting copies only the x component of DEDataArrays:
@. b = a
@show b.x, b.y, b.z # (b.x, b.y, b.z) = ([1.0 1.0; 1.0 1.0], [0.0, 0.0], 0.0)
# Copy the rest of the DEDataArrays components:
DiffEqBase.copy_fields!(b, a)
@show b.x, b.y, b.z # (b.x, b.y, b.z) = ([1.0 1.0; 1.0 1.0], [1.0, 1.0], 1.0)
# ------------------------------------------------------------------------------
# CuArray omits the extra components of DEDataArrays:
agpu = CuArray(a)
@show typeof(agpu) # typeof(agpu) = CuArray{Float64,2}
# ------------------------------------------------------------------------------
mutable struct MyCuArray{T, N} <: DEDataArray{T, N}
x :: CuArray{T, N}
y :: CuArray{T, 1}
z :: T
end
a = MyCuArray(CUDA.ones(2), CUDA.ones(2), 1.0f0)
b = MyCuArray(CUDA.zeros(2), CUDA.zeros(2), 0.0f0)
@. b = a # broadcasting results in slow scalar operations
DiffEqBase.copy_fields!(b, a) # copy_fields! is still fast
Are there any simple workaround for this issue? Can it be solved by redefining CuArray
converter or broadcast styles for DEDataArrays? Or the only solution is to define new, let say, DEDataCuArrays which subtype AbstractGPUArrays?