Great!

I finished the first implementation. I added a `MaxValidIndex` field to my struct:
```julia
using Parameters, StaticArrays   # @with_kw and SVector

@with_kw struct ∇ᵢWᵢⱼStruct{T,ST}
    NL::Base.RefValue{Int}
    MaxValidIndex::Base.RefValue{Int} = deepcopy(NL)
    # Input for calculation
    xᵢⱼ   ::AbstractVector{SVector{ST,T}} = zeros(SVector{ST,T}, NL[])
    xᵢⱼ²  ::AbstractVector{T}             = similar(xᵢⱼ, T, NL[])
    dᵢⱼ   ::AbstractVector{T}             = similar(xᵢⱼ, T, NL[])
    qᵢⱼ   ::AbstractVector{T}             = similar(xᵢⱼ, T, NL[])
    ∇ᵢWᵢⱼ ::AbstractVector{SVector{ST,T}} = similar(xᵢⱼ, NL[])
end
```
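For context, constructing it looks roughly like this (a minimal sketch; `Float64`, the dimension `3` and the length `100` are just placeholder values, and the keyword constructor is the one `@with_kw` from Parameters.jl generates):

```julia
# Placeholder construction: 3D SVectors of Float64, initial neighbour-list length 100.
buffers = ∇ᵢWᵢⱼStruct{Float64,3}(NL = Ref(100))

buffers.NL[]             # 100
buffers.MaxValidIndex[]  # 100 (its own Ref, deepcopied from NL at construction)
length(buffers.xᵢⱼ)      # 100
```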
And the `resize!` function is now as follows:
```julia
function Base.resize!(object::∇ᵢWᵢⱼStruct, N::Integer)
    if N > object.NL[]
        # New size exceeds the allocated length: grow every buffer and zero it.
        object.NL[]            = N
        object.MaxValidIndex[] = N   # keep the valid range in sync with the new size
        for P in propertynames(object)
            arr = getfield(object, P)
            if isa(arr, AbstractVector)
                resize!(arr, N)
                fill!(arr, zero(eltype(arr)))
            end
        end
    else
        # New size fits in the existing buffers: no allocation, just mark the valid
        # range and zero the first N entries (xᵢⱼ is left untouched).
        object.MaxValidIndex[] = N
        for P in propertynames(object)
            arr = getfield(object, P)
            if P !== :xᵢⱼ && isa(arr, AbstractVector)
                fill!(@view(arr[1:N]), zero(eltype(arr)))
            end
        end
    end
    return nothing
end
```
So basically, if the new N fits in the old array, we do not allocate a new one, but instead “clean” the values up to the new N (`MaxValidIndex`). All my functions have now been changed to respect `MaxValidIndex`, and I know that the range 1:`MaxValidIndex` will always hold the real values; everything above it is “trash”.
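As a rough sketch of that pattern (not my actual kernel; `compute_q!` and `h` are just placeholders), a shrinking `resize!` keeps the old buffers and consumers simply stop at `MaxValidIndex`:

```julia
resize!(buffers, 80)        # shrink: no allocation, first 80 entries zeroed
buffers.MaxValidIndex[]     # 80
length(buffers.dᵢⱼ)         # still 100 — the old buffer is reused

# Placeholder consumer: touches only the valid range 1:MaxValidIndex, never the "trash" tail.
function compute_q!(object::∇ᵢWᵢⱼStruct, h)
    for i in 1:object.MaxValidIndex[]
        object.qᵢⱼ[i] = object.dᵢⱼ[i] / h
    end
    return nothing
end
```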
This also did not give me any noticeable regression in performance, so this is awesome. Thank you very much @Henrique_Becker for the idea, since re-allocating all the time would be devastating for my code, i.e. imagine 200k iterations with a 5 MB realloc each time… that is now avoided.
EDIT: Also, regarding your points 1 and 2: I noticed that if I increased the size of a GPU array by 1, it would still allocate a whole new one. It does not seem to be as “smart” in the background as it is on the CPU.
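On the CPU side, a quick way to check whether a `resize!` actually moved the data (just a diagnostic sketch, not part of the solver) is to compare the buffer pointer before and after; I'd expect the same idea to carry over to a GPU array's device pointer, assuming the backend exposes it:

```julia
v  = zeros(Float64, 1_000)
p0 = pointer(v)
resize!(v, 1_001)          # grow by a single element
moved = pointer(v) != p0   # true only if the data was moved to a new buffer
```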
Kind regards