I've been struggling to figure out why accessing these struct fields causes memory allocations. Here is my test code:
module AllocationsTest

struct SubStruct
    a::Float64
    b::Float64
end

struct AllocationsStruct
    arr::Array{SubStruct}
end

function make()
    subStructArr = Array{SubStruct}(undef, 1000)
    for i = eachindex(subStructArr)
        subStructArr[i] = SubStruct(1.0, 2.0)
    end
    return AllocationsStruct(subStructArr)
end

function access(allocStruct::AllocationsStruct)
    return allocStruct.arr[1].a
end

function accessInLoop(allocStruct::AllocationsStruct)
    s = 0
    for i = eachindex(allocStruct.arr)
        s += allocStruct.arr[i].a
    end
    return s
end

end
I thought this might require more type annotations to inform the compiler. But even after annotating in as many places as I can think of, following the Julia performance tips docs (and making the code far more verbose in the process), the allocations are still there:
module AllocationsTest

struct SubStruct{T<:Float64}
    a::T
    b::T
end

struct AllocationsStruct{T<:Array{SubStruct{Float64}}}
    arr::T
end

function make()
    subStructArr = Array{SubStruct{Float64}}(undef, 1000)
    for i = eachindex(subStructArr)
        subStructArr[i] = SubStruct{Float64}(1.0, 2.0)
    end
    return AllocationsStruct{Array{SubStruct{Float64}}}(subStructArr)
end

function access(allocStruct::AllocationsStruct{Array{SubStruct{Float64}}})
    return allocStruct.arr[1].a
end

function accessInLoop(allocStruct::AllocationsStruct{Array{SubStruct{Float64}}})
    s = 0
    for i = eachindex(allocStruct.arr)
        s += allocStruct.arr[i].a
    end
    return s
end

end
Creating the struct with make and then running the accessInLoop function on it results in around 1.5k separate allocations totaling about 39 KB. Even the single access function causes one 32-byte allocation. Why?
I was finally able to get this example to stop allocating on every iteration of accessInLoop:
module AllocationsTest

struct SubStruct{T<:Float64}
    a::T
    b::T
end

struct AllocationsStruct{Q<:SubStruct, T<:Vector{Q}}
    arr::T
end

function make()
    subStructArr = Vector{SubStruct}(undef, 1000)
    for i = eachindex(subStructArr)
        subStructArr[i] = SubStruct(1.0, 2.0)
    end
    return AllocationsStruct{SubStruct{Float64}, Vector{SubStruct{Float64}}}(subStructArr)
end

function access(allocStruct::AllocationsStruct)
    return allocStruct.arr[1].a
end

function accessInLoop(allocStruct::AllocationsStruct)
    s::Float64 = 0
    for i = eachindex(allocStruct.arr)
        s += allocStruct.arr[i].a
    end
    return s
end

end
However, I found that writing out return AllocationsStruct{SubStruct{Float64}, Vector{SubStruct{Float64}}}(subStructArr) in make was necessary for this to work.
In my actual code, I have a large struct with many fields of various types. I already told Julia what the types will be when I declared the fields in the struct block. Is there any way for Julia to infer the correct type parameters, so that I don't have to duplicate them all when constructing the struct?
There is no benefit to constraining a type parameter with a concrete type (like Float64): in practice T<:Float64 can only ever be Float64, so the parameter adds nothing. This would have been much cleaner (and equally performant) as
struct SubStruct
    a::Float64
    b::Float64
end

struct AllocationsStruct
    arr::Vector{SubStruct}
end
and now you don’t need to specify the computed parameters everywhere. Recall that your very first attempt performed poorly because the struct field was declared as Array{SubStruct}, which leaves the number of dimensions unspecified and is therefore an abstract type, rather than as Vector{SubStruct} (that is, Array{SubStruct, 1}), which is fully specified and concrete.
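If it helps to see the difference between the two field declarations directly, here is a minimal sketch. LooseHolder, TightHolder, sum_a, and measure are hypothetical names made up for this illustration, not anything from your code. The exact byte counts depend on the Julia version, so I have left them out.

```julia
struct SubStruct
    a::Float64
    b::Float64
end

# Hypothetical names, used only to put the two field declarations side by side.
struct LooseHolder
    arr::Array{SubStruct}    # dimension parameter left open: abstract field type
end

struct TightHolder
    arr::Vector{SubStruct}   # alias for Array{SubStruct, 1}: concrete field type
end

# Same loop as accessInLoop, written generically so it accepts either holder.
function sum_a(holder)
    s = 0.0
    for i in eachindex(holder.arr)
        s += holder.arr[i].a
    end
    return s
end

# Measure inside a function so the non-constant globals below add no noise.
measure(holder) = @allocated sum_a(holder)

data = [SubStruct(1.0, 2.0) for _ in 1:1000]
loose = LooseHolder(data)
tight = TightHolder(data)

# The field declarations differ only in whether the type is concrete.
println("Array{SubStruct} concrete?  ", isconcretetype(Array{SubStruct}))
println("Vector{SubStruct} concrete? ", isconcretetype(Vector{SubStruct}))

# Warm up once so compilation is not counted, then measure.
measure(loose); measure(tight)
println("loose: ", measure(loose), " bytes allocated")
println("tight: ", measure(tight), " bytes allocated")
```

isconcretetype is the quick check here: if it returns false for a field's declared type, the compiler cannot know the layout of that field ahead of time and has to fall back to dynamic dispatch when you index into it.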
But I suspect this may not completely answer the question for your actual use case…
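In case some fields in your real struct genuinely need to vary in type, here is a rough sketch of letting the default constructor infer a type parameter from its argument, so nothing has to be spelled out at the call site. ParamHolder is a hypothetical name, and I am assuming a single parameter that appears directly as a field type.

```julia
struct SubStruct
    a::Float64
    b::Float64
end

# Hypothetical container: the field's type is itself the parameter, so every
# instance has a concrete field type even though the definition is generic.
struct ParamHolder{T<:AbstractVector{SubStruct}}
    arr::T
end

# No parameters written out here: T is inferred from the argument's type.
holder = ParamHolder([SubStruct(1.0, 2.0) for _ in 1:1000])

println(typeof(holder))
println(isconcretetype(typeof(holder)))
```

This works because the parameter appears directly as the type of a field, which lets the automatically generated outer constructor pick it up from the argument. The same pattern should extend to several fields with one parameter each, but I have not tried it against a struct as large as the one you describe.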
Sorry - I guess I conflated that with the second half of your original response. Making that one change does indeed remove all allocations for the accessInLoop call. Thank you!