Been struggling to figure out why accesses to these struct fields are causing memory allocations. Here is my test code:
module AllocationsTest
struct SubStruct
a::Float64
b::Float64
end
struct AllocationsStruct
arr::Array{SubStruct}
end
function make()
subStructArr = Array{SubStruct}(undef, 1000)
for i = eachindex(subStructArr)
subStructArr[i] = SubStruct(1.0, 2.0)
end
return AllocationsStruct(subStructArr)
end
function access(allocStruct::AllocationsStruct)
return allocStruct.arr[1].a
end
function accessInLoop(allocStruct::AllocationsStruct)
s = 0
for i = eachindex(allocStruct.arr)
s += allocStruct.arr[i].a
end
return s
end
end
I thought this might require some more type annotation to inform the compiler, but even annotating according to the Julia performance tips docs in as many places as I can think of (and making the code far more verbose in the process), the allocations are still there:
module AllocationsTest
struct SubStruct{T<:Float64}
a::T
b::T
end
struct AllocationsStruct{T<:Array{SubStruct{Float64}}}
arr::T
end
function make()
subStructArr = Array{SubStruct{Float64}}(undef, 1000)
for i = eachindex(subStructArr)
subStructArr[i] = SubStruct{Float64}(1.0, 2.0)
end
return AllocationsStruct{Array{SubStruct{Float64}}}(subStructArr)
end
function access(allocStruct::AllocationsStruct{Array{SubStruct{Float64}}})
return allocStruct.arr[1].a
end
function accessInLoop(allocStruct::AllocationsStruct{Array{SubStruct{Float64}}})
s = 0
for i = eachindex(allocStruct.arr)
s += allocStruct.arr[i].a
end
return s
end
end
Creating the struct with make and then running the accessInLoop function on it results in around 1.5k separate allocations totaling 39kb. Even the single access function causes one 32 byte allocation. Why?
I was finally able to get this example to not allocate on every iteration of accessInLoop:
module AllocationsTest
struct SubStruct{T<:Float64}
a::T
b::T
end
struct AllocationsStruct{Q<:SubStruct, T<:Vector{Q}}
arr::T
end
function make()
subStructArr = Vector{SubStruct}(undef, 1000)
for i = eachindex(subStructArr)
subStructArr[i] = SubStruct(1.0, 2.0)
end
return AllocationsStruct{SubStruct{Float64}, Vector{SubStruct{Float64}}}(subStructArr)
end
function access(allocStruct::AllocationsStruct)
return allocStruct.arr[1].a
end
function accessInLoop(allocStruct::AllocationsStruct)
s::Float64 = 0
for i = eachindex(allocStruct.arr)
s += allocStruct.arr[i].a
end
return s
end
end
However, I found that doing return AllocationsStruct{SubStruct{Float64}, Vector{SubStruct{Float64}}}(subStructArr) in make was necessary to make this happen.
In my actual code, I have a large struct with many fields of various types. Is there any way for Julia to infer the correct type parameters without my having to write them all out again when constructing the struct, given that I already declared the field types in the struct block?
There isn’t a benefit to subtyping concrete types (like Float64) in type parameters. This would have been much cleaner (and equally performant) as
struct SubStruct
a::Float64
b::Float64
end
struct AllocationsStruct
arr::Vector{SubStruct}
end
and now you don’t need to specify the computed parameters everywhere. Recall that your very first attempt didn’t work well because the struct field used Array{SubStruct}, which is incompletely specified (the dimension parameter is missing, so it is not a concrete type), rather than Vector{SubStruct}, which is completely specified.
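To make the Array versus Vector point concrete, here is a minimal sketch. The names LooseHolder, TightHolder, sum_a and measure are made up for illustration and are not from the code above; the exact allocation counts will depend on the Julia version, so none are quoted here.

```julia
struct SubStruct
    a::Float64
    b::Float64
end

# Array{SubStruct} leaves the dimension parameter free, so it is not concrete.
@show isconcretetype(Array{SubStruct})
@show isconcretetype(Vector{SubStruct})
@show Vector{SubStruct} === Array{SubStruct,1}

# Hypothetical wrappers that differ only in the field annotation.
struct LooseHolder
    arr::Array{SubStruct}
end
struct TightHolder
    arr::Vector{SubStruct}
end

function sum_a(h)
    s = 0.0
    for i in eachindex(h.arr)
        s += h.arr[i].a
    end
    return s
end

# Measuring inside a function keeps global-variable access out of the count.
measure(h) = @allocated sum_a(h)

data = [SubStruct(1.0, 2.0) for _ in 1:1000]
loose, tight = LooseHolder(data), TightHolder(data)
measure(loose); measure(tight)   # first calls compile
@show measure(loose)
@show measure(tight)
```

If the explanation above holds, the loose version should report allocations while the tight one should not.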
But I’m suspecting this may not completely answer your questions in your actual use case…
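If the real struct does need a type parameter (say, to accept different array types), one option is to let the default constructor work the parameter out from the argument. This is only a sketch: ParamHolder is a hypothetical name, and it assumes every parameter appears directly as a field type.

```julia
struct SubStruct
    a::Float64
    b::Float64
end

# Hypothetical parametric container: the field's type is itself the parameter.
struct ParamHolder{V<:AbstractVector{SubStruct}}
    arr::V
end

data = [SubStruct(1.0, 2.0) for _ in 1:1000]

# No explicit parameters needed: V is taken from typeof(data).
h = ParamHolder(data)
@show typeof(h)
@show isconcretetype(typeof(h))
```

Because typeof(h) is concrete, functions that receive h know the exact type of h.arr without any annotations at the call site.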
Sorry - I guess I conflated that with the second half of your original response. Making that one change does indeed remove all allocations for the accessInLoop call. Thank you!
I’m having a similar problem with a much simpler struct:
struct NMDAVoltageDependency{T<: Float32}
b::T
k::T
mg::T
end
Mg_mM = 1f0
nmda_b = 3.36f0 #(no unit) parameters for voltage dependence of nmda channels
nmda_k = -0.077f0 #Eyal 2018
NMDA = NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM)
print(@allocated getfield(NMDA,:b)) # 16
using BenchmarkTools
@btime begin
a = 0.f0
for x in 1:1000
a+=getfield(NMDA,:b)
end
end
# 40.875 μs (2000 allocations: 31.25 KiB)
const myNMDA = NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM)
@btime begin
a = 0.f0
for x in 1:1000
a+=getfield(myNMDA,:b)
end
end
# 1.166 ns (0 allocations: 0 bytes)
I don’t understand why/how this is the case. The instance NMDA is immutable and is processed within a local scope (the begin). Normally I will pass it to a function and access it to update some variables of other mutable structs. Access to every field is “boxed”, as if the struct were not type stable.
using BenchmarkTools
struct NMDAVoltageDependency{T<:Float32}
b::T
k::T
mg::T
end
Mg_mM = 1f0
nmda_b = 3.36f0 #(no unit) parameters for voltage dependence of nmda channels
nmda_k = -0.077f0 #Eyal 2018
let
local_NMDA = NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM)
@btime for x in 1:1000
getfield(local_NMDA,:b)
end
end
## 21.666 μs (1000 allocations: 15.62 KiB)
using BenchmarkTools
struct NMDAVoltageDependency{T<:Float32}
b::T
k::T
mg::T
end
Mg_mM = 1f0
nmda_b = 3.36f0 #(no unit) parameters for voltage dependence of nmda channels
nmda_k = -0.077f0 #Eyal 2018
let
local_NMDA = NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM)
end
begin
@btime for x in 1:1000
getfield(local_NMDA,:b)
end
end
## 21.666 μs (1000 allocations: 15.62 KiB)
julia> function f(nmda)
for i in 1:1000
getfield(nmda, :b)
end
end
f (generic function with 1 method)
julia> nmda = NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM)
NMDAVoltageDependency{Float32}(3.36f0, -0.077f0, 1.0f0)
julia> @btime f($nmda)
1.637 ns (0 allocations: 0 bytes)
julia> @btime let nmda = $(NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM))
for i in 1:1000
getfield(nmda, :b)
end
end
1.693 ns (0 allocations: 0 bytes)
Note, though, that since the functions do not do anything, that timing is not really meaningful.
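For a comparison where the loop does some work, here is a rough sketch of the three ways a function can get at the struct: a non-constant global, a const global, and a function argument. Params and the sum_* functions are hypothetical stand-ins, not part of the code above, and the allocation numbers printed will vary by Julia version.

```julia
struct Params                    # hypothetical stand-in for NMDAVoltageDependency
    b::Float32
end

p_global = Params(3.36f0)        # non-constant global: its type could change later
const p_const = Params(3.36f0)   # constant global: its type is fixed

function sum_global()
    a = 0.0f0
    for _ in 1:1000
        a += p_global.b          # the compiler cannot assume the type of p_global
    end
    return a
end

function sum_const()
    a = 0.0f0
    for _ in 1:1000
        a += p_const.b
    end
    return a
end

function sum_arg(p::Params)      # p is a typed local inside the function
    a = 0.0f0
    for _ in 1:1000
        a += p.b
    end
    return a
end

sum_global(); sum_const(); sum_arg(p_global)   # first calls compile
@show @allocated sum_global()
@show @allocated sum_const()
@show @allocated sum_arg(p_global)
```

All three return the same value; only the first has to look up the type of the global on every iteration.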
I am sorry, but this code does not solve the actual problem; it only works inside @btime.
@btime expands the macro and defines the object as a constant within the function scope. But it does not work if I have an actual function, like this:
using BenchmarkTools
using UnPack
struct NMDAVoltageDependency{T<:Float32}
b::T
k::T
mg::T
end
Mg_mM = 1f0
nmda_b = 3.36f0 #(no unit) parameters for voltage dependence of nmda channels
nmda_k = -0.077f0 #Eyal 2018
local_NMDA = let
NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM)
end
struct Neuron{VT<:Vector{Float32}, RT<:NMDAVoltageDependency}
g::VT
NMDA::RT
end
function update_neuron!(neuron::Neuron, index::Int)
@unpack g, NMDA = neuron
g[index] = g[index]*(1-1/NMDA.b) + NMDA.b
end
begin
# I cannot place $ here or I get this error: "syntax: "$" expression outside quote around"
nmda = (NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM))
my_neuron = Neuron(zeros(Float32,1), nmda)
i = 1
@btime for _ in 1:1000
update_neuron!(my_neuron, i)
end
# 85.083 μs (3000 allocations: 62.50 KiB)
my_neuron.g[i] ## 11.25
end
In general, interpolation doesn’t seem like a solution to me; I would just like to make the object my_neuron.NMDA type stable.
Here my_neuron and i are the non-constant globals. You have to put the complete for loop inside a function and pass them to that function (or let block):
julia> begin
# I cannot place $ here or I get this error: "syntax: "$" expression outside quote around"
nmda = (NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM))
my_neuron = Neuron(zeros(Float32,1), nmda)
i = 1
@btime let my_neuron = $my_neuron, i = $i
for _ in 1:1000
update_neuron!(my_neuron, i)
end
end
# 85.083 μs (3000 allocations: 62.50 KiB)
my_neuron.g[i] ## 11.25
end
4.105 μs (0 allocations: 0 bytes)
11.289598f0
function my_simulation()
nmda = (NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM))
my_neuron = Neuron(zeros(Float32,1), nmda)
i = 1
for _ in 1:1000
update_neuron!(my_neuron, i)
end
return my_neuron.g[i] ## 11.25
end
@btime my_simulation()
# 2.741 μs (1 allocation: 64 bytes)
However, I do have cases in which some variables must be defined outside the simulation loop, for example in another module or in the global scope, and the variable is an immutable struct (some parameters).
For example:
nmda = (NMDAVoltageDependency(nmda_b, nmda_k, Mg_mM))
function my_simulation(nmda::NMDAVoltageDependency)
my_neuron = Neuron(zeros(Float32,1), nmda)
i = 1
for _ in 1:1000
update_neuron!(my_neuron, i)
end
return my_neuron.g[i] ## 11.25
end
@btime my_simulation()
# 2.741 μs (1 allocation: 64 bytes)
Is there any correct approach to redefine the variable within the simulation scope, such that it is type stable and does not require allocations?
If the variable is immutable, you cannot modify it (you can create a new one and return it from your function, to redefine some variable of the outer scope). MWE:
julia> i = 1 # immutable
function f(i)
i = i + 1
return i
end
f (generic function with 1 method)
julia> i = f(i) # reassigning global scope i
2
If the variable is mutable, you can mutate it (again passing it into the local scope of the function as an argument):
julia> x = [1] # mutable
function f!(x) # this f! mutates x, we use the ! to *indicate* that
x[1] = x[1] + 1
return nothing # just to make it different from the previous one
end
f! (generic function with 1 method)
julia> f!(x)
julia> x
1-element Vector{Int64}:
2
My point is that in a realistic situation, I may have parameters that are declared outside the scope of the simulation loop (the function scope).
Is there any approach to pass these parameters to the function scope with the same performance as when the parameters are defined inside the function scope?
More generally, is there any macro that will spot which variables are defined as non-constant global variables?
Yes, just pass the values as function arguments. Note that it is the variable binding that needs to be local, not the value. A value is always constant in that sense. Consider:
i = 5 # global binding, access is type unstable
function foo(i)
# variable i here refers to the local i
# access is fast
...
end
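A runnable version of the same idea, applied to parameters that live in global scope, might look like the sketch below. Params and simulate! are hypothetical names, and the update rule is borrowed from update_neuron! earlier in the thread.

```julia
struct Params                  # hypothetical parameter struct
    b::Float32
end

params = Params(3.36f0)        # non-constant global, e.g. set up by a script

# The global is read once at the call site; inside the function, p is a
# local binding whose type is known, so the loop is type stable.
function simulate!(g::Vector{Float32}, p::Params, steps::Int)
    for _ in 1:steps
        g[1] = g[1] * (1 - 1 / p.b) + p.b
    end
    return g[1]
end

g = zeros(Float32, 1)
result = simulate!(g, params, 1000)
@show result
```

The only dynamic lookup left is the single call simulate!(g, params, 1000) itself; the 1000 iterations inside run on typed locals.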
You can use @code_warntype to inspect a function call, and it will highlight all type instabilities. These always occur when accessing untyped globals but can have other causes. Regardless, the performance penalty is the same, so you would want to fix it either way.
Example:
julia> foo(x) = n*x
foo (generic function with 1 method)
julia> n=5
5
julia> @code_warntype foo(3)
MethodInstance for foo(::Int64)
from foo(x) @ Main REPL[1]:1
Arguments
#self#::Core.Const(foo)
x::Int64
Body::ANY # these ANY will appear in red!
1 ─ %1 = (Main.n * x)::ANY
└── return %1
Here you can see that n is the global n because it is called Main.n.
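If you want a check you can script rather than read by eye, one option is Test.@inferred from the standard library, which throws when the inferred return type of a call does not match the actual one. It does not name the offending global, so treat this as a rough sketch of a smoke test; bar, m and foo_is_stable are names made up for the example.

```julia
using Test

n = 5                  # non-constant global, as in the example above
foo(x) = n * x

const m = 5            # hypothetical constant counterpart
bar(x) = m * x

# bar is inferable, so @inferred simply returns the result.
@show @inferred bar(3)

# foo reads the untyped global n, so inference gives up and @inferred throws.
foo_is_stable = try
    @inferred foo(3)
    true
catch
    false
end
@show foo_is_stable
```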