I do julia --track-allocation=user b.jl
The relevant report here is:
and
How does this make sense?
I have proposed removing --track-allocation
since it is prone to false positives.
I strongly recommend using the allocation profiler instead to gain insights.
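For reference, a minimal sketch of collecting an allocation profile with the Profile standard library (the Profile.Allocs submodule, available in Julia 1.8 and later as far as I know). The function fill_any is a made-up workload for illustration, not anything from the code discussed here:

```julia
using Profile

# Hypothetical workload: pushing Ints into a Vector{Any} forces each value to be boxed.
function fill_any(n)
    v = Any[]
    for i in 1:n
        push!(v, 1000 + i)
    end
    return v
end

fill_any(10)                      # run once so compilation is not profiled
Profile.Allocs.clear()
Profile.Allocs.@profile sample_rate=1 fill_any(1_000)   # sample_rate=1 records every allocation
results = Profile.Allocs.fetch()

# Each record carries the allocated type, its size in bytes, and a stack trace.
for a in first(results.allocs, 5)
    println(a.type, "  ", a.size, " bytes")
end

# With PProf.jl installed, the same data can be browsed interactively:
# using PProf; PProf.Allocs.pprof(results)
```

A sample_rate of 1 is only practical for small runs; for a real assembly loop a much lower rate is probably the better starting point.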
I did use Profile.Allocs as well. It reported allocations too, in the same spot.
Can't make sense of it. Btw: @time
reported allocations too. So I don't think it can all be false positives.
Can you share the pprof file? Maybe through https://pprof.me? Looking at what types are allocated might help with your question.
Thank you for your generous offer. The profiling reported here is consistent with that of --track-allocation
:
Is there some scenario where memory would be allocated before the function updatecsmat!
gets called? Because pprof does not even mention updatecsmat!
among places where allocation occurs...
Ah yes, the issue is that PProf.jl didn't store compressed data for the longest time. Version 3 will fix that.
Anyway:
Clicking on "Sample" gives you two views, "Allocs" and "Size": the former is the number of allocations, the latter how much memory was used.
This is the "Size" view. We allocate two buffers, one of size 76.34 MB and one of 7.64 MB; then we are also boxing a bunch of integers.
You can walk the chain up and you see that the buffer allocations are coming from sparse operations in makematrix
.
Flipping over to Allocs
we see the number of allocations:
So why would Julia box an integer on a call-site?
code_typed
/Cthulhu
should tell you that, but my guess is that we either needed to box the return value or that one of the arguments to the function is despecialized and we pass it as a box.
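As a small illustration of that kind of check (a sketch only; Holder and getx2 are made-up names), code_typed returns the inferred return type alongside the typed code, and a non-concrete type there is a hint that values may end up boxed:

```julia
using InteractiveUtils  # makes @code_warntype and friends available outside the REPL

# Hypothetical type: the field is annotated with the abstract type Number.
struct Holder
    x::Number
end

getx2(h::Holder) = h.x * 2

# code_typed returns a vector of (typed code => inferred return type) pairs.
ci, rettype = only(code_typed(getx2, (Holder,)))
println("inferred return type: ", rettype)
println("concrete? ", isconcretetype(rettype))
```

Cthulhu.jl lets you do the same thing interactively while descending into callees, which is handy when the problem is a few calls deep.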
I have a lot to learn in the pprof analysis interface! I did notice the boxing: hence my question about updatecsmat!
at the call site! Thanks!
Hmm, I have used @code_warntype
, with the following result:
MethodInstance for FinEtools.FEMMBaseModule._bilform_diffusion_general(::FinEtoolsHeatDiff.FEMMHeatDiffModule.FEMMHeatDiff{FinEtools.IntegDomainModule.IntegDomain{FinEtools.FESetModule.FESetT3{Int64}, typeof(FinEtools.IntegDomainModule.otherdimensionunity), FinEtools.IntegRuleModule.TriRule}, FinEtoolsHeatDiff.MatHeatDiffModule.MatHeatDiff{Float64, typeof(FinEtoolsHeatDiff.MatHeatDiffModule.tangentmoduli!), typeof(FinEtoolsHeatDiff.MatHeatDiffModule.update!)}}, ::FinEtools.AssemblyModule.SysmatAssemblerSparseSymm{Int64, Float64, Vector{Float64}, Vector{Int64}}, ::FinEtools.NodalFieldModule.NodalField{Float64, Int64}, ::FinEtools.NodalFieldModule.NodalField{Float64, Int64}, ::FinEtools.DataCacheModule.DataCache{Matrix{Float64}, FinEtools.DataCacheModule.var"#_fillcache_constant!#1"})
from _bilform_diffusion_general(self::FEMM, assembler::A, geom::FinEtools.NodalFieldModule.NodalField{FT}, u::FinEtools.NodalFieldModule.NodalField{T}, cf::DC) where {FEMM<:AbstractFEMM, A<:AbstractSysmatAssembler, FT, T, DC<:DataCache} @ FinEtools.FEMMBaseModule C:\Users\pkonl\Documents\00WIP\FinEtools.jl\src\FEMMBaseModule.jl:1486
Static Parameters
FEMM = FinEtoolsHeatDiff.FEMMHeatDiffModule.FEMMHeatDiff{FinEtools.IntegDomainModule.IntegDomain{FinEtools.FESetModule.FESetT3{Int64}, typeof(FinEtools.IntegDomainModule.otherdimensionunity), FinEtools.IntegRuleModule.TriRule}, FinEtoolsHeatDiff.MatHeatDiffModule.MatHeatDiff{Float64, typeof(FinEtoolsHeatDiff.MatHeatDiffModule.tangentmoduli!), typeof(FinEtoolsHeatDiff.MatHeatDiffModule.update!)}}
A = FinEtools.AssemblyModule.SysmatAssemblerSparseSymm{Int64, Float64, Vector{Float64}, Vector{Int64}}
FT = Float64
T = Float64
DC = FinEtools.DataCacheModule.DataCache{Matrix{Float64}, FinEtools.DataCacheModule.var"#_fillcache_constant!#1"}
Arguments
#self#::Core.Const(FinEtools.FEMMBaseModule._bilform_diffusion_general)
self::FinEtoolsHeatDiff.FEMMHeatDiffModule.FEMMHeatDiff{FinEtools.IntegDomainModule.IntegDomain{FinEtools.FESetModule.FESetT3{Int64}, typeof(FinEtools.IntegDomainModule.otherdimensionunity), FinEtools.IntegRuleModule.TriRule}, FinEtoolsHeatDiff.MatHeatDiffModule.MatHeatDiff{Float64, typeof(FinEtoolsHeatDiff.MatHeatDiffModule.tangentmoduli!), typeof(FinEtoolsHeatDiff.MatHeatDiffModule.update!)}}
assembler::FinEtools.AssemblyModule.SysmatAssemblerSparseSymm{Int64, Float64, Vector{Float64}, Vector{Int64}}
geom::FinEtools.NodalFieldModule.NodalField{Float64, Int64}
u::FinEtools.NodalFieldModule.NodalField{Float64, Int64}
cf::FinEtools.DataCacheModule.DataCache{Matrix{Float64}, FinEtools.DataCacheModule.var"#_fillcache_constant!#1"}
Locals
@_7::Union{Nothing, Tuple{Int64, Int64}}
@_8::Int64
@_9::Int64
@_10::Int64
@_11::Int64
pc::Matrix{Float64}
w::Matrix{Float64}
gradNparams::Matrix{Matrix{Float64}}
Ns::Matrix{Matrix{Float64}}
npts::Int64
c_gradNT::Matrix{Float64}
RmTJ::Matrix{Float64}
elvec::Vector{Float64}
elmat::Matrix{Float64}
elmdim::Int64
gradN::Matrix{Float64}
J::Matrix{Float64}
loc::Matrix{Float64}
dofnums::Vector{Int64}
ecoords::Matrix{Float64}
ndn::Int64
nne::Int64
fes::FinEtools.FESetModule.FESetT3{Int64}
@_30::Union{Nothing, Tuple{Int64, Int64}}
i::Int64
j::Int64
c::Matrix{Float64}
Jac::Float64
Body::SparseArrays.SparseMatrixCSC{Float64, Int64}
1 ─ (fes = FinEtools.FEMMBaseModule.finite_elements(self))
│   %2 = FinEtools.FEMMBaseModule._buff_b(self, geom, u)::Core.PartialStruct(Tuple{Int64, Int64, Matrix{Float64}, Vector{Int64}, Matrix{Float64}, Matrix{Float64}, Matrix{Float64}}, Any[Core.Const(3), Int64, Matrix{Float64}, Vector{Int64}, Matrix{Float64}, Matrix{Float64}, Matrix{Float64}])
│   %3 = Base.indexed_iterate(%2, 1)::Core.Const((3, 2))
│   (nne = Core.getfield(%3, 1))
│   (@_11 = Core.getfield(%3, 2))
│   %6 = Base.indexed_iterate(%2, 2, @_11::Core.Const(2))::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(3)])
│   (ndn = Core.getfield(%6, 1))
│   (@_11 = Core.getfield(%6, 2))
│   %9 = Base.indexed_iterate(%2, 3, @_11::Core.Const(3))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(4)])
│   (ecoords = Core.getfield(%9, 1))
│   (@_11 = Core.getfield(%9, 2))
│   %12 = Base.indexed_iterate(%2, 4, @_11::Core.Const(4))::Core.PartialStruct(Tuple{Vector{Int64}, Int64}, Any[Vector{Int64}, Core.Const(5)])
│   (dofnums = Core.getfield(%12, 1))
│   (@_11 = Core.getfield(%12, 2))
│   %15 = Base.indexed_iterate(%2, 5, @_11::Core.Const(5))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(6)])
│   (loc = Core.getfield(%15, 1))
│   (@_11 = Core.getfield(%15, 2))
│   %18 = Base.indexed_iterate(%2, 6, @_11::Core.Const(6))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(7)])
│   (J = Core.getfield(%18, 1))
│   (@_11 = Core.getfield(%18, 2))
│   %21 = Base.indexed_iterate(%2, 7, @_11::Core.Const(7))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(8)])
│   (gradN = Core.getfield(%21, 1))
│   %23 = FinEtools.FEMMBaseModule._buff_e(self, geom, u, assembler)::Tuple{Int64, Matrix{Float64}, Vector{Float64}}
│   %24 = Base.indexed_iterate(%23, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│   (elmdim = Core.getfield(%24, 1))
│   (@_10 = Core.getfield(%24, 2))
│   %27 = Base.indexed_iterate(%23, 2, @_10::Core.Const(2))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(3)])
│   (elmat = Core.getfield(%27, 1))
│   (@_10 = Core.getfield(%27, 2))
│   %30 = Base.indexed_iterate(%23, 3, @_10::Core.Const(3))::Core.PartialStruct(Tuple{Vector{Float64}, Int64}, Any[Vector{Float64}, Core.Const(4)])
│   (elvec = Core.getfield(%30, 1))
│   %32 = FinEtools.FEMMBaseModule._buff_d(self, geom, u)::Tuple{Matrix{Float64}, Matrix{Float64}}
│   %33 = Base.indexed_iterate(%32, 1)::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(2)])
│   (RmTJ = Core.getfield(%33, 1))
│   (@_9 = Core.getfield(%33, 2))
│   %36 = Base.indexed_iterate(%32, 2, @_9::Core.Const(2))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(3)])
│   (c_gradNT = Core.getfield(%36, 1))
│   %38 = Base.getproperty(self, :integdomain)::FinEtools.IntegDomainModule.IntegDomain{FinEtools.FESetModule.FESetT3{Int64}, typeof(FinEtools.IntegDomainModule.otherdimensionunity), FinEtools.IntegRuleModule.TriRule}
│   %39 = FinEtools.FEMMBaseModule.integrationdata(%38)::Tuple{Int64, Matrix{Matrix{Float64}}, Matrix{Matrix{Float64}}, Matrix{Float64}, Matrix{Float64}}
│   %40 = Base.indexed_iterate(%39, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│   (npts = Core.getfield(%40, 1))
│   (@_8 = Core.getfield(%40, 2))
│   %43 = Base.indexed_iterate(%39, 2, @_8::Core.Const(2))::Core.PartialStruct(Tuple{Matrix{Matrix{Float64}}, Int64}, Any[Matrix{Matrix{Float64}}, Core.Const(3)])
│   (Ns = Core.getfield(%43, 1))
│   (@_8 = Core.getfield(%43, 2))
│   %46 = Base.indexed_iterate(%39, 3, @_8::Core.Const(3))::Core.PartialStruct(Tuple{Matrix{Matrix{Float64}}, Int64}, Any[Matrix{Matrix{Float64}}, Core.Const(4)])
│   (gradNparams = Core.getfield(%46, 1))
│   (@_8 = Core.getfield(%46, 2))
│   %49 = Base.indexed_iterate(%39, 4, @_8::Core.Const(4))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(5)])
│   (w = Core.getfield(%49, 1))
│   (@_8 = Core.getfield(%49, 2))
│   %52 = Base.indexed_iterate(%39, 5, @_8::Core.Const(5))::Core.PartialStruct(Tuple{Matrix{Float64}, Int64}, Any[Matrix{Float64}, Core.Const(6)])
│   (pc = Core.getfield(%52, 1))
│   %54 = FinEtools.FEMMBaseModule.size(elmat)::Tuple{Int64, Int64}
│   %55 = FinEtools.FEMMBaseModule.prod(%54)::Int64
│   %56 = FinEtools.FEMMBaseModule.count(fes)::Int64
│   %57 = (%55 * %56)::Int64
│   %58 = FinEtools.FEMMBaseModule.nalldofs(u)::Int64
│   %59 = FinEtools.FEMMBaseModule.nalldofs(u)::Int64
│   FinEtools.FEMMBaseModule.startassembly!(assembler, %57, %58, %59)
│   %61 = FinEtools.FEMMBaseModule.eachindex(fes)::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])
│   (@_7 = Base.iterate(%61))
│   %63 = (@_7 === nothing)::Bool
│   %64 = Base.not_int(%63)::Bool
└── goto #7 if not %64
2 ─ %66 = @_7::Tuple{Int64, Int64}
│   (i = Core.getfield(%66, 1))
│   %68 = Core.getfield(%66, 2)::Int64
│   %69 = ecoords::Matrix{Float64}
│   %70 = Base.getproperty(fes, :conn)::Vector{Tuple{Int64, Int64, Int64}}
│   %71 = Base.getindex(%70, i)::Tuple{Int64, Int64, Int64}
│   FinEtools.FEMMBaseModule.gathervalues_asmat!(geom, %69, %71)
│   %73 = elmat::Matrix{Float64}
│   %74 = FinEtools.FEMMBaseModule.zero($(Expr(:static_parameter, 4)))::Core.Const(0.00000e+00)
│   FinEtools.FEMMBaseModule.fill!(%73, %74)
│   %76 = (1:npts)::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])
│   (@_30 = Base.iterate(%76))
│   %78 = (@_30 === nothing)::Bool
│   %79 = Base.not_int(%78)::Bool
└── goto #5 if not %79
3 ─ %81 = @_30::Tuple{Int64, Int64}
│   (j = Core.getfield(%81, 1))
│   %83 = Core.getfield(%81, 2)::Int64
│   %84 = loc::Matrix{Float64}
│   %85 = J::Matrix{Float64}
│   %86 = ecoords::Matrix{Float64}
│   %87 = Base.getindex(Ns, j)::Matrix{Float64}
│   %88 = Base.getindex(gradNparams, j)::Matrix{Float64}
│   FinEtools.FEMMBaseModule.locjac!(%84, %85, %86, %87, %88)
│   %90 = Base.getproperty(self, :integdomain)::FinEtools.IntegDomainModule.IntegDomain{FinEtools.FESetModule.FESetT3{Int64}, typeof(FinEtools.IntegDomainModule.otherdimensionunity), FinEtools.IntegRuleModule.TriRule}
│   %91 = J::Matrix{Float64}
│   %92 = loc::Matrix{Float64}
│   %93 = Base.getproperty(fes, :conn)::Vector{Tuple{Int64, Int64, Int64}}
│   %94 = Base.getindex(%93, i)::Tuple{Int64, Int64, Int64}
│   %95 = Base.getindex(Ns, j)::Matrix{Float64}
│   (Jac = FinEtools.FEMMBaseModule.Jacobianvolume(%90, %91, %92, %94, %95))
│   %97 = Base.getproperty(self, :mcsys)::FinEtools.CSysModule.CSys
│   %98 = loc::Matrix{Float64}
│   %99 = J::Matrix{Float64}
│   %100 = i::Int64
│   FinEtools.FEMMBaseModule.updatecsmat!(%97, %98, %99, %100, j)
│   %102 = RmTJ::Matrix{Float64}
│   %103 = Base.getproperty(self, :mcsys)::FinEtools.CSysModule.CSys
│   %104 = FinEtools.FEMMBaseModule.csmat(%103)::Matrix{T} where T<:Number
│   FinEtools.FEMMBaseModule.mulCAtB!(%102, %104, J)
│   %106 = fes::FinEtools.FESetModule.FESetT3{Int64}
│   %107 = gradN::Matrix{Float64}
│   %108 = Base.getindex(gradNparams, j)::Matrix{Float64}
│   FinEtools.FEMMBaseModule.gradN!(%106, %107, %108, RmTJ)
│   (c = (cf)(loc, J, i, j))
│   %111 = elmat::Matrix{Float64}
│   %112 = gradN::Matrix{Float64}
│   %113 = Jac::Float64
│   %114 = Base.getindex(w, j)::Float64
│   %115 = (%113 * %114)::Float64
│   %116 = c::Matrix{Float64}
│   FinEtools.FEMMBaseModule.add_gkgt_ut_only!(%111, %112, %115, %116, c_gradNT)
│   (@_30 = Base.iterate(%76, %83))
│   %119 = (@_30 === nothing)::Bool
│   %120 = Base.not_int(%119)::Bool
└── goto #5 if not %120
4 ─ goto #3
5 ─ FinEtools.FEMMBaseModule.complete_lt!(elmat)
│   %124 = dofnums::Vector{Int64}
│   %125 = Base.getproperty(fes, :conn)::Vector{Tuple{Int64, Int64, Int64}}
│   %126 = Base.getindex(%125, i)::Tuple{Int64, Int64, Int64}
│   FinEtools.FEMMBaseModule.gatherdofnums!(u, %124, %126)
│   FinEtools.FEMMBaseModule.assemble!(assembler, elmat, dofnums, dofnums)
│   (@_7 = Base.iterate(%61, %68))
│   %130 = (@_7 === nothing)::Bool
│   %131 = Base.not_int(%130)::Bool
└── goto #7 if not %131
6 ─ goto #2
7 ─ %134 = FinEtools.FEMMBaseModule.makematrix!(assembler)::SparseArrays.SparseMatrixCSC{Float64, Int64}
└── return %134
In particular, I see
%100 = i::Int64
│   FinEtools.FEMMBaseModule.updatecsmat!(%97, %98, %99, %100, j)
There is no boxing as far as I can tell?
The julia code is:
function _bilform_diffusion_general(
    self::FEMM,
    assembler::A,
    geom::NodalField{FT},
    u::NodalField{T},
    cf::DC,
) where {FEMM<:AbstractFEMM,A<:AbstractSysmatAssembler,FT,T,DC<:DataCache}
    fes = finite_elements(self)
    nne, ndn, ecoords, dofnums, loc, J, gradN = _buff_b(self, geom, u)
    elmdim, elmat, elvec = _buff_e(self, geom, u, assembler)
    RmTJ, c_gradNT = _buff_d(self, geom, u)
    npts, Ns, gradNparams, w, pc = integrationdata(self.integdomain)
    startassembly!(assembler, prod(size(elmat)) * count(fes), nalldofs(u), nalldofs(u))
    for i in eachindex(fes) # Loop over elements
        gathervalues_asmat!(geom, ecoords, fes.conn[i])
        fill!(elmat, zero(T)) # Initialize element matrix
        for j in 1:npts # Loop over quadrature points
            locjac!(loc, J, ecoords, Ns[j], gradNparams[j])
            Jac = Jacobianvolume(self.integdomain, J, loc, fes.conn[i], Ns[j])
            updatecsmat!(self.mcsys, loc, J, i, j)
            mulCAtB!(RmTJ, csmat(self.mcsys), J) # local Jacobian matrix
            gradN!(fes, gradN, gradNparams[j], RmTJ)
            c = cf(loc, J, i, j)
            add_gkgt_ut_only!(elmat, gradN, (Jac * w[j]), c, c_gradNT)
        end # Loop over quadrature points
        complete_lt!(elmat)
        gatherdofnums!(u, dofnums, fes.conn[i]) # retrieve degrees of freedom
        assemble!(assembler, elmat, dofnums, dofnums) # assemble symmetric matrix
    end # Loop over elements
    return makematrix!(assembler)
end
I know that it is an imposition to ask for an opinion, but perhaps you can see without needing to dig in?
│   %97 = Base.getproperty(self, :mcsys)::FinEtools.CSysModule.CSys
This doesn't look to be concrete: CSys has two parameters, no?
It does. But this is the definition of the self
:
mutable struct FEMMBase{ID<:IntegDomain, CS<:CSys} <: AbstractFEMM
    integdomain::ID # domain data
    mcsys::CS # updater of the material orientation matrix
end
So, wouldn't the type be known?
Also, I seem to have trouble with many allocations of integers (boxing).
That's not the type of self:
There mcsys isn't parameterized.
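To make the difference concrete, here is a minimal sketch with made-up stand-in types (CSysLike, LooseFEMM and TightFEMM are hypothetical and not the FinEtools definitions). A field annotated with a parametric type but without its parameters is not concrete, while a field tied to a struct type parameter is:

```julia
# Hypothetical stand-in for a two-parameter type such as CSys.
struct CSysLike{T<:Number, F<:Function}
    csmat::Matrix{T}
    update::F
end

# Field annotated with the bare name: this is a UnionAll, not a concrete type.
mutable struct LooseFEMM
    mcsys::CSysLike
end

# Field type carried as a struct parameter: concrete for every instance.
mutable struct TightFEMM{CS<:CSysLike}
    mcsys::CS
end

cs = CSysLike(zeros(2, 2), identity)
loose = LooseFEMM(cs)
tight = TightFEMM(cs)

println(isconcretetype(fieldtype(typeof(loose), :mcsys)))  # field type seen by the compiler
println(isconcretetype(fieldtype(typeof(tight), :mcsys)))
```

Checking fieldtype on typeof(self) like this is a quick way to confirm which struct definition is actually in play.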
OMG. I forgot to update here. Thanks!
Also telling is:
%103 = Base.getproperty(self, :mcsys)::FinEtools.CSysModule.CSys
%104 = FinEtools.FEMMBaseModule.csmat(%103)::Matrix{T} where T<:Number
Since the eltype of the Matrix is unknown.
But: why did the profiling report boxing of integers?
Because, with the type of %97
being abstract,
%97 = Base.getproperty(self, :mcsys)::FinEtools.CSysModule.CSys
│   %98 = loc::Matrix{Float64}
│   %99 = J::Matrix{Float64}
│   %100 = i::Int64
│   FinEtools.FEMMBaseModule.updatecsmat!(%97, %98, %99, %100, j)
You likely ended up with a different calling-convention that requires all arguments to be boxed.
You can normally see that with @code_llvm
.
I see. But there was only a mention of the integers, nothing about boxing matrices. I will check out what you suggested. Thanks.
The type information of CSys being unavailable would cause a slower way of calling updatecsmat!
which requires all the integers to be boxed. The array is already a boxed value.
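A rough way to see this effect in isolation, as a sketch with made-up types (Updater, Loose, Tight, update! and run! are hypothetical names, not FinEtools code): the same loop is run once through a field whose type is not concrete and once through a parameterized field, and the bytes allocated are compared. The integer argument is offset by 1000 on the assumption that Julia keeps preallocated boxes for small integers, which could otherwise hide the boxing.

```julia
# Hypothetical stand-ins; none of these names come from FinEtools.
struct Updater{T<:Number}
    m::Matrix{T}
end

# Writes into the matrix using the two integer arguments.
function update!(u::Updater, i::Int, j::Int)
    u.m[1, 1] = i + j
    return u
end

struct Loose               # field type is the UnionAll `Updater`, not concrete
    u::Updater
end

struct Tight{U<:Updater}   # field type is a concrete struct parameter
    u::U
end

function run!(s, n)
    for i in 1:n, j in 1:3
        update!(s.u, 1000 + i, j)   # dispatched at run time when typeof(s.u) cannot be inferred
    end
    return s.u.m[1, 1]
end

loose = Loose(Updater(zeros(1, 1)))
tight = Tight(Updater(zeros(1, 1)))
run!(loose, 1); run!(tight, 1)      # compile both before measuring

bytes_loose = @allocated run!(loose, 10_000)
bytes_tight = @allocated run!(tight, 10_000)
println("loose: ", bytes_loose, " bytes, tight: ", bytes_tight, " bytes")
```

The exact numbers depend on the Julia version, so treat them as indicative only; @code_llvm on run! for each argument type shows whether boxing calls are emitted.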
Ah, I forgot about that!