Tracking the cause of allocation in the presence of a generated function

I am having trouble tracking down why the following code allocates:

julia> using HybridArrays, StaticArrays
[ Info: Precompiling HybridArrays [1baab800-613f-4b0a-84e4-9cd3431bfbb9]

julia> using BenchmarkTools

julia> H = HybridArray{Tuple{2,2,StaticArrays.Dynamic()}}(randn(2,2,2))
2×2×2 HybridArray{Tuple{2, 2, StaticArraysCore.Dynamic()}, Float64, 3, 3, Array{Float64, 3}} with indices SOneTo(2)×SOneTo(2)×Base.OneTo(2):
[:, :, 1] =
  0.363275  -1.4971
 -0.380408   0.34421

[:, :, 2] =
 0.191441  0.681372
 0.667848  0.128238

julia> H2 = HybridArray{Tuple{2,2,StaticArrays.Dynamic()}}(Array{Float64}(undef, 2, 2, 4))
2×2×4 HybridArray{Tuple{2, 2, StaticArraysCore.Dynamic()}, Float64, 3, 3, Array{Float64, 3}} with indices SOneTo(2)×SOneTo(2)×Base.OneTo(4):
[:, :, 1] =
 6.90494e-310  6.90494e-310
 0.0           0.0

[:, :, 2] =
 6.90494e-310  6.90494e-310
 0.0           0.0

[:, :, 3] =
 6.90503e-310  6.90503e-310
 0.0           0.0

[:, :, 4] =
 6.90503e-310  6.90503e-310
 0.0           0.0

julia> f2(H, H2) = copyto!(view(H2,:,:,1:2), H)
f2 (generic function with 1 method)

julia> @btime f2($H, $H2)
  187.921 ns (16 allocations: 512 bytes)
2×2×2 HybridArray{Tuple{2, 2, StaticArraysCore.Dynamic()}, Float64, 3, 3, SubArray{Float64, 3, HybridArray{Tuple{2, 2, StaticArraysCore.Dynamic()}, Float64, 3, 3, Array{Float64, 3}}, Tuple{Base.Slice{SOneTo{2}}, Base.Slice{SOneTo{2}}, UnitRange{Int64}}, true}} with indices SOneTo(2)×SOneTo(2)×Base.OneTo(2):
[:, :, 1] =
  0.363275  -1.4971
 -0.380408   0.34421

[:, :, 2] =
 0.191441  0.681372
 0.667848  0.128238

--track-allocation returns results that don’t seem to make sense, and Cthulhu, @code_llvm, @code_native and @code_typed either don’t report anything that looks like an allocation or (in the case of Cthulhu) give up on a generated function. They don’t even seem to agree completely on what is inlined and what isn’t. Execution-time profiling gives the strongest hint, pointing to this line as the source of the allocations: HybridArrays.jl/indexing.jl at eb631401cfdf3aa048f5728b02a16a3b638c9cad · JuliaArrays/HybridArrays.jl · GitHub. That doesn’t make any sense to me, though: the line just invokes setindex! on a subarray, a very straightforward function that shouldn’t allocate. Can someone give me some advice?
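One more tool that might be worth trying here is the allocation profiler in the Profile standard library (Profile.Allocs, which I believe requires Julia 1.8 or newer). The sketch below is only an illustration: `suspect` is a hypothetical stand-in that allocates on purpose, not the HybridArrays code, and I have not checked how well the profiler sees through generated functions.

```julia
using Profile  # standard library; Profile.Allocs assumed available (Julia 1.8+)

# Hypothetical stand-in for the function under investigation.
# `collect` builds a fresh Vector on every call, so this is expected to allocate.
suspect(n) = sum(collect(1:n))

suspect(10)                 # warm up so compilation is not part of the recording
Profile.Allocs.clear()      # drop any previously recorded samples

# sample_rate=1 asks the profiler to record every allocation, not a random subset.
Profile.Allocs.@profile sample_rate=1 suspect(10)
results = Profile.Allocs.fetch()

# Each recorded allocation carries its type, its size in bytes, and a stack trace.
for a in results.allocs
    println(a.type, "  ", a.size, " bytes")
    # Print the first few frames of the recorded stack trace
    # to see which call site triggered the allocation.
    for frame in Iterators.take(a.stacktrace, 5)
        println("    ", frame)
    end
end
```

Replacing `suspect(10)` with `f2(H, H2)` from the session above should apply the same procedure to the case in question; the stack traces may then show which call is actually responsible.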

I played a little with this.
For me it is similar on 1.8.5 and 1.9.0-rc3:
193.553 ns (16 allocations: 512 bytes) and 199.266 ns (16 allocations: 512 bytes) respectively.
On 1.6.7, however, it is a little better: 130.306 ns (16 allocations: 256 bytes).
So maybe it is a compiler problem?
