Removing nested abstract containers

In my code, I define several structs in which I carry along an AbstractMatrix: sometimes it is an actual matrix, sometimes a transpose, etc.

The type of the entries in the matrix also varies (Float64, Int, Dual, etc.).

I was wondering what the idiomatic Julia way of defining containers for such objects is. Currently I’m using something that resembles ContainerMoreConcrete, but a quick check shows that ContainerConcrete has fewer allocations and better speed in my toy example.

Is that the idiomatic way of doing things, or is a different approach preferable? As an alternative, I could force every matrix I use to be a plain Matrix, but that seems needlessly costly for performance. Or is there some hidden benefit to that approach that I'm overlooking?
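(For context on that last point, a minimal sketch of what forcing a plain Matrix would cost: lazy wrappers such as a transpose allocate nothing, while Matrix(...) materializes a full copy.)

A = rand(1000, 1000)
B = transpose(A)   # lazy wrapper around A, no copy made
C = Matrix(B)      # materializes a dense copy, allocating a fresh 1000×1000 array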

using BenchmarkTools

struct ContainerAbs
    x::AbstractMatrix{Real}    # element type Real is abstract, so the default constructor converts (and allocates)
end
struct ContainerMoreConcrete{T<:Real}
    x::AbstractMatrix{T}       # concrete element type, but still an abstract container type
end
struct ContainerConcrete{M<:AbstractMatrix{<:Real}}
    x::M                       # fully concrete once M is known
end

function RunCheck(N, a::Real)
    display("Benchmarking with datatype $(typeof(a))")
    X = transpose([convert(typeof(a), i*j) for i in 1:N, j in 1:N])
    ContainerAbs(X); ContainerMoreConcrete(X); ContainerConcrete(X); # Ensure everything is compiled
    display(@benchmark ContainerAbs($X))
    display(@benchmark ContainerMoreConcrete($X))
    display(@benchmark ContainerConcrete($X))
end

RunCheck(10, 1)
RunCheck(10, 1.)
RunCheck(10, 1//1)

gives

"Benchmarking with datatype Int64"
BenchmarkTools.Trial: 10000 samples with 464 evaluations.
 Range (min … max):  229.705 ns …   7.052 ΞΌs  β”Š GC (min … max): 0.00% … 96.05%
 Time  (median):     239.043 ns               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   263.967 ns Β± 146.187 ns  β”Š GC (mean Β± Οƒ):  6.61% Β± 11.23%

  β–ˆβ–‡β–…β–ƒβ–„β–ƒβ–‚β–‚β–                                                     β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–†β–†β–…β–…β–…β–„β–„β–„β–ƒβ–β–„β–β–β–ƒβ–β–ƒβ–β–β–ƒβ–β–β–ƒβ–β–ƒβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–…β–‡β–ˆβ–ˆβ–‡β–‡β–†β–† β–ˆ
  230 ns        Histogram: log(frequency) by time        819 ns <

 Memory estimate: 944 bytes, allocs estimate: 2.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.958 ns …  2.720 ΞΌs  β”Š GC (min … max): 0.00% … 99.41%
 Time  (median):     4.250 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.884 ns Β± 29.187 ns  β”Š GC (mean Β± Οƒ):  7.69% Β±  1.40%

     β–ˆ β–‚                                                      
  β–β–ƒβ–„β–ˆβ–ˆβ–ˆβ–†β–†β–‚β–‚β–‚β–‚β–ƒβ–„β–†β–ƒβ–ƒβ–‚β–ƒβ–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– β–‚
  3.96 ns        Histogram: frequency by time        7.04 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.958 ns … 26.750 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.000 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.041 ns Β±  0.411 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–‚ β–ˆ  β–†  β–‚  β–ƒ                                               ▁
  β–ˆβ–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–‡β–β–β–ˆβ–β–β–…β–β–β–†β–β–†β–β–β–†β–β–β–„β–β–β–…β–β–…β–β–β–„β–β–β–…β–β–β–…β–β–„β–β–β–†β–β–β–ˆβ–β–β–†β–β–† β–ˆ
  1.96 ns      Histogram: log(frequency) by time     2.83 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
"Benchmarking with datatype Float64"
BenchmarkTools.Trial: 10000 samples with 220 evaluations.
 Range (min … max):  333.900 ns …  15.987 ΞΌs  β”Š GC (min … max):  0.00% … 96.78%
 Time  (median):     384.850 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   455.342 ns Β± 466.676 ns  β”Š GC (mean Β± Οƒ):  14.45% Β± 13.43%

  β–‡β–ˆβ–„β–„β–                                                         ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–…β–β–„β–ƒβ–ƒβ–ƒβ–β–ƒβ–β–β–β–β–β–β–ƒβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–ƒβ–†β–ˆβ–ˆβ–†β–†β–†β–…β–„β–β–„β–…β–ˆ β–ˆ
  334 ns        Histogram: log(frequency) by time       2.99 ΞΌs <

 Memory estimate: 2.48 KiB, allocs estimate: 102.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.000 ns …  3.191 ΞΌs  β”Š GC (min … max): 0.00% … 99.37%
 Time  (median):     4.542 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   5.299 ns Β± 32.788 ns  β”Š GC (mean Β± Οƒ):  7.38% Β±  1.39%

    β–ˆβ–„β–‚β–„                                                      
  β–‚β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–†β–…β–ˆβ–…β–„β–…β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–β–β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– β–‚
  4 ns           Histogram: frequency by time        9.71 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.917 ns … 84.417 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.083 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.166 ns Β±  1.127 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

    ▁ β–ˆ β–† β–…  β–‡ β–„ β–† β–‚                          ▁    β–ƒ β–‚ ▁     β–‚
  β–ƒβ–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆβ–β–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆβ–β–β–ˆβ–β–ˆβ–β–‡β–β–†β–β–†β–…β–β–†β–β–†β–β–†β–β–†β–β–β–†β–β–‡β–β–ˆβ–β–†β–β–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆ β–ˆ
  1.92 ns      Histogram: log(frequency) by time        3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
"Benchmarking with datatype Rational{Int64}"
BenchmarkTools.Trial: 10000 samples with 208 evaluations.
 Range (min … max):  353.562 ns … 21.655 ΞΌs  β”Š GC (min … max):  0.00% … 96.92%
 Time  (median):     413.260 ns              β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   561.200 ns Β±  1.039 ΞΌs  β”Š GC (mean Β± Οƒ):  22.77% Β± 11.79%

  β–ˆβ–…β–‚                                                          ▁
  β–ˆβ–ˆβ–ˆβ–‡β–…β–β–„β–„β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–…β–‡β–ˆ β–ˆ
  354 ns        Histogram: log(frequency) by time      7.81 ΞΌs <

 Memory estimate: 4.05 KiB, allocs estimate: 102.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.000 ns …  2.688 ΞΌs  β”Š GC (min … max): 0.00% … 99.32%
 Time  (median):     4.417 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.956 ns Β± 28.274 ns  β”Š GC (mean Β± Οƒ):  7.15% Β±  1.39%

       β–ˆ                                                      
  β–‚β–‚β–…β–…β–‡β–ˆβ–‡β–‡β–†β–…β–†β–ƒβ–ƒβ–‚β–ƒβ–‡β–„β–„β–„β–ƒβ–ƒβ–„β–‚β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– β–‚
  4 ns           Histogram: frequency by time        6.88 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.958 ns … 37.666 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.000 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.037 ns Β±  0.473 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–‚ β–ˆ  β–†  β–‚  β–ƒ                                               ▁
  β–ˆβ–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–β–†β–β–β–‡β–β–β–„β–β–β–…β–β–β–„β–β–…β–β–β–…β–β–β–„β–β–β–…β–β–β–ƒβ–β–β–„β–β–β–„β–β–β–„β–β–β–…β–β–β–‡β–β–„ β–ˆ
  1.96 ns      Histogram: log(frequency) by time     2.79 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

I don’t see what’s wrong with ContainerConcrete for your use case. It provides concrete typing and allows you to use any kind of abstract matrix, no?
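For instance (a quick sketch reusing the struct definitions above), ContainerConcrete specializes on whatever matrix type you hand it, and its declared field type stays concrete:

using LinearAlgebra  # for Diagonal

A = rand(3, 3)
c1 = ContainerConcrete(A)              # ContainerConcrete{Matrix{Float64}}
c2 = ContainerConcrete(transpose(A))   # ContainerConcrete{Transpose{Float64, Matrix{Float64}}}
c3 = ContainerConcrete(Diagonal(A))    # ContainerConcrete{Diagonal{Float64, Vector{Float64}}}

# The declared field type is concrete in every case ...
fieldtype(typeof(c1), :x)   # Matrix{Float64}
fieldtype(typeof(c2), :x)   # Transpose{Float64, Matrix{Float64}}

# ... whereas ContainerMoreConcrete only pins down the element type:
fieldtype(typeof(ContainerMoreConcrete(A)), :x)   # AbstractMatrix{Float64}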

I agree with @mrufsvold that ContainerConcrete is a good way to go. It's worth noting that the reason your ContainerMoreConcrete is slower than ContainerConcrete is that it is still not concrete enough for type inference. Since the field x of ContainerMoreConcrete is declared as an AbstractMatrix{T}, the compiler only knows its abstract type, so downstream results will not always be inferable, as in the example below:

function container_multiply(c, v)
    c.x*v 
end

x = [1.0; 2.0;; 3.0; 4.0]
v = [5.0, 6.0]

cmc = ContainerMoreConcrete(x)
cc = ContainerConcrete(x)

Running @code_warntype on the two containers in the function yields:

@code_warntype container_multiply(cmc, v)
MethodInstance for container_multiply(::ContainerMoreConcrete{Float64}, ::Vector{Float64})
  from container_multiply(c, v) @ Main ~/Files/file.jl:21
Arguments
  #self#::Core.Const(Main.container_multiply)
  c::ContainerMoreConcrete{Float64}
  v::Vector{Float64}
Body::Any
1 ─ %1 = Main.:*::Core.Const(*)
β”‚   %2 = Base.getproperty(c, :x)::AbstractMatrix{Float64}
β”‚   %3 = (%1)(%2, v)::Any
└──      return %3

and

@code_warntype container_multiply(cc, v)
MethodInstance for container_multiply(::ContainerConcrete{Matrix{Float64}}, ::Vector{Float64})
  from container_multiply(c, v) @ Main ~/Files/file.jl:21
Arguments
  #self#::Core.Const(Main.container_multiply)
  c::ContainerConcrete{Matrix{Float64}}
  v::Vector{Float64}
Body::Vector{Float64}
1 ─ %1 = Main.:*::Core.Const(*)
β”‚   %2 = Base.getproperty(c, :x)::Matrix{Float64}
β”‚   %3 = (%1)(%2, v)::Vector{Float64}
└──      return %3
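If you'd rather check this programmatically than read lowered code, a quick sketch with Test.@inferred (from the Test standard library) gives the same verdict:

using Test

@inferred container_multiply(cc, v)    # passes: the return type is inferred as Vector{Float64}
@inferred container_multiply(cmc, v)   # throws: the actual Vector{Float64} does not match the inferred Any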

Thank you both for confirming and clarifying that this is the way to go! I see the type inference advantage clearly now.

What I’m curious to know: is there a way to automatically find such abstract container types in your code?
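One rough way to spot them, sketched here with only Base reflection (abstract_fields is a hypothetical helper, not an existing function): walk a struct's declared field types and flag any that are not concrete.

# Hypothetical helper: list declared fields whose declared type is not concrete.
function abstract_fields(T::DataType)
    [(name, ftype) for (name, ftype) in zip(fieldnames(T), fieldtypes(T)) if !isconcretetype(ftype)]
end

abstract_fields(ContainerAbs)                        # [(:x, AbstractMatrix{Real})]
abstract_fields(ContainerMoreConcrete{Float64})      # [(:x, AbstractMatrix{Float64})]
abstract_fields(ContainerConcrete{Matrix{Float64}})  # empty: every field type is concrete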