Removing nested abstract containers

In my code, I define several structs in which I carry along an AbstractMatrix: sometimes it is an actual matrix, sometimes a transpose, etc.

The type of the entries in the matrix also varies (Float64, Int, Dual, etc.).

I was wondering what the idiomatic Julia way of defining containers for such objects is. Currently I’m using something that resembles ContainerMoreConcrete, but a quick check shows that ContainerConcrete has fewer allocations and better speed in my toy example.

Is that the idiomatic way of doing things, or is a different approach preferable? As an alternative, I could force every matrix I use to be a plain Matrix, but that seems needlessly costly for performance. Or is there some hidden benefit to that approach that I'm overlooking?
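(For context on that last point, a minimal sketch of what forcing a plain Matrix would cost: lazy wrappers such as a transpose allocate nothing, while Matrix(...) materializes a full copy.)

A = rand(1000, 1000)
B = transpose(A)   # lazy wrapper around A, no copy made
C = Matrix(B)      # materializes a dense copy, allocating a fresh 1000×1000 array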

using BenchmarkTools

struct ContainerAbs
    x::AbstractMatrix{Real}    # element type Real is abstract, so the default constructor converts (and allocates)
end
struct ContainerMoreConcrete{T<:Real}
    x::AbstractMatrix{T}       # concrete element type, but still an abstract container type
end
struct ContainerConcrete{M<:AbstractMatrix{<:Real}}
    x::M                       # fully concrete once M is known
end

function RunCheck(N, a::Real)
    display("Benchmarking with datatype $(typeof(a))")
    X = transpose([convert(typeof(a), i*j) for i in 1:N, j in 1:N])
    ContainerAbs(X); ContainerMoreConcrete(X); ContainerConcrete(X); # Ensure everything is compiled
    display(@benchmark ContainerAbs($X))
    display(@benchmark ContainerMoreConcrete($X))
    display(@benchmark ContainerConcrete($X))
end

RunCheck(10, 1)
RunCheck(10, 1.)
RunCheck(10, 1//1)

gives

"Benchmarking with datatype Int64"
BenchmarkTools.Trial: 10000 samples with 464 evaluations.
 Range (min … max):  229.705 ns …   7.052 ΞΌs  β”Š GC (min … max): 0.00% … 96.05%
 Time  (median):     239.043 ns               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   263.967 ns Β± 146.187 ns  β”Š GC (mean Β± Οƒ):  6.61% Β± 11.23%

  β–ˆβ–‡β–…β–ƒβ–„β–ƒβ–‚β–‚β–                                                     β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–†β–†β–…β–…β–…β–„β–„β–„β–ƒβ–β–„β–β–β–ƒβ–β–ƒβ–β–β–ƒβ–β–β–ƒβ–β–ƒβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–…β–‡β–ˆβ–ˆβ–‡β–‡β–†β–† β–ˆ
  230 ns        Histogram: log(frequency) by time        819 ns <

 Memory estimate: 944 bytes, allocs estimate: 2.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.958 ns …  2.720 ΞΌs  β”Š GC (min … max): 0.00% … 99.41%
 Time  (median):     4.250 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.884 ns Β± 29.187 ns  β”Š GC (mean Β± Οƒ):  7.69% Β±  1.40%

     β–ˆ β–‚                                                      
  β–β–ƒβ–„β–ˆβ–ˆβ–ˆβ–†β–†β–‚β–‚β–‚β–‚β–ƒβ–„β–†β–ƒβ–ƒβ–‚β–ƒβ–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– β–‚
  3.96 ns        Histogram: frequency by time        7.04 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.958 ns … 26.750 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.000 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.041 ns Β±  0.411 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–‚ β–ˆ  β–†  β–‚  β–ƒ                                               ▁
  β–ˆβ–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–‡β–β–β–ˆβ–β–β–…β–β–β–†β–β–†β–β–β–†β–β–β–„β–β–β–…β–β–…β–β–β–„β–β–β–…β–β–β–…β–β–„β–β–β–†β–β–β–ˆβ–β–β–†β–β–† β–ˆ
  1.96 ns      Histogram: log(frequency) by time     2.83 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
"Benchmarking with datatype Float64"
BenchmarkTools.Trial: 10000 samples with 220 evaluations.
 Range (min … max):  333.900 ns …  15.987 ΞΌs  β”Š GC (min … max):  0.00% … 96.78%
 Time  (median):     384.850 ns               β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   455.342 ns Β± 466.676 ns  β”Š GC (mean Β± Οƒ):  14.45% Β± 13.43%

  β–‡β–ˆβ–„β–„β–                                                         ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–…β–β–„β–ƒβ–ƒβ–ƒβ–β–ƒβ–β–β–β–β–β–β–ƒβ–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–ƒβ–†β–ˆβ–ˆβ–†β–†β–†β–…β–„β–β–„β–…β–ˆ β–ˆ
  334 ns        Histogram: log(frequency) by time       2.99 ΞΌs <

 Memory estimate: 2.48 KiB, allocs estimate: 102.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.000 ns …  3.191 ΞΌs  β”Š GC (min … max): 0.00% … 99.37%
 Time  (median):     4.542 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   5.299 ns Β± 32.788 ns  β”Š GC (mean Β± Οƒ):  7.38% Β±  1.39%

    β–ˆβ–„β–‚β–„                                                      
  β–‚β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–†β–…β–ˆβ–…β–„β–…β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–β–β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– β–‚
  4 ns           Histogram: frequency by time        9.71 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.917 ns … 84.417 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.083 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.166 ns Β±  1.127 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

    ▁ β–ˆ β–† β–…  β–‡ β–„ β–† β–‚                          ▁    β–ƒ β–‚ ▁     β–‚
  β–ƒβ–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆβ–β–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆβ–β–β–ˆβ–β–ˆβ–β–‡β–β–†β–β–†β–…β–β–†β–β–†β–β–†β–β–†β–β–β–†β–β–‡β–β–ˆβ–β–†β–β–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆβ–β–ˆ β–ˆ
  1.92 ns      Histogram: log(frequency) by time        3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
"Benchmarking with datatype Rational{Int64}"
BenchmarkTools.Trial: 10000 samples with 208 evaluations.
 Range (min … max):  353.562 ns … 21.655 ΞΌs  β”Š GC (min … max):  0.00% … 96.92%
 Time  (median):     413.260 ns              β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   561.200 ns Β±  1.039 ΞΌs  β”Š GC (mean Β± Οƒ):  22.77% Β± 11.79%

  β–ˆβ–…β–‚                                                          ▁
  β–ˆβ–ˆβ–ˆβ–‡β–…β–β–„β–„β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–…β–‡β–ˆ β–ˆ
  354 ns        Histogram: log(frequency) by time      7.81 ΞΌs <

 Memory estimate: 4.05 KiB, allocs estimate: 102.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.000 ns …  2.688 ΞΌs  β”Š GC (min … max): 0.00% … 99.32%
 Time  (median):     4.417 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.956 ns Β± 28.274 ns  β”Š GC (mean Β± Οƒ):  7.15% Β±  1.39%

       β–ˆ                                                      
  β–‚β–‚β–…β–…β–‡β–ˆβ–‡β–‡β–†β–…β–†β–ƒβ–ƒβ–‚β–ƒβ–‡β–„β–„β–„β–ƒβ–ƒβ–„β–‚β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β– β–‚
  4 ns           Histogram: frequency by time        6.88 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.958 ns … 37.666 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.000 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.037 ns Β±  0.473 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–‚ β–ˆ  β–†  β–‚  β–ƒ                                               ▁
  β–ˆβ–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–β–ˆβ–β–β–†β–β–β–‡β–β–β–„β–β–β–…β–β–β–„β–β–…β–β–β–…β–β–β–„β–β–β–…β–β–β–ƒβ–β–β–„β–β–β–„β–β–β–„β–β–β–…β–β–β–‡β–β–„ β–ˆ
  1.96 ns      Histogram: log(frequency) by time     2.79 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

I don’t see what’s wrong with ContainerConcrete for your use case. It provides concrete typing and allows you to use any kind of abstract matrix, no?
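For instance (a quick sketch reusing the struct definitions above), ContainerConcrete specializes on whatever matrix type you hand it, and its declared field type stays concrete:

using LinearAlgebra  # for Diagonal

A = rand(3, 3)
c1 = ContainerConcrete(A)              # ContainerConcrete{Matrix{Float64}}
c2 = ContainerConcrete(transpose(A))   # ContainerConcrete{Transpose{Float64, Matrix{Float64}}}
c3 = ContainerConcrete(Diagonal(A))    # ContainerConcrete{Diagonal{Float64, Vector{Float64}}}

# The declared field type is concrete in every case ...
fieldtype(typeof(c1), :x)   # Matrix{Float64}
fieldtype(typeof(c2), :x)   # Transpose{Float64, Matrix{Float64}}

# ... whereas ContainerMoreConcrete only pins down the element type:
fieldtype(typeof(ContainerMoreConcrete(A)), :x)   # AbstractMatrix{Float64}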

I agree with @mrufsvold that ContainerConcrete is a good way to go. It's worth noting that the reason your ContainerMoreConcrete is slower than ContainerConcrete is that it is still not concrete enough for type inference. Since the field x of ContainerMoreConcrete is declared as an AbstractMatrix{T}, the compiler only knows its abstract type, so downstream results will not always be inferable, as in the example below:

function container_multiply(c, v)
    c.x*v 
end

x = [1.0; 2.0;; 3.0; 4.0]
v = [5.0, 6.0]

cmc = ContainerMoreConcrete(x)
cc = ContainerConcrete(x)

Running @code_warntype on the two containers in the function yields:

@code_warntype container_multiply(cmc, v)
MethodInstance for container_multiply(::ContainerMoreConcrete{Float64}, ::Vector{Float64})
  from container_multiply(c, v) @ Main ~/Files/file.jl:21
Arguments
  #self#::Core.Const(Main.container_multiply)
  c::ContainerMoreConcrete{Float64}
  v::Vector{Float64}
Body::Any
1 ─ %1 = Main.:*::Core.Const(*)
β”‚   %2 = Base.getproperty(c, :x)::AbstractMatrix{Float64}
β”‚   %3 = (%1)(%2, v)::Any
└──      return %3

and

@code_warntype container_multiply(cc, v)
MethodInstance for container_multiply(::ContainerConcrete{Matrix{Float64}}, ::Vector{Float64})
  from container_multiply(c, v) @ Main ~/Files/file.jl:21
Arguments
  #self#::Core.Const(Main.container_multiply)
  c::ContainerConcrete{Matrix{Float64}}
  v::Vector{Float64}
Body::Vector{Float64}
1 ─ %1 = Main.:*::Core.Const(*)
β”‚   %2 = Base.getproperty(c, :x)::Matrix{Float64}
β”‚   %3 = (%1)(%2, v)::Vector{Float64}
└──      return %3
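If you'd rather check this programmatically than read lowered code, a quick sketch with Test.@inferred (from the Test standard library) gives the same verdict:

using Test

@inferred container_multiply(cc, v)    # passes: the return type is inferred as Vector{Float64}
@inferred container_multiply(cmc, v)   # throws: the actual Vector{Float64} does not match the inferred Any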

Thank you both for confirming and clarifying that this is the way to go! I see the type inference advantage clearly now.

What I’m curious to know: is there a way to automatically find such abstract container types in your code?
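One rough way to spot them, sketched here with only Base reflection (abstract_fields is a hypothetical helper, not an existing function): walk a struct's declared field types and flag any that are not concrete.

# Hypothetical helper: list declared fields whose declared type is not concrete.
function abstract_fields(T::DataType)
    [(name, ftype) for (name, ftype) in zip(fieldnames(T), fieldtypes(T)) if !isconcretetype(ftype)]
end

abstract_fields(ContainerAbs)                        # [(:x, AbstractMatrix{Real})]
abstract_fields(ContainerMoreConcrete{Float64})      # [(:x, AbstractMatrix{Float64})]
abstract_fields(ContainerConcrete{Matrix{Float64}})  # empty: every field type is concrete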