Why does wrapping an array in a composite type degrade performance?

mroavi · February 28, 2021, 4:50pm

The code below performs two broadcast multiplications between two 3D arrays. The result of the second is wrapped in a composite type. Does anybody know why the first is faster and allocates less memory? Is it a benchmarking artifact, or does the composite type abstraction incur a performance cost?

using BenchmarkTools

struct MyFactor{vars, card, T}
  vals::T
end

a_vals = rand(2,3,1);
b_vals = rand(2,1,2);

c_vars = (2,3,4);
c_card = (2,3,2);

@btime $a_vals .* $b_vals;
@btime MyFactor{$c_vars, $c_card, Array{Float64, length($c_vars)}}($a_vals .* $b_vals);

Output:

  72.113 ns (1 allocation: 176 bytes)
  282.406 ns (4 allocations: 256 bytes)

orialb · February 28, 2021, 6:58pm

It seems that having those tuples in the type signature is causing the extra allocations.

If I define:

struct MyFactor{ET,N}
  vars::NTuple{N,Int64}
  card::NTuple{N,Int64}
  vals::Array{ET,N}
end

I get

@btime MyFactor{Float64,length($c_vars)}($c_vars, $c_card, $a_vals .* $b_vals)

80.668 ns (1 allocation: 176 bytes)

(and on my machine the bare a_vals .* b_vals takes 78 ns).

I don’t know what your actual use case is, so I can’t say whether the alternative definition is helpful for you or not.
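
As a small sketch of how this definition behaves (the literal values are just the ones from the first post, and I am assuming a 64-bit machine so that integer literals are Int64): the default constructor can work out ET and N from the arguments, so the type parameters do not have to be written by hand.

```julia
# Alternative definition: vars and card are fields, not type parameters
struct MyFactor{ET,N}
  vars::NTuple{N,Int64}
  card::NTuple{N,Int64}
  vals::Array{ET,N}
end

# ET and N are taken from the argument types
f = MyFactor((2,3,4), (2,3,2), rand(2,3,1) .* rand(2,1,2))

typeof(f)           # MyFactor{Float64, 3}
f.vars              # (2, 3, 4), read from the field at run time
size(f.vals)        # (2, 3, 2)
```

The type depends only on the element type and the number of dimensions, so factors with different vars share one concrete type.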

mroavi · March 4, 2021, 9:14am

Thanks @orialb for your suggestion. The reason I wanted to have vars (variables) and card (cardinality or dimension size) as type parameters is to avoid type instability. Here is one example of a function that takes the MyFactor type as an argument:

function marginalize(A::MyFactor{T,N} where N, V::Vector{Int64}) where T
  dims = indexin(V, collect(A.vars)) # map vars to dims
  r_size = ntuple(d->d in dims ? 1 : size(A.vals,d), ndims(A.vals)) # assign 1 to summed out dims
  ret_vars = filter(v -> v ∉ V, A.vars) # keep the vars that are not summed out
  r_vals = similar(A.vals, r_size)
  ret_vals = sum!(r_vals, A.vals) |> x -> dropdims(x, dims=Tuple(dims))
  ret_card = size(ret_vals) # cardinalities of the remaining vars
  MyFactor{eltype(A.vals),length(ret_vars)}(ret_vars, ret_card, ret_vals)
end

Using your MyFactor definition, the compiled code is not type stable.

So ideally I would like to have vars and card in the type (which gives me type stability) but without allocating data on the heap (which apparently is what happens, judging by the benchmark in my first message).

marius311 · March 4, 2021, 9:49am

Breaking up the second test into a function to make it clearer:

test2(a_vals, b_vals, c_vars, c_card) = MyFactor{c_vars, c_card, Array{Float64, length(c_vars)}}(a_vals .* b_vals)
@btime test2($a_vals, $b_vals, $c_vars, $c_card)
@code_warntype test2(a_vals, b_vals, c_vars, c_card)

The last line shows the problem: it’s not type stable, as you noted (which is why it’s slower). It’s not type stable because the type you create depends on the values of c_vars and c_card, since you’re explicitly sticking them into the type, so the compiler can’t know at compile time what type to return.
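
If you prefer a programmatic check over reading the @code_warntype output, here is a minimal sketch that asks inference for the return type directly. It uses Base.return_types, which is an internal reflection helper rather than a stable public API, so treat it as a debugging aid only.

```julia
struct MyFactor{vars, card, T}
  vals::T
end

# The values of c_vars and c_card end up in the returned type
test2(a_vals, b_vals, c_vars, c_card) =
  MyFactor{c_vars, c_card, Array{Float64, length(c_vars)}}(a_vals .* b_vals)

a_vals = rand(2,3,1); b_vals = rand(2,1,2)
c_vars = (2,3,4); c_card = (2,3,2)

# What the compiler can infer when it only knows the argument types
argtypes = typeof.((a_vals, b_vals, c_vars, c_card))
rt = first(Base.return_types(test2, argtypes))
isconcretetype(rt)   # false: the vars and card parameters are unknown

# The type that is actually produced at run time
typeof(test2(a_vals, b_vals, c_vars, c_card))
```

The inferred type is not concrete because two tuples of the same type, e.g. (2,3,4) and (5,6,7), lead to two different MyFactor types.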

If you really want to stick those values into the type, the canonical way to do it in a type-stable way is to use Val types. For example, this has no overhead compared to your first test:

test3(a_vals, b_vals, ::Val{c_vars}, ::Val{c_card}) where {c_vars, c_card} = MyFactor{c_vars, c_card, Array{Float64, length(c_vars)}}(a_vals .* b_vals)
@btime test3($a_vals, $b_vals, $(Val(c_vars)), $(Val(c_card)))

But note that this means that this function, and any function that takes a MyFactor argument, will be recompiled for every different value of c_vars and c_card you use. I would consider whether you truly need those in the type. Perhaps you do (although it’s not clear to me from your last example), but if you don’t, you’ll get faster compilation times and clearer code if you leave them out of the type.
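
To make the trade-off concrete, here is a rough sketch (the second set of vars, (5,6,7), is made up for illustration). With Val the return type should be inferred concretely, but every distinct value gives a distinct type, and each distinct type means another specialization to compile.

```julia
struct MyFactor{vars, card, T}
  vals::T
end

test3(a_vals, b_vals, ::Val{c_vars}, ::Val{c_card}) where {c_vars, c_card} =
  MyFactor{c_vars, c_card, Array{Float64, length(c_vars)}}(a_vals .* b_vals)

a_vals = rand(2,3,1); b_vals = rand(2,1,2)

f1 = test3(a_vals, b_vals, Val((2,3,4)), Val((2,3,2)))
f2 = test3(a_vals, b_vals, Val((5,6,7)), Val((2,3,2)))  # hypothetical second factor

# The values are now part of the argument types, so inference can see them
argtypes = typeof.((a_vals, b_vals, Val((2,3,4)), Val((2,3,2))))
rt = first(Base.return_types(test3, argtypes))
isconcretetype(rt)         # true

# Same shape and element type, but two different types
typeof(f1) == typeof(f2)   # false
```

Any method that accepts a MyFactor is specialized separately for typeof(f1) and typeof(f2), which is where the extra compilation comes from.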
