Parametric Structs

From a performance standpoint, which is better, this:

struct FlexibleSystem{D, S, TCELL} <: AbstractSystem{D}
    particles::AbstractVector{S}
    cell::TCELL
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

or this:

struct FlexibleSystem{D, S,V<:AbstractVector{S}, TCELL} <: AbstractSystem{D}
    particles::V
    cell::TCELL
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

I feel like using an AbstractVector type for a struct field will lead to less efficient code.

I guess that for the second version you mean

struct FlexibleSystem{D, S, V <: AbstractVector{S}, TCELL} <: AbstractSystem{D}
    particles::V
    cell::TCELL
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

That’s usually better because then the compiler knows the exact type of particles. See the manual.
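To make the difference concrete, here is a minimal sketch. The names `AbstractFieldSystem` and `ParamFieldSystem` are hypothetical stand-ins for the two definitions above (not AtomsBase types), and the `cell` and `data` fields are left out to keep it short.

```julia
# Hypothetical stand-ins for the two variants; only the `particles` field is kept.
struct AbstractFieldSystem{S}
    particles::AbstractVector{S}   # field declared with an abstract type
end

struct ParamFieldSystem{S, V <: AbstractVector{S}}
    particles::V                   # field type is the type parameter V
end

a = AbstractFieldSystem{Float64}([1.0, 2.0, 3.0])
p = ParamFieldSystem([1.0, 2.0, 3.0])   # S and V are inferred from the argument

# The declared field type is what the compiler can rely on.
fieldtype(typeof(a), :particles)                    # AbstractVector{Float64}
fieldtype(typeof(p), :particles)                    # Vector{Float64}
isconcretetype(fieldtype(typeof(a), :particles))    # false
isconcretetype(fieldtype(typeof(p), :particles))    # true
```

Note that `typeof(a)` is itself a concrete type in both cases; it is the field inside it that stays abstract in the first variant.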

So why would you ever choose the first option? Why is it even allowed if it’s less performant?

And no, I don’t mean to use your modification. Can you explain why your modification is the right way to do it?

There are cases where you don’t know the exact type ahead of time. (Think of a mutable struct.) Also, the first version might be better if you want to put many objects with different subtypes of AbstractVector{S} into a Vector or some other collection. (I haven’t checked this.)
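A rough sketch of the collection case, again with hypothetical stand-in types (`AbstractFieldSystem`, `ParamFieldSystem`) rather than the real AtomsBase definitions. It only checks element types and does not measure performance.

```julia
# Hypothetical stand-ins for the two variants; only the `particles` field is kept.
struct AbstractFieldSystem{S}
    particles::AbstractVector{S}
end

struct ParamFieldSystem{S, V <: AbstractVector{S}}
    particles::V
end

# Two different subtypes of AbstractVector{Float64}.
dense = [1.0, 2.0, 3.0]   # Vector{Float64}
lazy  = 1.0:3.0           # a range

# Abstract field: both objects share one concrete type, so the container
# has a concrete element type.
abstract_systems = [AbstractFieldSystem{Float64}(dense),
                    AbstractFieldSystem{Float64}(lazy)]
isconcretetype(eltype(abstract_systems))   # true

# Parametric field: the two objects have different concrete types, so the
# container falls back to a non-concrete element type.
param_systems = [ParamFieldSystem(dense), ParamFieldSystem(lazy)]
isconcretetype(eltype(param_systems))      # false
```

So the abstract field moves the type uncertainty inside each object, while the parametric field moves it into the collection.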

Can you explain why your modification is the right way to do it?

As far as I can tell, your second version has an undefined type V. And what would be the purpose of having AbstractVector{S} in the type parameters?

It is allowed because Julia is / aims to be very flexible.

Here, putting V in the type parameters will lead to more compiled code if you construct many structs with different concrete types V, for example StaticArrays’ SVector of many different sizes.

With the abstractly typed field, on the other hand, only one “version” of the struct is instantiated, and methods taking your struct as an argument are not recompiled for each concrete type of the array inside, although those methods will be less performant.

So it’s a tradeoff, and both make sense in certain situations.
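One way to see the "how many versions" side of this tradeoff is to count the distinct concrete struct types that come out of a few different array types. This is only a sketch with hypothetical stand-in types, and it uses plain `Base` arrays instead of StaticArrays so that it runs without extra packages.

```julia
# Hypothetical stand-ins for the two variants; only the `particles` field is kept.
struct AbstractFieldSystem{S}
    particles::AbstractVector{S}
end

struct ParamFieldSystem{S, V <: AbstractVector{S}}
    particles::V
end

# Three different concrete subtypes of AbstractVector{Float64}.
inputs = ([1.0, 2.0], 1.0:2.0, view([1.0, 2.0, 3.0], 1:2))

# Parametric field: one struct type per array type, so methods that take
# the struct get specialized separately for each of them.
n_param = length(unique(typeof(ParamFieldSystem(v)) for v in inputs))

# Abstract field: a single struct type regardless of the array inside.
n_abstract = length(unique(typeof(AbstractFieldSystem{Float64}(v)) for v in inputs))

(n_param, n_abstract)   # (3, 1)
```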

I see why your modification is right. I hadn’t noticed that V wasn’t defined in the type signature.

I just always feel lost when it comes to parametric types and it seems like an important issue when it comes to performance.

It seems that no matter how the struct is defined, as long as concrete types are used upon instantiating, the performance should be optimal. The parameters are just for creating families of types. Is that the right way to think? Or can someone show me a parametric type that is inefficient no matter what?

Yes, you want to have concrete types for best performance. This applies recursively. For example, the type Dict{Symbol,Any} that you use is concrete, but replacing Any by a concrete type would be preferable (if possible).
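A small sketch of the `Dict` point. The keys and values are made up for illustration, `getcharge` is a hypothetical helper, and `Base.return_types` is a reflection utility used here only to peek at what inference concludes.

```julia
# Same contents, different value type parameter.
data_any = Dict{Symbol, Any}(:charge => 1.0, :mass => 2.0)
data_f64 = Dict{Symbol, Float64}(:charge => 1.0, :mass => 2.0)

# Both Dict types are concrete...
isconcretetype(typeof(data_any))   # true
isconcretetype(typeof(data_f64))   # true

# ...but the values of the first one are only known to be `Any`.
valtype(data_any)                  # Any
valtype(data_f64)                  # Float64

# Hypothetical accessor: what can the compiler infer for a lookup?
getcharge(d) = d[:charge]
Base.return_types(getcharge, (typeof(data_any),))[1]   # Any
Base.return_types(getcharge, (typeof(data_f64),))[1]   # Float64
```

Code that consumes a value from the `Any` version has to dispatch at run time, which is the "applies recursively" part.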

The code you provided has invalid syntax; @matthias314 presumably inferred valid code from what you apparently intended.

I second this. See the Performance Tips for a start.

Here are two examples from AtomsBase:

struct FastSystem{D, TCELL, L <: Unitful.Length, M <: Unitful.Mass, S} <: AbstractSystem{D}
    cell::TCELL
    position::Vector{SVector{D, L}}
    species::Vector{S}
    mass::Vector{M}
end

and

struct FlexibleSystem{D, S, TCELL} <: AbstractSystem{D}
    particles::AbstractVector{S}
    cell::TCELL
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

From the names, I gather that one structure is more flexible, and possibly less efficient/performant, than the other. Is this because the type of particles is abstract, and therefore its type can be changed?