How to avoid containers of abstract parametric types?

AUK1939 · May 22, 2024, 4:15pm

I have defined a set of structs representing various financial instruments

abstract type Asset end

mutable struct Stock <: Asset
    symbol::String
    company::Int64
end

mutable struct Bond <: Asset
    symbol::String
    currency::String
end

mutable struct FxSpot <: Asset
    ccy_pair::String
    notional::Float64
end

A parametric Trade type

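using Dates  # provides DateTime

# The abstract supertype needs a name distinct from the concrete struct
abstract type AbstractTrade end

@enum LongShort Long Short

struct Trade{T <: Asset} <: AbstractTrade
    side::LongShort
    instrument::T
    quantity::Float64
    price::Float64
    trade_date::DateTime
end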

And a portfolio containing a list of trades

struct Portfolio
    trades::Vector{Trade} 
end

The problem is that iterating over trades is very slow due to type instability, so I lose the performance benefits of Julia with this design.

This is a convenient design in that I can write things like

sum([t.price for t in portfolio.trades])

But how can I do this in a more performant way and keep type stability, without creating a separate vector for each trade type? That is cumbersome to maintain:

struct Portfolio
    stock_trades::Vector{Trade{Stock}}
    bond_trades::Vector{Trade{Bond}}
    fx_trades::Vector{Trade{FxSpot}}
    # and all the other potential instrument types 
end

What is the Julia way to accomplish this? Is there a good Julia pattern available to avoid this?

gdalle · May 22, 2024, 5:23pm

I seem to remember SumTypes.jl being mentioned as a solution to this problem, but I have never investigated further.

CameronBieganek · May 22, 2024, 7:22pm

The first question to ask is whether your code such as sum([t.price for t in portfolio.trades]) is performance critical or not. If it’s not, then this might be premature optimization.

If performance is critical, then StructArrays.jl might be one option to look at.

Along the lines of the SumTypes.jl recommendation from @gdalle, you can make the Trade type completely concrete by squishing all the Asset structs into the Trade struct:

@enum Asset Stock Bond FxSpot

struct Trade
    side::LongShort
    asset::Asset
    symbol::String
    company::Int64
    currency::String
    ccy_pair::String
    notional::Float64
    quantity::Float64  
    price::Float64
    trade_date::DateTime
end

Then you would need some branching logic at various places in your code, and default values for asset related fields when those fields are not applicable to the current instance. It’s not very pretty, but it gets the job done.
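A minimal sketch of what the default values and branching logic could look like. The names FlatTrade, AssetKind, stock_trade, fx_trade and label are hypothetical and exist only for this illustration; the empty-string and zero defaults are an assumption, not a recommendation from the thread.

```julia
using Dates

@enum LongShort Long Short
@enum AssetKind StockKind BondKind FxSpotKind  # hypothetical tag names

# One concrete struct holding the fields of every asset kind
struct FlatTrade
    side::LongShort
    asset::AssetKind
    symbol::String
    company::Int64
    currency::String
    ccy_pair::String
    notional::Float64
    quantity::Float64
    price::Float64
    trade_date::DateTime
end

# Hypothetical constructors: fields that do not apply get a default value
stock_trade(side, symbol, company, quantity, price, date) =
    FlatTrade(side, StockKind, symbol, company, "", "", 0.0, quantity, price, date)
fx_trade(side, ccy_pair, notional, quantity, price, date) =
    FlatTrade(side, FxSpotKind, "", 0, "", ccy_pair, notional, quantity, price, date)

# Branch on the enum tag instead of dispatching on a type
function label(t::FlatTrade)
    if t.asset == StockKind
        return t.symbol
    elseif t.asset == BondKind
        return string(t.symbol, " (", t.currency, ")")
    else
        return t.ccy_pair
    end
end

trades = [
    stock_trade(Long, "TSLA", 420, 10.0, 180.0, DateTime(2024, 5, 22)),
    fx_trade(Short, "USD/EUR", 1100.5, 1.0, 0.92, DateTime(2024, 5, 22)),
]

# The element type is concrete, so this loop is type stable
total = sum(t.price for t in trades)
```

The cost of this layout is that every instance carries all fields, and each new asset kind means touching the struct, the constructors and every branch.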

Otherwise, something like your Portfolio struct with Vector fields is not a bad way to go either.
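One way to make the separate-vectors layout less cumbersome is a small helper that returns all typed vectors as a tuple, so aggregate functions do not have to name each field. This is only a sketch: trade_groups and total_price are hypothetical helpers, and Trade is reduced to the fields needed here.

```julia
abstract type Asset end

struct Stock <: Asset
    symbol::String
    company::Int64
end

struct Bond <: Asset
    symbol::String
    currency::String
end

# Simplified Trade: only the fields needed for this sketch
struct Trade{T<:Asset}
    instrument::T
    quantity::Float64
    price::Float64
end

struct Portfolio
    stock_trades::Vector{Trade{Stock}}
    bond_trades::Vector{Trade{Bond}}
end

# Hypothetical helper: the typed vectors as a tuple
trade_groups(p::Portfolio) = (p.stock_trades, p.bond_trades)

# Each inner sum runs over a concretely typed vector; `init` covers empty groups
total_price(p::Portfolio) =
    sum(v -> sum(t -> t.price, v; init = 0.0), trade_groups(p))

p = Portfolio(
    [Trade(Stock("TSLA", 420), 10.0, 180.0)],
    [Trade(Bond("^TYX", "USD"), 5.0, 99.5), Trade(Bond("^TNX", "USD"), 2.0, 101.0)],
)

total_price(p)
```

Adding an instrument type then means adding one field and one entry in trade_groups, while the aggregate functions stay unchanged.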

AUK1939 · May 22, 2024, 7:27pm

Thanks. Yes, this is performance critical.

So far the two solutions are to use a wider struct with an enum or to use SumTypes.jl.

I’m struggling to see how to implement the SumTypes.jl solution. Could you provide some hints?

Thanks

simsurace · May 22, 2024, 7:53pm

I wonder why your assets are mutable. Isn’t an asset with different fields a different asset? Depending on what you do in your loop over assets, mutability could have a non-negligible cost.

AUK1939 · May 22, 2024, 8:04pm

Good point. I’ll see if I can refactor this, but I think the main bottleneck right now is the type instability in the solution.

sgaure · May 22, 2024, 8:35pm

Have you tried defining Trade with instrument::Asset instead of with a parameter?
I tried a small example, with a sum like you suggested. It ran about 40 times faster than with the parametrized Trade. However, this could be because there are only three subtypes of Asset.
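A minimal sketch of that variant, with Trade reduced to the fields needed for the sum (an assumption made to keep the example short). Because Trade has no type parameter, Vector{Trade} has a concrete element type, and reading t.price never touches the abstractly typed field.

```julia
abstract type Asset end

struct Stock <: Asset
    symbol::String
    company::Int64
end

struct FxSpot <: Asset
    ccy_pair::String
    notional::Float64
end

# Non-parametric Trade: the instrument field is abstractly typed,
# but Trade itself is a single concrete type
struct Trade
    instrument::Asset
    quantity::Float64
    price::Float64
end

trades = [
    Trade(Stock("TSLA", 420), 10.0, 180.0),
    Trade(FxSpot("USD/EUR", 1100.5), 1.0, 0.92),
]

isconcretetype(eltype(trades))       # Vector{Trade} has a concrete element type
total = sum(t -> t.price, trades)    # reads only the Float64 field
```

Any code that does work on t.instrument itself still goes through dynamic dispatch, so the speedup applies mainly to loops that only touch the concretely typed fields.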

j-fu · May 22, 2024, 8:36pm

You can try to leverage union splitting. Define a union type of all possible trades and use this to declare your vector.
Here is an example:

using BenchmarkTools

N=100
struct A{T}
    a::T
end

function myfill!(X)
    for i=1:length(X)
        if i%3==0
            X[i]=A(Int16(i))
        elseif i%3==1
            X[i]=A(Int32(i))
        elseif i%3==2
            X[i]=A(Int64(i))
        end
    end
end

function mysum(X)
    s::Int64=0
    for x in X
        s+=x.a
    end
    s
end

VA=Vector{A}(undef,N)
myfill!(VA)
@show mysum(VA)
@btime mysum(VA)

const U=Union{A{Int16}, A{Int32}, A{Int64}}
VU=Vector{U}(undef,N)
myfill!(VU)
@show mysum(VU)
@btime mysum(VU)

It gives

mysum(VA) = 5050
3.109 μs (138 allocations: 2.16 KiB)
mysum(VU) = 5050
106.416 ns (1 allocation: 16 bytes)

I wrote about union splitting here.
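Applied to the trades from the original question, the same idea might look like the sketch below. AnyTrade is a hypothetical alias, Trade is reduced to three fields for brevity, and the benefit is only expected while the union stays small.

```julia
abstract type Asset end

struct Stock <: Asset
    symbol::String
    company::Int64
end

struct Bond <: Asset
    symbol::String
    currency::String
end

struct FxSpot <: Asset
    ccy_pair::String
    notional::Float64
end

# Simplified parametric Trade
struct Trade{T<:Asset}
    instrument::T
    quantity::Float64
    price::Float64
end

# Hypothetical alias listing every concrete trade type in the portfolio
const AnyTrade = Union{Trade{Stock}, Trade{Bond}, Trade{FxSpot}}

trades = AnyTrade[
    Trade(Stock("TSLA", 420), 10.0, 180.0),
    Trade(Bond("^TYX", "USD"), 5.0, 99.5),
    Trade(FxSpot("USD/EUR", 1100.5), 1.0, 2.5),
]

# With a small Union element type the compiler can branch on each member
function total_price(trades)
    s = 0.0
    for t in trades
        s += t.price
    end
    return s
end

total_price(trades)
```

The alias has to be updated by hand whenever a new instrument type is added, which is the main maintenance cost of this approach.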

Tortar · May 22, 2024, 9:18pm

You can also use MixedStructTypes.jl (soon to be renamed DynamicSumTypes, which I think is a better name). It should have the same performance as SumTypes.jl because it is based on it (or sometimes slightly better, thanks to another option I explain below), but it is very simple to work with. For example, in your case:

using Dates, MixedStructTypes

abstract type AbstractAsset end

@sum_structs Asset <: AbstractAsset begin
	mutable struct Stock
	    symbol::String
	    company::Int64
	end
	mutable struct Bond
	    symbol::String
	    currency::String
	end
	mutable struct FxSpot
	    ccy_pair::String
	    notional::Float64
	end
end

abstract type AbstractTrade end

@enum LongShort Long Short

struct Trade <: AbstractTrade
    side::LongShort
    instrument::Asset
    quantity::Float64  
    price::Float64
    trade_date::DateTime
end

You can also use @compact_structs in the same way; internally it implements the strategy @CameronBieganek mentioned instead. You work with these types almost as with any other Julia type, in the sense that you can dispatch on any signature containing them with the help of a macro. For example:

@dispatch price(::Stock) = 1000
@dispatch price(::Bond) = 2000
@dispatch price(::FxSpot) = 3000

This avoids static pattern matching, so you can now simply do sum(price(t.instrument) for t in portfolio.trades).

The only limitation of this approach is that you can’t dispatch on types enclosing the different variants: for example, you can only dispatch on Vector{Asset}, not on Vector{Bond}, because there is really only a single type involved. This usually isn’t a problem, because you are operating on heterogeneous containers anyway. The package has already been used in Agents.jl, so it has proved useful somewhere. But I’m the author of the package, so I could be biased.

AUK1939 · May 23, 2024, 2:27am

@Tortar This is a very interesting approach.

I was wondering if it can handle the following situation. Add to the mix an equity or stock option, which is a type of AbstractAsset but a little more specialized.

abstract type AbstractOption <: AbstractAsset end

mutable struct EquityOption <: AbstractOption
    type::CallPut
    symbol::String
    expiry_date::DateTime
    strike::Float64
end

Obviously I need special handling for options. Some functions are dispatched on just AbstractOption and some on EquityOption. I was wondering if this situation can be handled with @sum_structs? I don’t want to define all my assets as AbstractAsset. In some cases I want to be more specific, so that I can dispatch on a more specific abstract type.

mkitti · May 23, 2024, 4:22am

Besides a sum type, you could create a custom generic type.

mutable struct GenericAsset{T} <: Asset
    identifier_name::Symbol
    attribute_name::Symbol
    identifier::String   
    attribute::T
end

function Base.convert(
    ::Type{GenericAsset{T}},
    asset::A
) where {T, A <: Asset}
    GenericAsset{T}(
        fieldnames(A)...,
        getfield.((asset,), 1:fieldcount(A))...
    )
end

Now you can convert any Asset to a GenericAsset{Any}.

julia> tesla = Stock("TSLA", 420)
Stock("TSLA", 420)

julia> thirty_year_usa = Bond("^TYX", "USD")
Bond("^TYX", "USD")

julia> usd_eur = FxSpot("USD/EUR", 1100.5)
FxSpot("USD/EUR", 1100.5)

julia> assets = GenericAsset{Any}[tesla, thirty_year_usa, usd_eur]
3-element Vector{GenericAsset{Any}}:
 GenericAsset{Any}(:symbol, :company, "TSLA", 420)
 GenericAsset{Any}(:symbol, :currency, "^TYX", "USD")
 GenericAsset{Any}(:ccy_pair, :notional, "USD/EUR", 1100.5)

GenericAsset{Any} is a concrete type with an abstract field. It has a concrete identifier field which will always be String.

DNF · May 23, 2024, 5:13am

Also notice the lack of square brackets in the last expression, @AUK1939. The brackets indicate an array comprehension, so first an array is created and afterwards it is summed. That’s redundant. Without the brackets, the sum is calculated directly with no redundant temporary array.

I don’t know if it makes much difference here, but you did say it’s performance critical.
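A small sketch of the difference, using a hypothetical PricedTrade stand-in with only a price field. All three forms give the same result; only the first one allocates a temporary array.

```julia
# Hypothetical minimal stand-in for a trade
struct PricedTrade
    price::Float64
end

trades = [PricedTrade(180.0), PricedTrade(99.5), PricedTrade(2.5)]

with_array     = sum([t.price for t in trades])  # builds a temporary Vector{Float64} first
with_generator = sum(t.price for t in trades)    # sums lazily, no temporary array
with_function  = sum(t -> t.price, trades)       # same result, passing the accessor to sum
```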

Tortar · May 23, 2024, 3:14pm

Obviously I need special handling for options. Some functions are dispatched on just AbstractOption and some on EquityOption. I was wondering if this situation can be handled with @sum_structs? I don’t want to define all my assets as AbstractAsset. In some cases I want to be more specific, so that I can dispatch on a more specific abstract type.

I can think of mainly two strategies:

1. Add another type to the mix, also defined with @sum_structs, that contains only the different kinds of options. As usual, this is okay for performance with up to three types in the same container.
2. Don’t dispatch on abstract types, only on the concrete implementations. This should be fine if you don’t need to handle containers holding only one specialized type of asset; if that happens infrequently, you can instead check whether all instances are of a certain type.
