[Ann] MixedStructTypes.jl - Combine multiple types in a single one

I’m happy to announce MixedStructTypes.jl, which lets you combine multiple mutable and immutable types into a single one, with a syntax similar to that of ordinary structs. The resulting types behave like structs for many operations, because many Base functions are implemented to match the interface of normal structs. The point of working with a single type instead of many is to avoid dynamic dispatch and abstractly typed containers, both of which carry a large performance penalty.
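To make the motivation concrete, here is a minimal Base-only sketch of the difference between a container with an abstract element type and one holding a single concrete type. The names `AbstractThing`, `Thing1`, `Thing2` and `Merged` are hypothetical and only for illustration; this is not how the package implements its types, just the general idea of replacing several types with one tagged type.

```julia
# Hypothetical types for illustration only; not part of MixedStructTypes.jl.
abstract type AbstractThing end

struct Thing1 <: AbstractThing
    a::Int
end

struct Thing2 <: AbstractThing
    a::Int
    b::ComplexF64
end

# A single concrete type standing in for both, with a tag saying which one it is.
struct Merged
    kind::Symbol
    a::Int
    b::ComplexF64   # unused when kind === :Thing1
end

# Element type is abstract: accessing x.a must be resolved at run time.
abstract_vec = AbstractThing[Thing1(1), Thing2(2, 1 + 1im)]

# Element type is concrete: the compiler knows the layout of every element.
merged_vec = [Merged(:Thing1, 1, 0), Merged(:Thing2, 2, 1 + 1im)]

println(isconcretetype(eltype(abstract_vec)))  # false
println(isconcretetype(eltype(merged_vec)))    # true
println(sum(x.a for x in merged_vec))          # 3
```

The macros in the package aim to give this kind of single concrete type without having to write the merged struct by hand.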

Two different macros are available, with different memory and speed characteristics. The macro based on SumTypes.jl is also a bit more general, because it can mix mutable and immutable structs, and fields belonging to different structs can have different types. Both macros already support parametric types and default values for fields. I think that having a syntax so close to that of structs should make it easier to integrate this package into other ones.

An example of usage and a little performance comparison between the two macros are already present in the ReadMe.

Let’s also look at how these two macros compare with the one from Unityper.jl. They offer more features, since they allow parametric mutable and immutable structs while Unityper only allows non-parametric immutable structs, and they also perform well. To show that, I repeat the benchmark from the ReadMe, but with fewer types for the sake of brevity:

using MixedStructTypes, Unityper, BenchmarkTools

@compactify begin
    @abstract struct AT end
    struct A <: AT 
        a::Int = 1 
    end
    struct B <: AT 
        a::Int = 2
        b::Complex = 1 + 1im
    end
end

@compact_struct_type @kwdef CT begin
    struct C 
        a::Int = 1 
    end
    struct D 
        a::Int = 2
        b::Complex = 1 + 1im 
    end
end

@sum_struct_type @kwdef ET begin
    struct E 
        a::Int = 1 
    end
    struct F 
        a::Int = 2
        b::Complex = 1 + 1im 
    end
end

vec_a = AT[rand((A,B))() for _ in 1:10^6];
vec_c = CT[rand((C,D))() for _ in 1:10^6];
vec_e = ET[rand((E,F))() for _ in 1:10^6];

We look at both time and memory:

julia> @btime sum(x.a for x in $vec_a);
  937.911 μs (0 allocations: 0 bytes)

julia> @btime sum(x.a for x in $vec_c);
  715.359 μs (0 allocations: 0 bytes)

julia> @btime sum(x.a for x in $vec_e);
  3.936 ms (0 allocations: 0 bytes)

julia> Base.summarysize(vec_a)
35791632

julia> Base.summarysize(vec_c)
35487008

julia> Base.summarysize(vec_e)
12820240

As you can see, in this very simple (and so not too informative) benchmark, @compact_struct_type is faster than Unityper’s @compactify while using about the same amount of memory, whereas @sum_struct_type is much more memory efficient than the other two macros, at the cost of being slower here.
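For a rough sense of scale, the reported figures can be normalized per element and relative to the fastest run. The snippet below is only arithmetic on the numbers printed above (10^6 elements per vector), not a new measurement; the container names are reused as labels.

```julia
n = 10^6  # number of elements in each benchmarked vector

# Base.summarysize results reported above, in bytes
sizes = (vec_a = 35791632, vec_c = 35487008, vec_e = 12820240)

# @btime results reported above, converted to microseconds
times = (vec_a = 937.911, vec_c = 715.359, vec_e = 3936.0)

for name in keys(sizes)
    bytes_per_elem = round(sizes[name] / n; digits = 1)
    slowdown = round(times[name] / times.vec_c; digits = 2)
    println(name, ": ", bytes_per_elem, " bytes per element, ",
            slowdown, "x the time of vec_c")
end
```

This works out to roughly 36, 35 and 13 bytes per element respectively, with the sum over `vec_e` taking about 5.5 times as long as the one over `vec_c`.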

For those interested, the package is already available in the General registry. Issues and PRs are very welcome :slight_smile:


MixedStructTypes.jl has reached version 0.2, which comes with an improved syntax. For instance, one of the definitions in the example above becomes:

@compact_structs CT begin
    @kwdef struct C 
        a::Int = 1 
    end
    @kwdef struct D 
        a::Int = 2
        b::Complex = 1 + 1im 
    end
end

I also fixed and tested many edge cases, and added some enhancements, such as the ability to constrain type parameters. Let me know if you have suggestions for new functionality!


I released a new minor version of the package, which now includes a feature I conceived some time ago: you can now dispatch on the variants of the type! I’m still working on making it possible to dispatch on the overall type, for when one wants a default function for a subset of the variants (or kinds, in the jargon of the package). For now, you need to specify a function for every kind. But let me give a little example:

using MixedStructTypes
abstract type AbstractShape end
@sum_structs Shape <: AbstractShape begin
    @kwdef struct Square
        side::Float64 = rand()
    end   
    @kwdef struct Rectangle
        width::Float64 = rand()
        height::Float64 = rand()
    end
    @kwdef struct Triangle
        base::Float64 = rand()
        height::Float64 = rand()
    end
    @kwdef struct Circle
        radius::Float64 = rand()
    end
end
@dispatch area(s::Square) = s.side * s.side
@dispatch area(r::Rectangle) = r.width * r.height
@dispatch area(t::Triangle) = 1.0/2.0 * t.base * t.height
@dispatch area(c::Circle) = π * c.radius^2

Here, the @dispatch macro supports the usual syntax for function definitions, and it also performs well:

julia> using BenchmarkTools

julia> count = 1_000_000;

julia> shapes = [rand((Square,Rectangle,Triangle,Circle))() for _ in 1:count];

julia> @benchmark sum(area(s) for s in $shapes)
BenchmarkTools.Trial: 966 samples with 1 evaluation.
 Range (min … max):  5.131 ms …  5.423 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.158 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.171 ms ± 37.811 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▁▄▆███▇▅▄▂▁        ▁ ▁
  ▄▆███████████▇▅▇██████████▆█▇▆█▅█▆▅▆▆▅▆▄▄▅▅▅▅▄▁▄▄▁▁▅▅▁▁▁▄▄ █
  5.13 ms      Histogram: log(frequency) by time     5.34 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

Previously, you would have had to write:

function area(sh)
    if kindof(sh) === :Square
        return area_square(sh)
    elseif kindof(sh) === :Rectangle
        return area_rectangle(sh)
    elseif kindof(sh) === :Triangle
        return area_triangle(sh)
    elseif kindof(sh) === :Circle
        return area_circle(sh)
    end
    error()
end
   
area_square(s) = s.side * s.side
area_rectangle(r) = r.width * r.height
area_triangle(t) = 1.0/2.0 * t.base * t.height
area_circle(c) = π * c.radius^2

In some more advanced cases, doing it manually would have been even more verbose, e.g. when several arguments are variants.
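To see how quickly manual branching grows with two variant arguments, here is a small Base-only sketch. `TaggedShape`, `kind` and `fits_inside` are hypothetical names made up for this illustration (with `kind` playing the role that `kindof` plays above); the package is not used, and only two kinds are included to keep it short.

```julia
# Hypothetical stand-in for a sum type: one concrete struct with a tag.
struct TaggedShape
    kind::Symbol     # :Square or :Circle
    size::Float64    # side length for squares, radius for circles
end

kind(s::TaggedShape) = s.kind

# Can `a` be placed (centered, axis-aligned) inside `b`?
# Every pair of kinds needs its own branch: 2 kinds already give 4 branches.
function fits_inside(a, b)
    ka, kb = kind(a), kind(b)
    if ka === :Square && kb === :Square
        return a.size <= b.size
    elseif ka === :Square && kb === :Circle
        return a.size * sqrt(2) <= 2 * b.size   # square diagonal vs circle diameter
    elseif ka === :Circle && kb === :Square
        return 2 * a.size <= b.size             # circle diameter vs square side
    elseif ka === :Circle && kb === :Circle
        return a.size <= b.size
    end
    error("unsupported combination of kinds: $ka, $kb")
end

println(fits_inside(TaggedShape(:Circle, 1.0), TaggedShape(:Square, 2.0)))  # true
println(fits_inside(TaggedShape(:Square, 2.0), TaggedShape(:Circle, 1.0)))  # false
```

With the four shapes from the example above, the same function would need sixteen branches, which is the kind of boilerplate that writing one method per combination avoids.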

Although I’m still working on it, the macro is already fairly general: it can dispatch on parametrized variants, variants can appear at any position in the arguments, keyword arguments are supported, and so on.
