Composition and inheritance: the Julian way

Ok, I see the point. Thanks for clarifying.

In summary, if we want to extend/customize a structure such as:

struct Person <: AbstractPerson
    name::String
    age::Int
end

and use the new structure (call it Citizen) in exactly the same way as we use a Person, we have two options:

  • composition:
struct Citizen
    person::Person
    nationality::String
end

In this case we are not tied to the Person layout, but we have to re-define all the methods accepting a Person object. This process can be automated using macros, but we still get lots of additional entries in the dispatch table.

  • inheritance:
abstract type AbstractCitizen <: AbstractPerson end
mutable struct Citizen <: AbstractCitizen
    name::String
    age::Int
    nationality::String
end

In this case we are tied to the Person layout, but we can use one of the many available macros (e.g. the @def in @ChrisRackauckas examples) to completely lift this dependency.
Moreover all the methods accepting a Person will work seamlessly without flooding the dispatch table.
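As a rough illustration of how the forwarding in the composition option can be generated rather than hand-written, here is a minimal sketch. The accessor functions `name` and `age` are assumptions made for this example (the question above only defines the fields), and the loop over `@eval` stands in for whatever forwarding macro one might actually use.

```julia
abstract type AbstractPerson end

struct Person <: AbstractPerson
    name::String
    age::Int
end

# Hypothetical accessor functions that form the Person "interface".
name(p::Person) = p.name
age(p::Person) = p.age

# Composition: a Citizen holds a Person instead of repeating its fields.
struct Citizen
    person::Person
    nationality::String
end

# Generate one forwarding method per accessor instead of writing each by hand.
for f in (:name, :age)
    @eval $f(c::Citizen) = $f(c.person)
end

c = Citizen(Person("Ada", 36), "UK")
println(name(c), ", ", age(c), ", ", c.nationality)
```

Each forwarded function still adds one method to the method table, which is the cost being discussed here.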

To solve the problem presented in this post, is there any valid reason or real use case to choose composition over inheritance? If any, please post a short example :wink:

Not at all. There is no inheritance per se in Julia, so you can also do

mutable struct Citizen <: AbstractCitizen
    age::Int     # note order
    name::String
    nationality::String
end

and it should not matter if you are using slot names (and using slot positions would be silly). That said, while inheritance can be emulated, it is usually not the idiomatic approach in Julia, so it is misleading to treat it as an alternative to composition, just like using a Dict should not be considered a serious “alternative” either.
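A small sketch of the point about slot names: a method that reads fields by name works the same regardless of the order in which the fields are declared. The type names `CitizenA` and `CitizenB` and the function `label` are made up for this illustration.

```julia
abstract type AbstractPerson end

mutable struct CitizenA <: AbstractPerson
    name::String
    age::Int
    nationality::String
end

mutable struct CitizenB <: AbstractPerson
    age::Int            # different field order
    name::String
    nationality::String
end

# Accesses fields by name only, so the declared order is irrelevant.
label(p::AbstractPerson) = string(p.name, " (", p.age, ")")

a = CitizenA("Ada", 36, "UK")
b = CitizenB(36, "Ada", "UK")
println(label(a))
println(label(b))
```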

The point here is to distinguish the two approaches. I keep making the mistake of attaching names to them, and I am immediately corrected. This is OK, but please could we focus on the two approaches, regardless of their names?

To solve the problem presented in this post, is there any valid reason or real use case to choose approach 1 (mistakenly called composition) over approach 2 (mistakenly called inheritance)?

It is not mistakenly called composition, it is composition. The reason to use it is that it is the idiomatic solution for Julia, as explained by @ChrisRackauckas above.

I will definitely do it after I am convinced that it is worth the time. I am asking here for help because I may not be aware of all the possible consequences.

If we all agree there is nothing wrong in repeating struct fields in structures like Citizen, and that the rules outlined above are the best way to implement an interface, I will surely do it.

If the community approves an approach I believe it is easier for the PR to be accepted… :wink:

No, you just chose too simple a problem. The moment you go one step higher though it’s clear what happens. Let’s say you want to extend an array type to have metadata, like how DEDataArray does. There are two ways to do it. One way is composition.

mutable struct MyDataArray{T,N,A} <: DEDataArray{T,N}
    x::A
    a::T
    b::Symbol
end

(homework: fix my triangular dispatch). Now just forward the array interface onto x and you’re good.
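To make "forward the array interface onto x" concrete, here is a minimal standalone sketch. It does not use DEDataArray; the type `MetaArray` is a hypothetical stand-in that subtypes `AbstractArray` directly, and the explicit outer constructor is included as an assumption so that the type parameters are always inferred from the wrapped array.

```julia
# Hypothetical wrapper: any AbstractArray plus two metadata fields.
struct MetaArray{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    x::A
    a::T
    b::Symbol
end

# Outer constructor that infers T, N and the storage type from the wrapped array.
MetaArray(x::AbstractArray{T,N}, a::T, b::Symbol) where {T,N} =
    MetaArray{T,N,typeof(x)}(x, a, b)

# Forward the core of the AbstractArray interface to the wrapped array.
Base.size(m::MetaArray) = size(m.x)
Base.getindex(m::MetaArray{T,N}, i::Vararg{Int,N}) where {T,N} = m.x[i...]
Base.setindex!(m::MetaArray{T,N}, v, i::Vararg{Int,N}) where {T,N} = (m.x[i...] = v)

dense = MetaArray([1.0, 2.0, 3.0], 0.5, :dense)   # wraps a Vector
lazy  = MetaArray(1.0:3.0, 0.5, :range)           # wraps a range, same code

println(sum(dense), " ", sum(lazy))
dense[1] = 10.0                                    # writes through to dense.x
```

The wrapper never looks inside `x`, so the same three forwarded methods serve both the dense vector and the range.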

Inheritance…?

julia> fieldnames([1,2,3])
0-element Array{Symbol,1}

Arrays are primitives in Julia so you can’t access their data… so haha this didn’t work out too well.

Now let’s say we want to do this with a sparse matrix. Composition already works with a sparse matrix. For inheritance, you’d have to add in these fields:

julia> fieldnames(sprand(10,10,0.1))
5-element Array{Symbol,1}:
 :m
 :n
 :colptr
 :rowval
 :nzval

and do a few overrides to make it act just like a SparseMatrixCSC (and make it an AbstractSparseMatrix, let’s assume that has enough generic methods to work easily) but with metadata. Okay, so extra work but still doable.

But what about if you wanted a BandedMatrix? Well, this DEDataArray package code via an interface with composition already works because it still doesn’t care about the underlying data representation of the x field. For the inheritance way, you’d have to make a new type and add in the fields of a banded matrix and add some overrides.

So let’s see the tally.

Composition: 1 type, 1 set of overrides (inherited from a package so users don’t have to do it).

Inheritance: 1 new type each time you want to use a new matrix type (since it needs the structure of your new matrix), this doesn’t work with arrays (so it kind of defeats the purpose because the “simplest” case doesn’t work), and the user has to do the dirty details of forwarding array implementations into the type definitions.

The problem with inheritance is “array with metadata” is an abstract idea that doesn’t care that a sparse matrix is implemented by rowval with colptr meaning how many values per column to point to data stored in nzval. Those are completely unnecessary details that inheritance formulations have to pull in when doing an extension. However, DEDataArray essentially says “put the array that you want here, then put the metadata below it”. That works with any array type for obvious reasons, and if there’s a performance concern you can specialize some of the package functions as needed on certain classes of functions which you know have faster/slower access (again, not on exact implementation details, but on classes or abstractions of implementation… based on how they act!). DEDataArray doesn’t actually need an array in there. If you created a type like the Strang from SpecialMatrices.jl then this will forward the actions so it still acts like a matrix, but with metadata. It really doesn’t care what you put there, as long as it acts correctly, and neither does any code that uses it.

So yes, there can be some reasons for extensions if something really requires that the extender should have exactly the same data representation. However, I find that is more the exception than the rule, at least in numerical mathematics. You can always fight against this oncoming train, but the reason why people warn against over-use of inheritance is because if the two objects aren’t metaphysically required to have the same layout, then somewhere down the line engineer A will find a nicer/better/faster representation for the simpler form and break the extension.

struct decorated{basetype, decType}
    parent::basetype
    decoration::decType
end

I don’t need to know the basetype at coding-time.
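A brief sketch of what this buys in practice. The capitalized name `Decorated` and the forwarded `Base.length` method are choices made for this illustration only; the idea is that the same definition wraps unrelated parent types without naming them in advance.

```julia
# Parametric wrapper: the parent type is filled in when an instance is built.
struct Decorated{B,D}
    parent::B
    decoration::D
end

# One forwarding method covers every parent type that supports `length`.
Base.length(d::Decorated) = length(d.parent)

d1 = Decorated([1, 2, 3], "units: m")   # wraps a Vector{Int}
d2 = Decorated("hello", :greeting)      # wraps a String

println(length(d1), " ", length(d2))
```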


Quite a while ago, I tried making SubDataFrame encapsulate an AbstractDataFrame. It worked fine, but I got an avalanche of method ambiguities. That’s handled better now, so it’d be great if someone gave this another shot. It’s the right approach.

I just want to clarify my suggested use of Mixers.jl, and when I would use it instead of aggregated composition.

I’m often working with multiple formulations of physiological processes that share a subset of parameters. The formulation is a method despatching on a type that holds the necessary parameters. But they don’t actually inherit any behaviours, they are just dispatched to run a particular version of a formulation, using some custom parameters, and some common parameters that represent the same physical properties - and have the same Parameters.jl defaults that I don’t want to duplicate.

They could be aggregated types but this would actually add non-existent interdependence between them, they would all need to access the same composed field in the method they dispatch on. It would also deepen the nesting, and the formulation methods would be harder to read. So I use mixins for those fields. It’s mostly for cleaning up inconsequential duplication, not organisation inheritance.

You could build concrete type inheritance with it as mixins can operate on mixins, and use Holy traits for the dispatch hierarchy, which could even be automated. But I haven’t tried that, and it might be insane. But it would be more flexible than OOP concrete type inheritance.
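For readers who have not met the trait pattern mentioned here, a minimal sketch follows. All names (`AgeTrait`, `Ages`, `Ageless`, `Human`, `Statue`, `describe_age`) are invented for the example; the pattern itself is simply dispatch on the value returned by a trait function, so unrelated types can share behaviour without a common supertype.

```julia
# Trait values: small singleton types used only for dispatch.
abstract type AgeTrait end
struct Ages <: AgeTrait end
struct Ageless <: AgeTrait end

# Two unrelated concrete types with no shared supertype.
struct Human
    age::Int
end
struct Statue end

# The trait function maps each type to a trait value.
AgeTrait(::Type{Human}) = Ages()
AgeTrait(::Type{Statue}) = Ageless()

# Public entry point re-dispatches on the trait value.
describe_age(x) = describe_age(AgeTrait(typeof(x)), x)
describe_age(::Ages, x) = "age $(x.age)"
describe_age(::Ageless, x) = "does not age"

println(describe_age(Human(30)))
println(describe_age(Statue()))
```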

Incidentally, while I am aware of the etymology, I am wondering how this comes across for newcomers to the community :wink:


I love it, I think he’s a patron saint of the Julia community, at least he is for my programming lately!
:pray::pray::pray:


You don’t have to re-define all the methods accepting a Person object if you write those methods using functions instead of field access. You just have to define some basic functions.

Changing your original example:

abstract type AbstractPerson end
# basic methods to define: name,age,set_age

#basic AbstractPerson
mutable struct Person <: AbstractPerson
    name::String
    age::Int 
end

# CONSTRUCTOR
function Person(name)
    return Person(name, 0)
end

#basic methods
name(a::Person) = a.name
age(a::Person) = a.age
set_age(a::Person,x::Integer) = (a.age = x; x)

# TYPE METHODS: always use `AbstractPerson` as input type...
import Base.display
function display(p::AbstractPerson)
    println("Person: ", name(p), " (age: ", age(p), ")")
end

function happybirthday(p::AbstractPerson)
    set_age(p, age(p) + 1)
    println(name(p), " is now ", age(p), " year(s) old")
end

function call(p::AbstractPerson)
    printstyled(uppercase(name(p)), "!"; color=:red)  # replaces print_with_color, removed in Julia 1.0
end

#---------------------------------------------------------------------
# DERIVED TYPE : Citizen

# Use abstract type for the interface name, by convention prepend
# `Abstract` to the type name.
abstract type AbstractCitizen <: AbstractPerson end
# here you should think of basic methods for an AbstractCitizen, such as nationality(c::AbstractCitizen).

# TYPE MEMBERS (composition of `Person` fields and new ones)
mutable struct Citizen <: AbstractCitizen
    person::Person
    nationality::String # new field (not present in Person)
end

#now would be a good time to use macros...
name(c::Citizen) = name(c.person)
age(c::Citizen) = age(c.person)
set_age(c::Citizen,x::Int) = set_age(c.person, x)

#basic abstractcitizen method
nationality(c::Citizen) = c.nationality

#Now everything defined for AbstractPerson should work for Citizen

#And you are not tied to field names anymore:

struct EternalBeing <: AbstractPerson end

name(e::EternalBeing) = "The One who Is"
age(e::EternalBeing) = typemax(Int)
set_age(e::EternalBeing,x) = nothing

const eternal = EternalBeing()

# All just work
display(eternal)
happybirthday(eternal)
call(eternal)

I think all that can be simplified a bit if every type that uses Person, and wishes to use its functions, simply had a person function.

Try the following:

abstract type AbstractPerson end
# basic methods to define: name,age,set_age

#basic AbstractPerson
mutable struct Person <: AbstractPerson
    name::String
    age::Int 
end

person(a::Person) = a

# CONSTRUCTOR
Person(name) = Person(name, 0)

#basic methods
name(a::AbstractPerson) = person(a).name
age(a::AbstractPerson)  = person(a).age
set_age(a::AbstractPerson,x::Integer) = (person(a).age = x; x)

# TYPE METHODS: always use `AbstractPerson` as input type...
Base.display(p::AbstractPerson) = println("Person: ", name(p), " (age: ", age(p), ")")

function happybirthday(p::AbstractPerson)
    set_age(p, age(p) + 1)
    println(name(p), " is now ", age(p), " year(s) old")
end

call(p::AbstractPerson) = (printstyled(uppercase(name(p)), "!"; color=:red); println())

#---------------------------------------------------------------------
# DERIVED TYPE : Citizen

# Use abstract type for the interface name, by convention prepend
# `Abstract` to the type name.
abstract type AbstractCitizen <: AbstractPerson end

# here you should think of basic methods for an AbstractCitizen,
# such as nationality(c::AbstractCitizen).

# TYPE MEMBERS (composition of `Person` fields and new ones)
mutable struct Citizen <: AbstractCitizen
    person::Person
    nationality::String # new field (not present in Person)
end

person(c::Citizen) = c.person

#basic abstractcitizen method
nationality(c::Citizen) = c.nationality

#Now everything defined for AbstractPerson should work for Citizen

#And you are not tied to field names anymore:

struct EternalBeing <: AbstractPerson end

name(e::EternalBeing) = "The One who Is"
age(e::EternalBeing) = typemax(Int)
set_age(e::EternalBeing, x::Integer) = nothing

const eternal = EternalBeing()

# All just work
display(eternal)
happybirthday(eternal)
call(eternal)

println()

zulima = Citizen(Person("Zulima Martín García", 44), "Spain")
display(zulima)
happybirthday(zulima)
call(zulima)

To be fair though, your example is covered by getproperty overloading.

:+1: or getproperty overloading where field access now becomes a function call when necessary.
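A minimal sketch of what getproperty overloading could look like for the Person/Citizen example, assuming Julia 0.7 or later. This is only one possible way to write it: unknown property names on a `Citizen` are looked up on the wrapped `Person`, so `c.name` works without repeating the field.

```julia
struct Person
    name::String
    age::Int
end

struct Citizen
    person::Person
    nationality::String
end

# Own fields are read directly; anything else is delegated to the wrapped Person.
function Base.getproperty(c::Citizen, s::Symbol)
    if s in fieldnames(Citizen)
        return getfield(c, s)
    else
        return getproperty(getfield(c, :person), s)
    end
end

# Keep tab completion and reflection consistent with the delegation above.
Base.propertynames(c::Citizen) = (fieldnames(Citizen)..., fieldnames(Person)...)

c = Citizen(Person("Ada", 36), "UK")
println(c.name, ", ", c.age, ", ", c.nationality)
```

Code that reads `p.name` on a `Person` then also works on a `Citizen`, although methods that are type-annotated with `::Person` still will not accept one.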

The key point here is the word just. For a simple example you’re definitely right. But for a real life application (such as wrapping a DataFrame object as we discussed above) this implies redefining ~200 methods. Of course a few of these (namely all those returning a DataFrame struct) have to be redefined anyway, but what remains is still quite a large number.

Of course this is feasible, either manually or through macros, but my first thought is that I should try to find an alternative method. This triggered the present discussion.

Yeah, clearly everything we say here applies only if a structure is involved.

Of course I don’t want to fight any train. I’m asking here because I recognize that many here are far more expert than I am. Still: no one told me yes, you have to blindly forward those ~200 methods, and forget about inheritance.

Which is OK for me, but I didn’t see this carved in stone, nor in the manual or in the discourse or anywhere else. Hence my doubt arose…

To answer @favba: in your proposal you are implementing the name, age and set_age for both Person and Citizen, which is exactly what I wanted to avoid (the famous ~200 methods…). Regardless of macro usage: the methods are always there: they do nothing, they flood the dispatch table and they have to be compiled.


Thanks @ScottPJones!!!
Your solution is definitely a very interesting one, and I think it solves the problem in the first discussion of this post. You’re great!! :+1::+1:

In brief, your solution uses composition of structures and avoids useless method redefinition.
Clearly, as noted by @ChrisRackauckas, there are quite a lot of getproperty calls involved. However (correct me if I’m wrong), aren’t there also quite a lot of getproperty calls involved in method forwarding?

One apparently officially questionable way is discussed in https://discourse.julialang.org/t/getproperty-decorations-inheritance-in-0-7/11237.

No, you don’t have to. If your interface on the abstract type is done via accessor functions (or lazy properties), then the interface to implement is tiny and limited to defining those accessor functions. See something like the AbstractArray interface which takes

https://docs.julialang.org/en/v0.6.2/manual/interfaces/#man-interface-array-1

and then works in literally any function which takes arrays. This is because if you develop code for the abstract type based on the interface (and it doesn’t have to be an abstract type, for example the Iteration Interface is for anything which acts like an iterable) then any implementation of that interface will do. If all of Julia’s arrays can work with 5 methods plus some extra traits, then that shows there’s quite a lot you can do.
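As a small illustration of how little an interface can require, here is a sketch of the iteration interface on a made-up type, `Countdown`. It assumes Julia 1.x, where the interface is the single function `iterate` (the 0.6 documentation linked above used `start`/`next`/`done` instead); `length` and `eltype` are optional extras that let `collect` preallocate.

```julia
# Hypothetical iterable that counts down from `from` to 1.
struct Countdown
    from::Int
end

# The whole required interface: return (item, next_state) or nothing when done.
Base.iterate(c::Countdown, state=c.from) = state < 1 ? nothing : (state, state - 1)

# Optional, but lets generic code know the size and element type up front.
Base.length(c::Countdown) = max(c.from, 0)
Base.eltype(::Type{Countdown}) = Int

# Generic functions written against the interface now work unchanged.
println(collect(Countdown(3)))
println(sum(Countdown(4)))
for i in Countdown(2)
    println(i)
end
```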

If your interface is 200+ methods, it doesn’t matter if it’s Julia or OOP, you’re doing it wrong!

DataFrames is a bad example because it never defined an interface for a DataFrame, which is why it’s so hard to extend. If you want a table interface, look at DataStreams.jl or IterableTables.jl. These have only a handful of functions and any source/sink will then work as a table. For example, DataFrames implements the source while StatPlots.jl implements the sink, so those two packages work together. But DifferentialEquations.jl solutions are also a sink, so you can convert its solutions to DataFrames or use it just like a table due to the 3 methods implemented by David. So, 200 methods >> the 3 that are used in the real-life, actually working example :smile:.

Here’s the full code that makes every differential equation solution (ODE/SDE/DAE/DDE/jump/etc.) have a table interface and directly convert to DataFrames.


Now the bad news: despite the very clever suggestion by @ScottPJones, this solution cannot be applied in general, since it works only if the object you wish to extend/customize already uses the person(a::Person) = a method internally, which doesn’t apply to DataFrame.


It seems that there are two tasks for DataFrames, then, if you were to create a new type for metadata (which you don’t have to do! The maintainers seem amenable to metadata living in DataFrames).

First, make sure DataFrames methods only take in AbstractDataFrame. Second, categorize which functions absolutely need to be defined for AbstractDataFrame to work. Maybe there already exists a list somewhere? Or we could work on getting that list together and shortened. A good start would be to create a new MyDataFrame <: AbstractDataFrame type that is a mirror image of a DataFrame type and seeing how that works.