Is there a Julian way to combine types generically or a design pattern that achieves the goal?


#1

The general idea is that I often have types that are composed of smaller “pieces” of other types. These are not subtypes in the sense of a type tree, but rather partial types: records holding only a fraction of the next type’s information.

For example suppose I were trying to “build” an EmployeeRecord. I might proceed as follows. Note, this is very contrived, but in the actual use case (which is too complicated to transform into a MWE) there’s a better reason for building up the big combined types from the smaller partial types.

immutable Person
    name::String
    social::Int64
end

immutable Address
    number::Int64
    street::String
    city::String
    state::String
end

immutable PersonAndAddress
    person::Person
    address::Address
end

immutable EmployeeRole
    role::String
    seniority_level::Int64
end

immutable EmployeeRecord
    person_and_address::PersonAndAddress
    role::EmployeeRole
    years_at_company::Int64
    salary::Float64
end

Now suppose I have some function that operates on the name field of Person. Again, this is contrived, but bear with me:

function greet(x::Person)
    println("Hi, $(x.name)")
end

But I have other functions that also need to make use of the greet capability, yet they might not be operating on Person, but on one of the more built-up types. So say there’s some function like

function writeletter(x::PersonAndAddress)
    greet(x)
    println("How are you? Sincerely, Julia")
    mail_stdout_to(x.address)
end

Then I need to provide some wrapper function, either

function greet(x::PersonAndAddress)
    println("Hi, $(x.person.name)")
end

or

function greet(x::PersonAndAddress)
    greet(x.person)
end

and now imagine this cascaded up a few layers, so e.g. for EmployeeRecord:

function greet(x::EmployeeRecord)
    greet(x.person_and_address) #assuming the greet above was provided
end

and so on for even bigger smashings-together of types.

Hopefully the problem is clear by now: as the combinations of types get bigger and bigger, the amount of overhead to keep the methods all in order gets insane.

But there’s a good reason for having the distinct partial-types, namely that I can write generic methods that only need the information contained in those partial-types, and also in the case of my example the lower-down partial-types tend to have information that helps build the higher-up partial types.

In Java or another less elegant language, I might model this situation with a chain of abstract types that carry fields, with methods coded against the abstract types and a corresponding concrete type for each level of abstraction. That way even the most built-up type would have the fields of the lowest-down partial type accessible at the same “level,” i.e. without having to dig into the record-of-record-of-records to get at the underlying data; everything would always be one . down.

It seems I might use Traits/interfaces where each type above “has-a” name, though I’m not sure how I can elegantly or concisely implement get_name for each consecutive wrapper type. (The abstract-types-with-fields solution seems like it would handle that though.)

Is there a way to do this in Julia? Possibly with some macros? All I’ve come up with so far is a sketch of a macro that would examine the fieldnames of two types, construct code defining a new type that contains all the field names (I’m not worried about clashes at the moment; I can code around that), and then instantiate an object of that type. But it seems dangerous to me to build a codebase atop macros that define types on the fly that I haven’t explicitly specified. That is, since I’m coding methods targeted at many levels of built-up-ness, I’d much prefer to write the various levels out explicitly (in the example above, Person, PersonAndAddress, EmployeeRecord, etc.), but still be able to count on the idea that when I write x.y in a method (or even a more generic get_from(x,y) or something), and a future x wraps the x the method actually expected in some number of layers of nesting, it will “just work.”
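To make the get_from(x,y) idea concrete, here is a rough sketch of one way it could behave: a depth-first search through nested fields at runtime. The helper name get_from and the trimmed-down Address are illustrative assumptions, the code uses current `struct` syntax where the rest of this post uses `immutable`, and the search happens at runtime, so it is not the compile-time solution being asked about.

```julia
# Current Julia spells `immutable` as `struct`.
struct Person
    name::String
    social::Int
end

struct Address          # trimmed down for the sketch
    city::String
end

struct PersonAndAddress
    person::Person
    address::Address
end

# Hypothetical helper: depth-first search for a field called `name`.
# Returns `nothing` when no nested record has such a field, so it cannot
# distinguish "missing" from a field that actually holds `nothing`.
function get_from(x, name::Symbol)
    T = typeof(x)
    hasfield(T, name) && return getfield(x, name)
    for f in fieldnames(T)
        found = get_from(getfield(x, f), name)
        found === nothing || return found
    end
    return nothing
end

pa = PersonAndAddress(Person("Ada", 123), Address("Boston"))

# A method written against "anything that has a name somewhere inside".
greet(x) = println("Hi, $(get_from(x, :name))")
greet(pa)
```

When the nesting graph has more than one root, this returns the first match in field order, so clashing field names would need a rule of their own.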

Note, in the example above, the graph of partially-built-up types has more than one root following “de-nesting.” For example you could de-nest EmployeeRecord into either PersonAndAddress or EmployeeRole and then those into separate composite types or primitives. But I would also be interested in an answer that only worked on trees with a single de-nesting path, e.g. where for every type T_N, the pattern looked something like

immutable T_NPlusOne
    nested::T_N
    additional_info::Float64OrSomeOtherPrimitive
end
immutable T_NPlusTwo
    nested::T_NPlusOne
    more_additional_info::Float64OrSomeOtherPrimitive
end

, on the off-chance the original question is too difficult but that restriction is an answerable sub-goal of the original question.

Is there a trick to this? Or a design pattern that I can follow that achieves effectively the same thing? Thanks!

PS, I tried to ask @jeff.bezanson and @StefanKarpinski about this at Julia Day NY with admittedly limited success explaining what I meant. I hope this is a little more clear.

PPS This question is loosely related to ideas in the NamedTuples.jl package, to Traits in general including this and this, and to this question about coupling types (although my meaning of “combine” is not that similar to @kellertuer’s usage of “couple”).


#2

Since I don’t think it was explicitly linked, I’ll point out @MikeInnes’s very relevant post in a recent and related discussion: Add type mixins to overcome subtype limitations


#3

Thanks; I missed that and indeed it and its related links are useful food for thought. I don’t see a conclusion though. E.g. something of the form

  1. This can be done, just do: _____
  2. This can’t strictly speaking be done, but the accepted best practice for doing this right now is this somewhat clunky but functional workaround: _____
  3. Don’t do this, and here’s why: ____

But since enough people are thinking about this, presumably there are enough hacked up examples of solutions out there that someone could at least suggest a “yes, had same problem, here’s how I handled it” example?

Thanks!


#4

Add to my initial post: perhaps following the “only one path of nesting” restriction and using @StefanKarpinski’s suggestion here would imply a partial solution whereby each next layer of packing is done by defining an abstract type and its interface, where the abstract types nest in something like AT1 <: AT2 <: AT3 <: ... and each concrete type breaks off as such:

AT1 <: AT2 <: AT3 
 |      |      |
CT3    CT2    CT1

(vertical | also indicates <: but where the LHS is a concrete type and RHS is abstract).

E.g. CT1 has the fewest fields and thus subtypes the broadest abstract type. CT2 is-a AT2 and therefore is-a AT3, and therefore has all the fields that CT1 has plus whatever else it has, except that the has-a relationships are encoded as functions on the ATs.

For example

@auto_generate abstract AT3
    field1::Float64
    field2::Int64
end
#would turn into something like

abstract AT3
function get_field1(x::AT3)
    x.field1
end #or similar
function get_field2(x::AT3)
    x.field2
end #or similar

#So then given
immutable CT1 <: AT3
    field1::Float64
    field2::Int64
end
#We'd already have effectively function get_field1(x::CT1) = x.field1

#Then we'd also need something like

@auto_generate_nested abstract AT2 <: AT3
    at3::AT3
end
#More magic here...somehow this does something like
function get_field1(x::AT2)
    get_field1(x.[get first field subtyping AT3])
end

#So then 
immutable CT2 <: AT2
    ct1::CT1
    field3::String
end
#Would already have function get_field1(x::CT2) = get_field1(x.ct1)
#and so on for CT3, AT1, etc.

I realize this is quite clunky (and skips issues of well-defined-ness in implementation and also of ensuring consistency among field types, though I think these could both be handled by restricting the functionality sufficiently); it’s certainly not what I think should prevail. Just trying to push the discussion forward.
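A hand-written version of the same idea, without any macro magic, might look like the sketch below. It simplifies the AT1 <: AT2 <: AT3 chain down to a single abstract type, and the names HasPerson, person and greeting are made up for illustration. Each wrapper level supplies one unwrapping method, and everything else is written once against the abstract type. It uses current `struct` / `abstract type` syntax.

```julia
abstract type HasPerson end            # hypothetical name

struct Person <: HasPerson
    name::String
end

struct PersonAndAddress <: HasPerson
    person::Person
    city::String
end

struct EmployeeRecord <: HasPerson
    person_and_address::PersonAndAddress
    salary::Float64
end

# One unwrapping method per level of nesting...
person(x::Person) = x
person(x::PersonAndAddress) = person(x.person)
person(x::EmployeeRecord) = person(x.person_and_address)

# ...and every other method is written once, against the abstract type.
name(x::HasPerson) = person(x).name
greeting(x::HasPerson) = "Hi, $(name(x))"

rec = EmployeeRecord(PersonAndAddress(Person("Ada"), "Boston"), 1.0)
println(greeting(rec))
```

The boilerplate is then one line per wrapper type rather than one line per wrapper type per function.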


#5

Well, I think the succinct conclusion is “don’t do that, if you can avoid it.” Beyond that, I’m not really sure which of these questions you are asking:

  • how to repeat common, shared fields
  • how to generate accessors / getters automatically

But right now I think the answer to both is to use macros. There are a number of instructive examples in the issue you linked.

(And perhaps the best answer I can give is that it sounds like you might need a database…)


#6

Cannot avoid it, and a DB is definitely not the answer; these need to be performant in-memory types with complicated shared fields; the record example was just for ease of reading but these are not really records.

I’m asking both questions, and I’m additionally asking “how to do each of these in a consistent, stable, easy-to-reason-about way,” which I presume means “and without resorting to a bunch of messy macros.” I accept that there may not be a solution that satisfies all those constraints, though I think it’d help to hear more suggestions regarding how people get close.

The only example I could see in the GitHub issue I linked is this two-year-old suggestion that seems to do only a part of this and only in some ways…was thinking there might be other things worth studying as examples.

Thanks!


#7

Unless you programmatically create types in a scheme like this, there isn’t a mechanism for this yet that can give you compile time dispatch. You could implement runtime dispatch using a dictionary though.
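As a minimal sketch of the dictionary route (the Record alias and the field names are assumptions for illustration), combining records becomes a merge, and any function only looks up the keys it needs:

```julia
# Hypothetical alias: a record is just a Symbol-keyed dictionary.
const Record = Dict{Symbol,Any}

person  = Record(:name => "Ada", :social => 123)
address = Record(:street => "Main St", :city => "Boston")

# Combining partial records is a merge; the result is flat, so every
# field stays one lookup away no matter how many pieces were combined.
person_and_address = merge(person, address)
employee = merge(person_and_address, Record(:role => "engineer", :salary => 1.0))

# Works on any record that carries a :name key.
greeting(x::Record) = "Hi, $(x[:name])"

println(greeting(person))
println(greeting(employee))
```

The trade-off is that field access is resolved at runtime and the values are stored as Any, so this gives up the type-level guarantees and the performance of concrete field types.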


#8

Incidentally I do that for a container for functions that needs the same type of generality (i.e. other methods should be able to look into it for the functions they need, without worrying about whether there are or aren’t additional functions also defined for other concepts), and the next question was going to be extending this solution to sets of functions. Well, it looks like I need to spend some time thinking about a sensible way to implement the programmatic type creation.


#9

Perhaps you saw this in one of the linked threads and didn’t find it a good solution for some other reason, but FWIW in this case I would just do:

type Person # ...

name(p::Person) = p.name
setname!(p::Person, name) = (p.name = name)
greet(p::Person) = println("Hello, $(name(p))")

type PersonAndAddress
  person::Person
  # ...

@forward PersonAndAddress.person name, setname!, greet

type Employee # ...

@forward Employee.person_and_address name, setname!, greet
@forward Employee.seniority_level seniority, setseniority! # etc

@forward isn’t doing anything particularly crazy or magical here, just defining the set of methods like greet(p::PersonAndAddress, args...) = greet(p.person, args...). I’ve been pretty happy with this solution and haven’t found a great need for anything fancier, but if you were really working with a lot of functions you could easily generalise this to groups, e.g. (heavily cut down for brevity)

personprotocol = [name, setname!, greet]

@forward Employee.person_and_address personprotocol

employeeprotocol = [personprotocol, fire!]

@forward SeniorEmployee.employee employeeprotocol

I think that would be enough to make your example – and more extreme versions of it – manageable, but let me know if that’s not the case.
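For what it’s worth, the group idea can be approximated without any macro at all. The sketch below is an assumption about how one might do it, not something Lazy.jl provides: the protocol is kept as a tuple of function names (symbols) and the forwarding methods are generated with @eval in a loop. It uses current `struct` syntax, and greeting returns a string instead of printing.

```julia
struct Person
    name::String
end

struct PersonAndAddress
    person::Person
    city::String
end

name(p::Person) = p.name
greeting(p::Person) = "Hello, $(name(p))"

# Hypothetical "protocol": just the names of the functions to forward.
const personprotocol = (:name, :greeting)

# Define f(x::PersonAndAddress, args...) = f(x.person, args...) for each f.
for f in personprotocol
    @eval $f(x::PersonAndAddress, args...) = $f(x.person, args...)
end

pa = PersonAndAddress(Person("Ada"), "Boston")
println(greeting(pa))
```

A larger protocol can be built by splatting, e.g. (personprotocol..., :fire!), which keeps the group flat.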


#10

This is another interesting suggestion; thanks. Let me spend some time playing with this and some of the other suggestions and I’ll report back. These suggestions are helping to see some points on the efficient frontier in the less-boilerplate / less-programmatically-generated-code tradeoff space.


#11

So I have a very similar issue as the OP and I think the @forward macro would help me - but where is that defined? Thanks


#12

Good question – it’s exported by Lazy.jl. The definition is also pretty simple so you could just copy-paste it wherever and avoid Lazy.jl itself.
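To give a feel for how little is involved, here is a stripped-down sketch in current Julia syntax. It is not the Lazy.jl source: the macro name @forward_sketch is made up, and it skips keyword arguments and the other niceties of the real thing.

```julia
# Hypothetical stand-in for @forward; usage: @forward_sketch T.field f, g, h
macro forward_sketch(target, fs)
    (target isa Expr && target.head === :.) ||
        error("usage: @forward_sketch T.field f, g")
    T     = esc(target.args[1])
    field = target.args[2]                  # a QuoteNode holding the field name
    names = (fs isa Expr && fs.head === :tuple) ? fs.args : Any[fs]
    # One forwarding method per listed function.
    defs  = [:($(esc(f))(x::$T, args...) = $(esc(f))(getfield(x, $field), args...))
             for f in names]
    return Expr(:block, defs..., nothing)
end

struct Person
    name::String
end

struct PersonAndAddress
    person::Person
    city::String
end

name(p::Person) = p.name
greeting(p::Person) = "Hello, $(name(p))"

@forward_sketch PersonAndAddress.person name, greeting

pa = PersonAndAddress(Person("Ada"), "Boston")
println(greeting(pa))
```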

I hadn’t particularly thought about using @forward with “protocols” before now but I wonder if that’s an interesting path towards more advanced features. Defining a protocol as a group of related methods, explicitly forwarding it on composite types and getting a small sprinkle of Holy Trait goodness to go along with it may be a simple way to get the more advanced OO-like functionality we want. Not sure yet though.


#13

Here’s a stab at combining some of the suggestions here into one implementation of a limited way of programmatically combining types.

Specifically, I’m going to provide an implementation of the following notion of combination or summation of types:

For any types P and Q with fields P_1::P_1T,...,P_n::P_nT and Q_1::Q_1T,...,Q_m::Q_mT, the macro @generate_sum_type P Q PQ will define a type PQ with fields PQ_1::PQ_1T,...,PQ_{f(n,m)}::PQ_{f(n,m)}T, with the following properties:

  1. If X_i is unique in {P_i} \union {Q_i}, then there is a unique k such that PQ_k = X_i and PQ_kT = X_iT.
  2. If there exist i,j such that P_i = Q_j, and if typeintersect(P_iT,Q_jT)!=Union{}, then there is a unique k such that PQ_k = P_i = Q_j and PQ_kT = typeintersect(P_iT,Q_jT).
  3. PQ can only be instantiated through a keyword constructor with argument names union({P_i},{Q_j}), i.e. one keyword per field of PQ.

If typeintersect(P_iT,Q_jT)==Union{} for any i,j with P_i=Q_j, then no summation of P and Q exists; they are incompatible.
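The field-merging rule on its own, separated from the macro that follows, can be sketched in a few lines. The helper name merge_fields and the toy types P, Q and R are illustrative assumptions, and the sketch uses current Julia (`struct`, fieldtypes) rather than the 0.5-era syntax of the implementation below.

```julia
# Toy types: P and Q share field1 with compatible types,
# Q and R share field3 with incompatible ones.
struct P
    field1::Real
end

struct Q
    field1::Float64
    field3::Float64
end

struct R
    field3::Int64
end

# Hypothetical helper: compute the name => type table of the sum type,
# narrowing shared fields with typeintersect and rejecting empty intersections.
function merge_fields(a::Type, b::Type)
    merged = Dict{Symbol,Type}()
    for T in (a, b), (n, ft) in zip(fieldnames(T), fieldtypes(T))
        if haskey(merged, n)
            common = typeintersect(merged[n], ft)
            common === Union{} && error("field $n has an empty type intersection")
            merged[n] = common
        else
            merged[n] = ft
        end
    end
    return merged
end

pq = merge_fields(P, Q)     # field1 narrows from Real to Float64
```

Because the table is keyed by field name, merge_fields(P, Q) and merge_fields(Q, P) give the same result, which is the commutativity property the keyword-only constructor is meant to preserve.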

I decided on the keyword restriction because I want every sum-type to be invariant under the ordering of the fields of the underlying types and the order of the underlying types themselves. In other words, if P,Q are such that PQ=P+Q exists, then this definition ensures that P+Q=Q+P.

(Another alternative would be to alphabetize the fields, or impose any other restriction to a unique ordering of fields.)

This solution borrows heavily from @mauro3’s Parameters.jl for the keyword-constructor creation.

import Base: @__doc__
import DataStructures: OrderedDict
using Iterators


const err1str = "Field \'"
const err2str = "\' has no default, supply it with keyword."

function generate_sum_type(type1,type2,new_type_name)
    names1      = fieldnames(eval(type1))
    names2      = fieldnames(eval(type2))
    types1      = eval(type1).types
    types2      = eval(type2).types
    names_types = Dict{Symbol,Type}()
    for (name,typ) in zip(chain(names1,names2),chain(types1,types2))
        if !haskey(names_types,name)
            names_types[name] = typ
        else
            type_intersection = typeintersect(names_types[name],typ)
            if type_intersection != Union{}
                names_types[name] = type_intersection
            else
                error("Types share a field name with empty type intersection.")
            end
        end
    end
#
    #Build the field names and types from the two types, and combine
    typedef   = Expr(:type,Any[false,Symbol(new_type_name)]...,Any)
    fielddefs = Expr(:block,Any[],Any)
    kws       = OrderedDict{Any, Any}()
    for (fieldname,fieldtype) in collect(names_types)
        push!(fielddefs.args, :($fieldname::$fieldtype))
        kws[Symbol(fieldname)] = :(error($err1str * $fieldname * $err2str))
    end
    typ = Expr(:type, deepcopy(typedef.args[1:2])..., deepcopy(fielddefs),Any)
#
    #Build and insert inner keyword-constructors
    args   = Any[]
    kwargs = Expr(:parameters)
    for (k,w) in kws
        push!(args, k)
        push!(kwargs.args, Expr(:kw,k,w))
    end
    #Only include the kw constructor
    tn = Symbol(new_type_name)
    push!(typ.args[3].args, :($tn($kwargs) = new($(args...)) ))
#
    quote
        Base.@__doc__ $typ
        $tn
    end
end
macro generate_sum_type(type1,type2,new_type_name)
    return esc(generate_sum_type(type1,type2,new_type_name))
end

immutable Type1
    field1::Real
end
immutable Type2
    field2::Int64
end
@generate_sum_type Type1 Type2 Type1And2

x = Type1And2(;field1=1,field2=2)
x.field1
x.field2

#Should error
x = Type1And2(1,2)

immutable Type3
    field1::Float64
    field3::Float64
end
@generate_sum_type Type1And2 Type3 Type123

x = Type123(;field1=1.0,field2=2,field3=3.0)
x.field1
x.field2
x.field3

immutable Type4
    field2::Float64
    field4::Float64
end

#Should error
@generate_sum_type Type123 Type4 Type1234

#Cool.

julia> immutable Type1
           field1::Real
       end

julia> immutable Type2
           field2::Int64
       end

julia> @generate_sum_type Type1 Type2 Type1And2
Type1And2

julia> x = Type1And2(;field1=1,field2=2)
Type1And2(2,1)

julia> x.field1
1

julia> x.field2
2

julia> #Should error

julia> x = Type1And2(1,2)
ERROR: MethodError: no method matching Type1And2(::Int64, ::Int64)
Closest candidates are:
  Type1And2{T}(::Any) at sysimg.jl:53

julia> immutable Type3
           field1::Float64
           field3::Float64
       end

julia> @generate_sum_type Type1And2 Type3 Type123
Type123

julia> x = Type123(;field1=1.0,field2=2,field3=3.0)
Type123(2,1.0,3.0)

julia> x.field1
1.0

julia> x.field2
2

julia> x.field3
3.0

julia> immutable Type4
           field2::Float64
           field4::Float64
       end

julia> #Should error

julia> @generate_sum_type Type123 Type4 Type1234
ERROR: Types share a field name with empty type intersection.

#14

you could also check https://github.com/tbreloff/ConcreteAbstractions.jl


#15

OK yes, based on the readme, this is exactly an example of something that might be relevant. Thanks. And thanks also to @tbreloff for yet another interesting contribution to the Julia ecosystem.


#16

@MikeInnes Quick question - does the forward macro allow dispatching on abstract types (say, where all concrete types inheriting from the abstract type have the field in question)?


#17

Update: yes it does :) Sorry for the premature ping - I had like 20 functions that needed to be forwarded on 6 different types so I thought it would take too long to just do it and try. But it didn’t :)