Two structs with same fields - how?

The user interface for my package is as follows. There is a mutable struct called UserProblemDescription. It has ~30 fields, all of which are defined as unions with Missing:

Base.@kwdef mutable struct UserProblemDescription
    dimension1::Union{Int,Missing} = missing
    <etc>
end

The user creates one of these with an empty constructor, and then supplies the data one field at a time:

udp = UserProblemDescription()
udp.dimension1 = 9
<etc>

When the user is done, they invoke the main entry-point of my package. I would like the main entry-point to transfer all the data to a struct with the same field names but with fully concrete types instead of Union{...Missing} for clarity and type-stability:

struct InternalProblemDescription
    dimension1::Int
    <etc>
end

It should throw an error if any field is missing.

This seemingly requires a lot of code. The obvious line-by-line code is fragile because I have to repeat the entire list of fields correctly in three places in the code. Is there a package to carry out this operation or a simpler way to obtain the desired effect?

This is probably of interest to you: ANN: LazilyInitializedFields.jl - Handling lazily initialized fields in a consistent way

How about having the constructor take (nonoptional) keyword arguments? Or a dictionary, or a named tuple? That would make UserProblemDescription unnecessary.

Sidenote, a struct having that many fields seems fishy.

Could you unpack it into a NamedTuple for internal use? Like,

function to_named_tuple(x)
    return (;
        (
            n => getproperty(x, n)
            for n in propertynames(x)
        )...
    )
end

Constructing this will be type unstable, but once you pass it into the main body of your inner function, it will be concretely typed everywhere.
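A minimal sketch of how that could look in use, under some assumptions: DemoUserProblem is a hypothetical two-field stand-in for the real struct, and checked_named_tuple is a helper made up here to add the "throw if anything is missing" requirement from the question.

```julia
# Hypothetical stand-in for the ~30-field user struct.
Base.@kwdef mutable struct DemoUserProblem
    dimension1::Union{Int,Missing} = missing
    bigness::Union{Float64,Missing} = missing
end

# Same helper as above, repeated so the snippet is self-contained.
to_named_tuple(x) = (; (n => getproperty(x, n) for n in propertynames(x))...)

# Hypothetical helper: refuse to continue if any field is still `missing`.
function checked_named_tuple(x)
    nt = to_named_tuple(x)
    for (name, value) in pairs(nt)
        ismissing(value) && error("field `$name` was never set")
    end
    return nt
end

u = DemoUserProblem()
u.dimension1 = 9
u.bigness = 10.0

# The element types of `nt` are the runtime types of the values (Int and
# Float64 here), so code that receives `nt` as an argument is specialized on them.
nt = checked_named_tuple(u)
```

Passing nt to an inner function acts as a function barrier: the instability is confined to the one call that builds the tuple.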

Consolidation is one of the main uses for metaprogramming. So here’s a related shot at the original query:

maybe_wrap(T, cond) = cond ? :(Union{$T,Missing}) : T
for (sn,c,) in (
    (:UserThing, true,),
    (:InternalThing, false,)
)
    e = Expr(:struct, c, sn,
            Expr(:block,
                Expr(Symbol("::"), :i, maybe_wrap(:Int, c)),
                Expr(Symbol("::"), :a, maybe_wrap(:Float64, c)),
            )
    )
    eval(e)
end

# Named `to_internal` rather than `convert` to avoid clashing with `Base.convert`.
function to_internal(s::UserThing)
    InternalThing((getproperty(s,x) for x in fieldnames(UserThing))...)
end

The Boolean flag c is for mutability.
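With a pair of structs like this, the conversion already fails on a missing field, because the default constructor cannot convert missing to Int. A slightly friendlier variant could report every unset field by name. The sketch below assumes hand-written stand-ins (DemoUser, DemoInternal) instead of the generated structs, and to_internal_checked is a made-up name:

```julia
# Hand-written stand-ins for the generated pair (hypothetical names).
mutable struct DemoUser
    i::Union{Int,Missing}
    a::Union{Float64,Missing}
end

struct DemoInternal
    i::Int
    a::Float64
end

# Collect the names of all unset fields first, so the error lists every one of them.
function to_internal_checked(u::DemoUser)
    unset = [n for n in fieldnames(DemoUser) if ismissing(getfield(u, n))]
    isempty(unset) || error("fields not set: $(join(unset, ", "))")
    return DemoInternal((getfield(u, n) for n in fieldnames(DemoInternal))...)
end

ok = to_internal_checked(DemoUser(1, 2.0))
```

This relies on both structs declaring the same field names, which the generation loop guarantees.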

But if we step back, ISTM you basically have a table of name-type pairs. Perhaps the user-facing object need not live in the type system. If you put the table in a Dict instead, you could still use metaprogramming to create the internal struct but also generate more user-friendly validation checks and even allow for YAML or JSON input.
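As a rough sketch of that idea, assuming a two-entry table and made-up names (FIELD_TABLE, TableInternalThing, build_internal), the table can drive both the struct definition and the validation:

```julia
# Hypothetical field table: the single place that lists every name and type.
const FIELD_TABLE = [:i => Int, :a => Float64]

# Generate the immutable internal struct from the table.
fields = [Expr(Symbol("::"), name, T) for (name, T) in FIELD_TABLE]
eval(Expr(:struct, false, :TableInternalThing, Expr(:block, fields...)))

# Validate a user-supplied Dict against the table, then build the struct.
function build_internal(d::AbstractDict{Symbol})
    for (name, T) in FIELD_TABLE
        haskey(d, name) || error("missing field `$name`")
        d[name] isa T || error("field `$name` must be a $T, got $(typeof(d[name]))")
    end
    return TableInternalThing((d[name] for (name, _) in FIELD_TABLE)...)
end

d = Dict{Symbol,Any}(:i => 3, :a => 2.5)
thing = build_internal(d)
```

Because the checks loop over the table, adding a field means editing one line rather than three places.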

I think you should skip the UserProblemDescription as a struct entirely. It’s not doing you any favors. If the user doesn’t supply every field with a compatible type, you will just throw an error anyway. I’d accomplish what you ask with a Dict like this:

Base.@kwdef struct InternalProblemDescription
	dimension1::Int
	dimension2::Int
	dimension3::Int
	dimension4::Int
	bigness::Float64
	coolness::Float64
end

InternalProblemDescription(d::AbstractDict) = InternalProblemDescription(; d...) # dict->kw constructor

upd = Dict{Symbol, Any}() # can be more specific than `Any` (ie, `Union{Int, Float64}`) if relevant
upd[:dimension1] = 9
upd[:dimension2] = 3
upd[:dimension3] = 4
upd[:dimension4] = 2
upd[:bigness] = 10.0
upd[:coolness] = 1337.0

InternalProblemDescription(upd)
# InternalProblemDescription(9, 3, 4, 2, 10.0, 1337.0)

InternalProblemDescription(Dict()) # call with one or more fields missing
# ERROR: UndefKeywordError: keyword argument `dimension1` not assigned

But personally, I find “supplying the data one field at a time” to be very tedious. I’d probably just use the constructor directly. I’ll also remark that a struct with 30 fields could (should?) probably be organized more carefully. You should consider consolidating related groups of fields into sub-structs that are more understandable. For example, it sounds like you could use dimensions::NTuple{N, Int} rather than dimension1::Int, ..., dimensionN::Int. Maybe some other sets of fields should get their own struct definition entirely.
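To illustrate the grouping idea, here is a sketch under assumptions: the split into Scores and GroupedProblemDescription is invented for this example, and the right grouping depends on what the fields actually mean.

```julia
# Hypothetical sub-struct collecting two related fields.
Base.@kwdef struct Scores
    bigness::Float64
    coolness::Float64
end

# One tuple-valued field replaces dimension1, ..., dimensionN.
Base.@kwdef struct GroupedProblemDescription{N}
    dimensions::NTuple{N,Int}
    scores::Scores
end

gpd = GroupedProblemDescription(
    dimensions = (9, 3, 4, 2),
    scores = Scores(bigness = 10.0, coolness = 1337.0),
)
```

The number of dimensions becomes part of the type (N is 4 here), so the struct stays concretely typed.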

Thanks for all the proposed solutions! I adopted the solution posted by mikmoore. I did not previously know that a Base.@kwdef structure could accept a splatted dictionary as a constructor argument; this is a useful technique. I tried LazilyInitializedFields.jl, but it seemed to have the drawback that the user would have to start the construction with the statement

upd = ProblemDescription(uninit, ... , uninit)

with the correct number of uninits, which did not seem as user-friendly as mikmoore’s solution.

As for some of the other comments: Making an interface that can also accept JSON seems out of reach right now because some of the fields in ProblemDescription are Julia functions, and I don’t know how to encode these as JSON.

As for the “tedium” of supplying fields one at a time, I agree that a single constructor call has some advantages. But separate statements for each field are helpful to the user who is juggling several problem descriptions concurrently and would like to cut and paste sections of one into the other.

For completeness, I’ll add that the splatting-dicts thing is not unique to Base.@kwdef. Any keywords to any function can be set by splatting an AbstractDict or NamedTuple (and maybe a couple other types beyond those?) after the ; (which is mandatory for this, if I recall correctly, or at least if no other keywords are provided before it).
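A small sketch with an ordinary function, where describe is a made-up name with two required keyword arguments:

```julia
# Hypothetical function with required keyword arguments (no defaults).
describe(; dimension1, bigness) = "dimension1=$dimension1, bigness=$bigness"

# Splat a Dict with Symbol keys after the semicolon.
from_dict = describe(; Dict(:dimension1 => 9, :bigness => 10.0)...)

# A NamedTuple works the same way.
from_nt = describe(; (dimension1 = 9, bigness = 10.0)...)
```

Leaving out a required key raises an UndefKeywordError, the same error as with the Base.@kwdef constructor above.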