Implementing the builder pattern for types with UnionAll fields, idiomatic or not

My issue is essentially this: I have a type I would like to instantiate at some point with a validity check in an inner constructor, but before that I need to collect the data needed to actually build the type. I figured using the builder pattern commonly used by Rust packages might do the trick, even if it wasn’t the most efficient method ever.

The (rather massive) type that I am trying to build is defined as follows:

"""
The result of building an exercise.
"""
struct Tehtävä{
    Nimi         <: AbstractString,
    Tehtävänanto <: AbstractString,
    Ratkaisu     <: Union{Missing, <:AbstractString},
    Vastaus      <: Union{Missing, <:AbstractString},
    Avainsanat   <: Vector{<:AbstractString},
    Assignment   <: AbstractString,
    Solution     <: Union{Missing, <:AbstractString},
    Answer       <: Union{Missing, <:AbstractString},
    Keywords     <: Vector{<:AbstractString},
} <: AbstraktiTehtävä

    nimi            ::Nimi
    tehtävänanto    ::Tehtävänanto
    ratkaisu        ::Ratkaisu
    vastaus         ::Vastaus
    avainsanat      ::Avainsanat
    assignment      ::Assignment
    solution        ::Solution
    answer          ::Answer
    keywords        ::Keywords

    function Tehtävä{
        Nimi        ,
        Tehtävänanto,
        Ratkaisu    ,
        Vastaus     ,
        Avainsanat  ,
        Assignment  ,
        Solution    ,
        Answer      ,
        Keywords    ,
    }(
        nimi         ::Nimi        ,
        tehtävänanto ::Tehtävänanto,
        ratkaisu     ::Ratkaisu    ,
        vastaus      ::Vastaus     ,
        avainsanat   ::Avainsanat  ,
        assignment   ::Assignment  ,
        solution     ::Solution    ,
        answer       ::Answer      ,
        keywords     ::Keywords    ,
    ) where {
        Nimi         <: AbstractString,
        Tehtävänanto <: AbstractString,
        Ratkaisu     <: Union{Missing, <:AbstractString},
        Vastaus      <: Union{Missing, <:AbstractString},
        Avainsanat   <: Vector{<:AbstractString},
        Assignment   <: AbstractString,
        Solution     <: Union{Missing, <:AbstractString},
        Answer       <: Union{Missing, <:AbstractString},
        Keywords     <: Vector{<:AbstractString},
    }
        if isempty(nimi)
            # "The exercise is missing a name…"
            error("Tehtävältä puuttuu nimi…")
        end
        if isempty(tehtävänanto) && isempty(assignment)
            # "The assignment of exercise $nimi is missing…"
            error("Tehtävän $nimi tehtävänanto puuttuu…")
        end
        if isempty(avainsanat) && isempty(keywords)
            # "The keywords of exercise $nimi are missing…"
            error("Tehtävän $nimi avainsanat puuttuvat…")
        else
            # Both messages below: "One of the keywords of exercise $nimi was empty…"
            if true ∈ isempty.((sana for sana ∈ avainsanat))
                error("Jokin tehtävän $nimi avainsanoista oli tyhjä…")
            end
            if true ∈ isempty.((sana for sana ∈ keywords))
                error("Jokin tehtävän $nimi avainsanoista oli tyhjä…")
            end
        end
        new(
            nimi         ,
            tehtävänanto ,
            ratkaisu     ,
            vastaus      ,
            avainsanat   ,
            assignment   ,
            solution     ,
            answer       ,
            keywords     ,
        )
    end

    """
    Generates this type from a given type builder.
    """
    function Tehtävä(a::TehtävänArgumentit)
        Tehtävä(
            a.nimi,
            a.tehtävänanto, a.ratkaisu, a.vastaus, a.avainsanat,
            a.assignment, a.solution, a.answer, a.keywords
        )
    end
end

The builder is defined very similarly, except it does not take other builders as arguments:

"""
A builder used to generate the fields of an exercise during parsing.
"""
struct TehtävänArgumentit{
    Nimi         <: AbstractString,
    Tehtävänanto <: AbstractString,
    Ratkaisu     <: Union{Missing, <:AbstractString},
    Vastaus      <: Union{Missing, <:AbstractString},
    Avainsanat   <: Vector{<:AbstractString},
    Assignment   <: AbstractString,
    Solution     <: Union{Missing, <:AbstractString},
    Answer       <: Union{Missing, <:AbstractString},
    Keywords     <: Vector{<:AbstractString},
}
    nimi         ::Nimi
    tehtävänanto ::Tehtävänanto
    ratkaisu     ::Ratkaisu
    vastaus      ::Vastaus
    avainsanat   ::Avainsanat
    assignment   ::Assignment
    solution     ::Solution
    answer       ::Answer
    keywords     ::Keywords

    function TehtävänArgumentit{
        Nimi         ,
        Tehtävänanto ,
        Ratkaisu     ,
        Vastaus      ,
        Avainsanat   ,
        Assignment   ,
        Solution     ,
        Answer       ,
        Keywords     ,
    }(
        nimi         ::Nimi         ,
        tehtävänanto ::Tehtävänanto ,
        ratkaisu     ::Ratkaisu     ,
        vastaus      ::Vastaus      ,
        avainsanat   ::Avainsanat   ,
        assignment   ::Assignment   ,
        solution     ::Solution     ,
        answer       ::Answer       ,
        keywords     ::Keywords     ,
    ) where {
        Nimi         <: AbstractString,
        Tehtävänanto <: AbstractString,
        Ratkaisu     <: Union{Missing, <:AbstractString},
        Vastaus      <: Union{Missing, <:AbstractString},
        Avainsanat   <: Vector{<:AbstractString},
        Assignment   <: AbstractString,
        Solution     <: Union{Missing, <:AbstractString},
        Answer       <: Union{Missing, <:AbstractString},
        Keywords     <: Vector{<:AbstractString},
    }
        new(
            nimi,
            tehtävänanto, ratkaisu, vastaus, avainsanat,
            assignment, solution, answer, keywords
        )
    end
end

The reason both of these are parametric is that their parameters might vary from Strings to SubStrings. Now my issue is that I can’t seem to be able to build an instance of the builder type TehtävänArgumentit. Trying to generate an initial value of TehtävänArgumentit with the call

tehtävän_argumentit = TehtävänArgumentit(
        basename(tiedoston_nimi),
        "",
        "",
        "",
        String[],
        "",
        "",
        "",
        String[]
    )

results in an error

ERROR: LoadError: MethodError: no method matching TehtävänArgumentit(::String, ::String, ::String, ::String, ::Vector{String}, ::String, ::String, ::String, ::Vector{String})

I thought that the UnionAll types in the type specification of TehtävänArgumentit and Tehtävä would handle all of the possible sub-type combinations, but this is obviously not the case here. How might I change this to allow the builder to be instantiated while still allowing the types of its fields to vary during construction, as I return new builders from functions f: (builder, new_field_value) ↦ builder_with_modified_field ?

I am not familiar with the builder pattern in Rust, but generally I would compose structs, i.e. group the fields that belong together (those that can be validated and constructed as a unit) in their own struct.
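A minimal sketch of what such a composition might look like. The names LocalizedText and Exercise are hypothetical, and the validation is simplified compared to the original (here each language block must be complete on its own):

```julia
# Hypothetical names; not taken from the original post.
# One language's worth of fields, validated as a unit.
struct LocalizedText{S<:AbstractString, K<:AbstractString}
    assignment::S
    keywords::Vector{K}

    function LocalizedText(assignment::S, keywords::Vector{K}) where {S<:AbstractString, K<:AbstractString}
        isempty(assignment) && error("assignment is empty")
        isempty(keywords) && error("keywords are missing")
        any(isempty, keywords) && error("a keyword is empty")
        new{S,K}(assignment, keywords)
    end
end

# The outer type only combines already-validated parts,
# so it can rely on the default constructors.
struct Exercise{F<:LocalizedText, E<:LocalizedText}
    name::String
    finnish::F
    english::E
end

fi = LocalizedText("Laske 1 + 1.", ["yhteenlasku"])
# strip returns a SubString{String}, as parsing code often does
en = LocalizedText(strip("  Compute 1 + 1.  "), ["addition"])
exercise = Exercise("t1", fi, en)
```

Each block carries its own string types, so mixing String and SubString between blocks needs no extra constructor methods.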

That is sensible, I agree. But I guess my question is more related to the UnionAll types of Tehtävä and TehtävänArgumentit. I could split the fields into something like AssignmentLanguage{Lang <: Language, FieldTypes... <: AbstractString and Vector{<:AbstractString}}, but I would still run into issues when constructing said type with concrete Strings and SubStrings.

Why can’t Julia apply the constructor of TehtävänArgumentit to concrete types like Strings, even if the fields are declared as subtypes of AbstractString and Vector{<:AbstractString}? Do I need to define specific methods for the power set of possible concrete argument type combinations?

This happens because defining your own inner constructor suppresses the default constructors, including the one that would normally infer the type parameters of the struct from the types of the arguments. The use of a UnionAll is not directly the issue.
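For contrast, here is a small sketch of the default behaviour when no inner constructor is defined at all. The type D is a hypothetical stand-in, not something from your code:

```julia
# Hypothetical type with no inner constructor, so Julia generates the defaults.
struct D{T<:AbstractString}
    s::T
end

d1 = D("S")           # type parameter inferred from the argument
d2 = D{String}("S")   # type parameter supplied explicitly
d3 = D(strip(" S "))  # strip returns a SubString{String}, which is inferred too
```

Both call forms work here, and the parameter follows the concrete argument type without any extra methods.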

A much-simplified example of what you are currently doing:

julia> struct A{T <: AbstractString}
       s::T
       A{T}(s::T) where {T<:AbstractString} = isempty(s) ? error("cannot be empty") : new(s)
       end

julia> A("S")
ERROR: MethodError: no method matching A(::String)
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1

Because we have overridden the default constructor, we must explicitly pass the type parameters like so:

julia> A{String}("S")
A{String}("S")

julia> A{String}("")
ERROR: cannot be empty
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] A{String}(s::String)
   @ Main ./REPL[1]:3
 [3] top-level scope
   @ REPL[9]:1

(We will come back to this: you could now write an outer constructor to assign the types if you wanted.)

Instead, if you add a non-default constructor without the type parameters after the struct name, then this works more as expected:

julia> struct B{T <: AbstractString}
       s::T
       B(s::T) where {T<:AbstractString} = isempty(s) ? error("cannot be empty") : new{T}(s)
       end

julia> B("S")
B{String}("S")

julia> B("")
ERROR: cannot be empty
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] B(s::String)
   @ Main ./REPL[4]:3
 [3] top-level scope
   @ REPL[8]:1

Note that there are officially no methods for A! The inner constructor A{T}(s::T) where T is a method of the parameterised types A{T}, not of the bare A, so it does not show up when querying A itself. Meanwhile B does have the one non-default constructor method defined.

julia> methods(A)
# 0 methods for type constructor:

julia> methods(B)
# 1 method for type constructor:
[1] B(s::T) where T<:AbstractString in Main at REPL[4]:3

Note also that, because B defines its own inner constructor, the default parameterised constructor B{T}(s::T) where T is no longer generated, and so it is now impossible to directly supply the type parameter:

julia> B{String}("S")
ERROR: MethodError: no method matching B{String}(::String)
Stacktrace:
 [1] top-level scope
   @ REPL[10]:1

However, I believe in general it’s recommended to keep the type parameter on the inner constructor, and add convenience outer constructors which call this restricted inner constructor. That way it’s easier to add other convenience outer constructors (say, a keyword argument version) later:

julia> struct C{T<:Union{<:AbstractString,Missing}}
           s::T
           C{T}(s::T) where T = ismissing(s) ? new{T}(s) : isempty(s) ? error("empty s") : new{T}(s)
       end
           
julia> C(s::T) where T = C{T}(s)
C

julia> C(missing)
C{Missing}(missing)

julia> C("")
ERROR: empty s
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] C
   @ ./REPL[11]:3 [inlined]
 [3] C(s::String)
   @ Main ./REPL[14]:1
 [4] top-level scope
   @ REPL[16]:1

julia> C("S")
C{String}("S")
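Applied to your builder, the same pattern might look roughly like the sketch below. It is cut down to two fields, and ExerciseArgs and with_name are hypothetical names that I made up for illustration:

```julia
# Reduced, hypothetical version of the builder: two fields instead of nine.
struct ExerciseArgs{Name<:AbstractString, Keywords<:Vector{<:AbstractString}}
    name::Name
    keywords::Keywords

    # The inner constructor keeps its type parameters, as in the original post.
    function ExerciseArgs{Name,Keywords}(name::Name, keywords::Keywords) where {
        Name     <: AbstractString,
        Keywords <: Vector{<:AbstractString},
    }
        new(name, keywords)
    end
end

# Outer constructor: infer the type parameters from the arguments.
function ExerciseArgs(name::Name, keywords::Keywords) where {
    Name     <: AbstractString,
    Keywords <: Vector{<:AbstractString},
}
    ExerciseArgs{Name,Keywords}(name, keywords)
end

# Keyword-argument convenience constructor with empty defaults.
ExerciseArgs(; name = "", keywords = String[]) = ExerciseArgs(name, keywords)

# (builder, new field value) -> new builder; the field type is allowed to change.
with_name(args::ExerciseArgs, name::AbstractString) = ExerciseArgs(name, args.keywords)

args = ExerciseArgs(name = "file.txt")           # ExerciseArgs{String, Vector{String}}
args = with_name(args, strip("  exercise 1  "))  # name is now a SubString{String}
```

Only the single outer method is needed; there is no need to write methods for every combination of concrete string types.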

There is an example of essentially exactly your case in the manual, but it is not necessarily clear from it which constructors are the default ones and how defining your own inner constructors replaces them, especially with regard to which methods actually end up being defined. This would be a good improvement to the manual.

I also am not familiar with the builder pattern. It seems like it allows you to create a type from a duplicated version of the type with different constructors. In that case, why not define another outer constructor Tehtävä(::Tehtävä)? If the desire is to avoid the checks on the fields during parsing, then this could be done by defining two inner constructors: one without type parameters (called by default and which checks), and one with (which does not):

julia> struct C{T<:Union{<:AbstractString,Missing}}
           s::T
           C(s::T) where T = ismissing(s) ? new{T}(s) : isempty(s) ? error("empty s") : new{T}(s)
           C{T}(s::T) where T = new(s)
       end
       C(c::C{T}) where T = C(c.s)
C

julia> Base.parse(::Type{C}, str::AbstractString) = C{typeof(str)}(str)

julia> c = parse(C, "")
C{String}("")

julia> C(c)
ERROR: empty s
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] C(s::String)
   @ Main ./REPL[31]:3
 [3] C(c::C{String})
   @ Main ./REPL[31]:6
 [4] top-level scope
   @ REPL[33]:1

Finally, I would recommend from my own experience not to worry too much about conversion from SubStrings to Strings unless you later discover this is truly an issue for you. If you need to worry about different string encodings, etc., then allowing multiple different subtypes of AbstractString in the same type sounds somewhat scary! Perhaps it would be better for all the string types to be the same, specified by a single type parameter?
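As a rough sketch of that last suggestion, assuming it is acceptable to copy SubStrings into Strings at construction time. UniformExercise and exercise_from_parts are hypothetical names, and only three of the fields are shown:

```julia
# Hypothetical reduced type: one parameter S shared by every string field.
struct UniformExercise{S<:AbstractString}
    name::S
    solution::Union{Missing,S}
    keywords::Vector{S}
end

# Hypothetical helper: accept any string types and normalise them to String.
function exercise_from_parts(name::AbstractString, solution, keywords)
    sol = ismissing(solution) ? missing : String(solution)
    UniformExercise{String}(String(name), sol, String.(keywords))
end

# split returns SubStrings, which the helper converts.
parts = split("name;kw1,kw2", ';')
ex = exercise_from_parts(parts[1], missing, split(parts[2], ','))
```

With a single parameter, the type of the whole struct is determined by one choice of string type, which keeps the signatures short.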