Generating Custom Structs

For a variety of reasons, it would be very useful for several projects within AlgebraicJulia to generate struct definitions based on values that are only present at run time.

Essentially, we are working with “schemas”, which can be programmatically generated and manipulated, and we want to turn these schemas into custom structs. We could instead make do with a generic data structure that takes the schema as a type parameter, which is what we have been doing. However, this has several downsides, namely unintelligible error messages and lack of control over subtyping.
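To make the downsides concrete, here is a minimal sketch of the "generic data structure with the schema as a type parameter" approach. The names `GenericACSet` and `is_graph` are made up for illustration and are not the actual AlgebraicJulia types.

```julia
# Hypothetical generic container: the schema is carried in the type parameters.
# `Obs` and `Homs` are tuples of Symbols; `D` is the NamedTuple type holding the data.
struct GenericACSet{Obs, Homs, D <: NamedTuple}
    data::D
end

# Convenience constructor that fills in `D` from the data.
GenericACSet{Obs, Homs}(data::D) where {Obs, Homs, D <: NamedTuple} =
    GenericACSet{Obs, Homs, D}(data)

g = GenericACSet{(:V, :E), (:src, :tgt)}((src = Int[], tgt = Int[]))

# The full schema shows up in the type name, and therefore in every
# MethodError or stack trace that mentions this value.
type_name = string(typeof(g))

# Dispatch has to pattern-match on parameter values instead of using a
# user-chosen abstract supertype.
is_graph(::GenericACSet{(:V, :E), (:src, :tgt)}) = true
is_graph(::GenericACSet) = false
```

With a real schema the parameter list gets much longer, which is where the unreadable error messages come from, and there is no place to say that one particular schema should be a subtype of some abstract graph type.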

Currently, the only way I can think of doing this is by evaling the struct definition at runtime. However, this does not play well with world age or package load time. These struct definitions do not change very much; the semantics we want are the same as generating the struct definitions in CI and then adding them as files. But that approach seems to circumvent the Julia macro system completely and would not be very composable.
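For readers unfamiliar with the world-age issue, here is a small sketch of what evaling a struct at runtime looks like. The `schema` value and the helper names `struct_expr` and `define_and_build` are assumptions made up for this example.

```julia
# Hypothetical schema value, standing in for the output of schema-manipulating code.
schema = (name = :Point2, fields = [:x => Int, :y => Float64])

# Turn the schema value into a `struct ... end` expression.
function struct_expr(schema)
    fields = [:($f::$T) for (f, T) in schema.fields]
    return Expr(:struct, false, schema.name, Expr(:block, fields...))
end

function define_and_build(mod::Module, schema, args...)
    Core.eval(mod, struct_expr(schema))   # defines the type at runtime
    T = Core.eval(mod, schema.name)       # fetch the freshly defined type
    # Calling `T(args...)` directly here would typically hit the world-age
    # restriction, because the constructor is newer than this running function.
    return Base.invokelatest(T, args...)
end

p = define_and_build(@__MODULE__, schema, 1, 2.0)
```

`Base.invokelatest` works around the restriction, but every call site that touches the new type from already-running code needs the same treatment, which is part of why this approach is awkward.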

Any advice/thoughts on this would be greatly appreciated.

How about named tuples?

Maybe have build.jl read the schemas and export the struct definitions to a .jl file? See “5. Creating Packages” in the Pkg.jl documentation.
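A rough sketch of that idea, written as a standalone script rather than a real build.jl: the `schema` value, the `struct_source` helper, and the file name `generated_structs.jl` are all placeholders, and a temporary directory stands in for the package's source tree.

```julia
# Hypothetical schema; in practice this would come from the schema-generating code.
schema = (name = :Edge, supertype = :Any, fields = [:src => :Int, :tgt => :Int])

# Render the schema as Julia source text.
function struct_source(schema)
    lines = ["struct $(schema.name) <: $(schema.supertype)"]
    for (f, T) in schema.fields
        push!(lines, "    $f::$T")
    end
    push!(lines, "end")
    return join(lines, "\n") * "\n"
end

# Write the generated definition to a file, then load it like any other source file.
dir = mktempdir()
path = joinpath(dir, "generated_structs.jl")
write(path, struct_source(schema))
include(path)

e = Edge(1, 2)
```

Because the generated file is ordinary source code, the resulting struct is defined at load time and precompiles like hand-written code.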

Named tuples are kind of our current solution; they are bad because of error messages and subtyping.

I’m kind of leaning this way (generating files from build.jl), but @epatters thinks that this would be disrespectful to the Julia code generation system.

Can you supply more detail about these custom structs?

  • Do they ever specify an abstract supertype?
  • Are they ever/always parameterized?
    • (If so, how many params, and which struct fields use the params?)
  • Are the fieldnames and fieldtypes always supplied?
  • Is there anything unusual about these structs?
    • Do they ever use internal constructors?
    • Do they require customized external constructors?
  • What did I omit?

  • Yes, they do specify an abstract supertype
  • They are often parameterized
    • Variable params, though could be one Tuple param. Several buried struct fields use the params
  • The fieldnames and the fieldtypes are always supplied
  • Nothing particularly unusual about the structs

Here’s an example


struct WeightedGraph{T}
  objects::NamedTuple{(:V,:E), Tuple{Ref{Int},Ref{Int}}}
  homs::NamedTuple{(:src,:tgt), Tuple{Vector{Int}, Vector{Int}}}
  attrs::NamedTuple{(:weight,), Tuple{Vector{T}}}
  hom_indices::NamedTuple{(:src,:tgt), Tuple{Vector{Vector{Int}}, Vector{Vector{Int}}}}
  attr_indices::NamedTuple{(),Tuple{}}
end

I assume that in any given run of the software you need to make some struct types and do not need to make many thousands of distinct struct types (although your client may construct many, many struct instances).

Given the constraints you state, in your place I would write a few functions that, taken together, generate the desired string, which, when parsed with Meta.parse and then evaled, generates the intended struct type.

Once that works, either leave it as is or convert the string-building into Expr-building; that conversion is easier when you start from something that already works. You could apply regexes and use match, replace, etc. to get the correct string ready.
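As a minimal sketch of the string route (the helper `struct_string` and the struct name `Weighted` are invented for this example):

```julia
# Build the struct definition as a string; field names and types are given as strings.
function struct_string(name, fields)
    body = join(("    $f::$T" for (f, T) in fields), "\n")
    return "struct $name\n$body\nend"
end

src = struct_string("Weighted", ["weight" => "Float64", "label" => "String"])
ex = Meta.parse(src)   # a :struct expression
eval(ex)               # defines `Weighted` in the current module

w = Weighted(1.5, "a")
```

The parsed `ex` is an ordinary `Expr`, so inspecting it with `dump(ex)` shows the shape that an Expr-building version would need to produce directly.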

A good alternative, though it requires some learning, is to use MacroTools, and walk, find, prune, and replace using some premade expression trees that cover the schema templates.

Oh, constructing the AST is not a problem at all. I already have code that constructs the AST, and various constructor methods, etc. The problem is when to evaluate the AST. I.e., if I eval at runtime, then it slows down the loading of the library and there are world-age issues.

Sorry, I should have been more clear about exactly what I was asking.

What about macros? Could you have the caller define the type with a macro, something like:

@Algebraic WeightedGraph{Int, Float64} 
# or maybe:
@Algebraic name=WeightedGraph mainType=Int subtype=Float64

The macro would define the object and a default constructor. Then the caller can pass in a WeightedGraph() and the standard code would populate it since I’m assuming the property names would be known ahead of time.

That could work. @olynch, look at @stevengj's implementation of @NamedTuple in Base (namedtuple.jl), reproduced below.


"""
    @NamedTuple{key1::Type1, key2::Type2, ...}
    @NamedTuple begin key1::Type1; key2::Type2; ...; end
This macro gives a more convenient syntax for declaring `NamedTuple` types. It returns a `NamedTuple`
type with the given keys and types, equivalent to `NamedTuple{(:key1, :key2, ...), Tuple{Type1,Type2,...}}`.
If the `::Type` declaration is omitted, it is taken to be `Any`.   The `begin ... end` form allows the
declarations to be split across multiple lines (similar to a `struct` declaration), but is otherwise
equivalent.
For example, the tuple `(a=3.1, b="hello")` has a type `NamedTuple{(:a, :b),Tuple{Float64,String}}`, which
can also be declared via `@NamedTuple` as:
```jldoctest
julia> @NamedTuple{a::Float64, b::String}
NamedTuple{(:a, :b), Tuple{Float64, String}}

julia> @NamedTuple begin
           a::Float64
           b::String
       end
NamedTuple{(:a, :b), Tuple{Float64, String}}
```
"""
macro NamedTuple(ex)
    Meta.isexpr(ex, :braces) || Meta.isexpr(ex, :block) ||
        throw(ArgumentError("@NamedTuple expects {...} or begin...end"))
    decls = filter(e -> !(e isa LineNumberNode), ex.args)
    all(e -> e isa Symbol || Meta.isexpr(e, :(::)), decls) ||
        throw(ArgumentError("@NamedTuple must contain a sequence of name or name::type expressions"))
    vars = [QuoteNode(e isa Symbol ? e : e.args[1]) for e in decls]
    types = [esc(e isa Symbol ? :Any : e.args[2]) for e in decls]
    return :(NamedTuple{($(vars...),), Tuple{$(types...)}})
end

So, the reason a macro does not work here is that the struct is generated from a runtime value, not from syntax that is available when the macro is expanded. We have sophisticated functions that modify and combine schemas, and we want to generate the struct from the output of those functions, which is only available at runtime.
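A tiny illustration of the distinction, using a made-up macro `@arg_kind` and a made-up `combine` function standing in for the real schema operations:

```julia
# A macro receives the expression written at the call site, not the value
# that expression will later evaluate to.
macro arg_kind(x)
    return string(typeof(x))   # computed during macro expansion
end

# Hypothetical stand-in for a function that combines schemas at runtime.
combine(a, b) = vcat(a, b)

schema = combine([:V], [:E])

kind_of_variable = @arg_kind schema                # the macro saw a Symbol
kind_of_call     = @arg_kind combine([:V], [:E])   # the macro saw an Expr
```

In both cases the macro only ever sees the name `schema` or the call expression, never the combined schema itself, so it has nothing to generate fields from.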

Well, yes, if you depend on runtime values, then macros are not the tool for the job. I do not believe there is a way without using eval and dealing with world-age problems.

I honestly have no idea how this will work with precompilation, but you can just hardcode the path to the schema file in the macro and open the file and parse it to determine your structs.

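macro foo()
    # Resolve the schema path relative to this source file, not the working directory.
    path = joinpath(@__DIR__, "schema.txt")
    # schema_to_struct_expr stands in for your own function from an IO stream to an Expr.
    ex = open(path, "r") do f
        schema_to_struct_expr(f)
    end
    return esc(ex)  # define the struct in the caller's module
end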