Do I need a macro to compile a list of functions?

Mikhail_Kagalenko · April 9, 2024, 5:53pm

I need to look into your suggestion in more detail. Meanwhile, here is the MWE for the solution with types and multiple dispatch.

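using DataFrames        # AbstractDataFrame, nrow
using InteractiveUtils  # subtypes (loaded automatically only in the REPL)
using Random            # Random.seed!

abstract type AbstractProblemSet end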

abstract type Set3 <: AbstractProblemSet end

mutable struct Problem3_1 <: Set3
    swimming_pool_size_liters::Int
    pipe_inflow_liters_sec::Int
    pipe_outflow_liters_sec::Int
    time_to_fill_min::Float64
end


function problem(problem_type::Type{Problem3_1})
    swimming_pool_size_liters = rand(1000:10:2000)
    pipe_inflow_liters_sec = rand(20:30)
    pipe_outflow_liters_sec = rand(5:10)
    p = problem_type(swimming_pool_size_liters,  pipe_inflow_liters_sec,
                 pipe_outflow_liters_sec, -1)
    problem_solution!(p)
    return p
end
function problem_solution!(p::Problem3_1)
    p.time_to_fill_min = p.swimming_pool_size_liters/(p.pipe_inflow_liters_sec-p.pipe_outflow_liters_sec)/60
    return p
end

function make_set(
    roster::AbstractDataFrame, S::Type{<:AbstractProblemSet}, rng_seed::Integer
    )
    N_students = nrow(roster)
    # fieldnames returns a tuple; collect it so that vcat flattens the names
    col_names = [collect(fieldnames(pr)) for pr in subtypes(S)]
    param_lengths = [length(nms) for nms in col_names]
    col_names = reduce(vcat, col_names)
    M = length(col_names)
    problems_data = Matrix{Any}(missing, N_students, M)
    for k in 1:N_students
        problem_types = subtypes(S)
        for n in 1:length(problem_types)
            idx_1 = sum(param_lengths[1:n - 1]) + 1
            idx_2 = idx_1 + param_lengths[n] - 1
            Random.seed!(rng_seed + k + n)
            pr = problem(problem_types[n])
            problems_data[k, idx_1:idx_2] .= [getproperty(pr, f) for f
                                                  in fieldnames(problem_types[n])]
        end
    end
    for m in 1:M
        column_type = promote_type(typeof.(filter(!ismissing, problems_data[:, m]))...)
        roster[!, col_names[m]] = Vector{Union{column_type,Missing}}(problems_data[:,m])
    end
    
    return roster
end
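For reference, the way make_set discovers the problems and their columns can be sketched in isolation. This is a minimal sketch with hypothetical types (DemoSet, DemoA, DemoB) that are not part of the code above; it only assumes the standard subtypes and fieldnames functions.

```julia
using InteractiveUtils  # provides subtypes outside the REPL

# Hypothetical types, for illustration only
abstract type DemoSet end

struct DemoA <: DemoSet
    x::Int
    y::Float64
end

struct DemoB <: DemoSet
    z::Int
end

# One vector of field names per concrete problem type
names_per_type = [collect(fieldnames(T)) for T in subtypes(DemoSet)]

# Flatten into a single list of column names
all_names = reduce(vcat, names_per_type)
```

Collecting each tuple of field names into a vector matters here, because vcat treats a tuple as a single element rather than splicing its contents.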

Mikhail_Kagalenko · April 10, 2024, 3:43pm

There’s a great deal going on in DynamicPPL’s compiler.jl implementation of the @model macro. Is there a simplified working example of such a macro?

mnemnion · April 10, 2024, 3:49pm

My only suggestions are almost aesthetic.

First, this sort of abstract type is always a set (Julia’s type system is a lattice), so the Set in the name isn’t necessary. It also means that an instance of Problem3_1 isa AbstractProblemSet, which reads oddly: “set of one element” doesn’t equal “one element”, or Peano numbering wouldn’t work.

So you might try organizing the abstract types like so:

abstract type AbstractProblem end

abstract type ProblemSet3 <: AbstractProblem end

This does mean that ProblemSet3 <: AbstractProblem, which is debatable. You could use AbstractProblem3 if you wanted, but a name with Set3 in it is fine, since it helps you organize the problem domain.
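As a side note, the difference between subtyping (<:) and instance membership (isa) can be checked directly. A minimal sketch, where DemoProblem is a hypothetical field-less stand-in for a concrete problem struct:

```julia
abstract type AbstractProblem end
abstract type ProblemSet3 <: AbstractProblem end

# Hypothetical concrete problem, fields omitted for brevity
struct DemoProblem <: ProblemSet3 end

is_subtype   = ProblemSet3 <: AbstractProblem      # true: a relation between types
via_instance = DemoProblem() isa AbstractProblem   # true: an instance belongs to every supertype
via_type     = DemoProblem isa AbstractProblem     # false: the type object itself is a DataType
```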

Second, your concrete structs are essentially functors, so I would structure that code this way:

function problem(problem_type::Type{Problem3_1})
    swimming_pool_size_liters = rand(1000:10:2000)
    pipe_inflow_liters_sec = rand(20:30)
    pipe_outflow_liters_sec = rand(5:10)
    p = problem_type(swimming_pool_size_liters,  pipe_inflow_liters_sec,
                 pipe_outflow_liters_sec, -1)
    return p()
end
function (p::Problem3_1)()
    p.time_to_fill_min = p.swimming_pool_size_liters/(p.pipe_inflow_liters_sec-p.pipe_outflow_liters_sec)/60
    return p
end

I find that this expresses the intention more clearly. You might feel differently; neither of these is clearly better than the other.

Mikhail_Kagalenko · April 10, 2024, 4:20pm

The call p() modifies the object being called, which seems somewhat weird. In Julia, functions that modify their arguments are conventionally marked with an exclamation mark (!), which can’t be done in this case.
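To make the concern concrete, here is a minimal sketch using a hypothetical Counter type (not part of the problem code): the callable form mutates silently, while a named function can carry the ! marker.

```julia
# Hypothetical type, for illustration only
mutable struct Counter
    n::Int
end

# Callable struct that mutates itself; the call syntax c() has no room for a `!`
(c::Counter)() = (c.n += 1; c)

# Named function doing the same thing, with the conventional `!` suffix
increment!(c::Counter) = (c.n += 1; c)

c = Counter(0)
c()            # mutates c, but nothing at the call site says so
increment!(c)  # mutates c, and the name signals it
```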

mnemnion · April 10, 2024, 4:32pm

That’s a reasonable comment, although self-modifying a callable struct isn’t unheard of. I was considering a whole tangent about whether you want a mutable struct in the first place. You could do something like this:

struct Problem3_1 <: Set3
    swimming_pool_size_liters::Int
    pipe_inflow_liters_sec::Int
    pipe_outflow_liters_sec::Int
    time_to_fill_min::Float64
end

Problem3_1(a,b,c) = Problem3_1(a,b,c,a/(b-c)/60)  # /60 converts seconds to minutes

function problem(problem_type::Type{Problem3_1})
    swimming_pool_size_liters = rand(1000:10:2000)
    pipe_inflow_liters_sec = rand(20:30)
    pipe_outflow_liters_sec = rand(5:10)
    # Call the three-argument constructor, which computes the solution itself
    return problem_type(swimming_pool_size_liters, pipe_inflow_liters_sec, pipe_outflow_liters_sec)
end

This has some advantages. I wrote the problem solution in shorthand, which is of course not necessary.

Mikhail_Kagalenko · April 10, 2024, 4:36pm

Yes, that has the advantage of not needing to supply the placeholder values when constructing the type. On the other hand, that brings back some of the copying and pasting of variable names, because the function to solve the problem ought to have the meaningful variable names for the purpose of documentation.

mnemnion · April 10, 2024, 5:29pm

The lack of variable names was just in the interest of brevity; I can see why you’d want them included in the actual code.

Although it’s worth noting that because the structs have the field names, the code to retrieve them doesn’t require that the constructor use them. But yes, for clarity you might want the constructor to look like this:

function Problem3_1(
    swimming_pool_size_liters,
    pipe_inflow_liters_sec,
    pipe_outflow_liters_sec)
    # Use the constructor arguments directly; no instance `p` exists yet
    time_to_fill_min = swimming_pool_size_liters/(pipe_inflow_liters_sec-pipe_outflow_liters_sec)/60
    return Problem3_1(swimming_pool_size_liters,
         pipe_inflow_liters_sec,
         pipe_outflow_liters_sec,
         time_to_fill_min)
end

I wouldn’t do it that way myself, but for pedagogy I understand the case for it.
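To illustrate the earlier point that the field names live on the struct and not in the constructor, here is a minimal sketch. Pool and its fixed net inflow of 15 liters per second are hypothetical and chosen only to keep the example short.

```julia
# Hypothetical immutable problem type, for illustration only
struct Pool
    size_liters::Int
    time_to_fill_min::Float64
end

# Terse constructor: the argument name `a` says nothing about the field
Pool(a) = Pool(a, a / 15 / 60)  # assumes a net inflow of 15 liters per second

p = Pool(1200)

# Column names and values are still recoverable from the type itself
cols = fieldnames(Pool)
vals = [getfield(p, f) for f in cols]
```

So the generic code that builds table columns keeps working however tersely the constructor is written.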

bertschi · April 10, 2024, 5:58pm

Sorry, I just took the syntax of @model as inspiration, e.g., in using ~ for random sampling. The expansion of your macro could be much simpler.
When I find some time, I will try to put together an example.

bertschi · April 10, 2024, 7:25pm

Ok, here is a rough sketch for a macro solution:

using Random

abstract type AbstractProblem end

# Generic functions
function column_names end
function problem_data end
function problem_solution end

# Note: API as in solution by abraemer

# Define sample problem by hand
struct Problem4_1 <: AbstractProblem end

column_names(::Problem4_1) = [
    :pr4_1_swimming_pool_size_liters, 
    :pr4_1_pipe_inflow_liters_sec,
    :pr4_1_pipe_outflow_liters_sec, 
    :pr4_1_time_to_fill_min]

function problem_data(::Problem4_1, rng::AbstractRNG)
    swimming_pool_size_liters = rand(rng, 1000:10:2000)
    pipe_inflow_liters_sec = rand(rng, 20:30)
    pipe_outflow_liters_sec = rand(rng, 5:10)
    #
    time_to_fill_min = problem_solution(Problem4_1(), swimming_pool_size_liters, pipe_inflow_liters_sec, pipe_outflow_liters_sec)
    #
    return (swimming_pool_size_liters,pipe_inflow_liters_sec,pipe_outflow_liters_sec,
            time_to_fill_min)
end

function problem_solution(::Problem4_1, swimming_pool_size_liters, pipe_inflow_liters_sec, pipe_outflow_liters_sec)
    return swimming_pool_size_liters/(pipe_inflow_liters_sec-pipe_outflow_liters_sec)/60
end

# Macro to define such problems

skiplinenums(exprs) = filter(e -> !(e isa LineNumberNode), exprs)

function parse_body(body)
    defs = []
    sol = nothing
    for expr in skiplinenums(body.args)
        if expr.head == :call && expr.args[1] == :(~)
            push!(defs, expr.args[2] => expr.args[3])
        elseif expr.head == :macrocall && expr.args[1] == Symbol("@solution") && isnothing(sol)
            sol = expr.args[end]
            @assert sol.head == :(=)
        else
            error("TODO: Better error message/handling!")
        end
    end
    defs, sol
end

macro problem(name, body)
    @assert body.head == :block "Syntax error: Expecting block of definitions!"
    defs, sol = parse_body(body)
    colnames = [var for (var, val) in defs]
    quote
        begin
            struct $(esc(name)) <: AbstractProblem end
            function $(esc(:column_names))(::$(esc(name)))
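                # Include the solution variable so the names match the tuple returned by problem_data
                [$([:(Symbol($(string(c)))) for c in [colnames; sol.args[1]]]...)]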
            end
            function $(esc(:problem_solution))(::$(esc(name)), $(esc.(first.(defs))...))
                $(esc(sol.args[2]))
            end
            function $(esc(:problem_data))(problem::$(esc(name)), rng::AbstractRNG)
                $([:($(esc(var)) = rand(rng, $(esc(val)))) for (var, val) in defs]...)
                $(esc(sol.args[1])) = $(esc(:problem_solution))(problem, $(esc.(colnames)...))
                ($(esc.(colnames)...), $(esc(sol.args[1])))
            end
        end
    end
end

macro problemset(name, body)
    @assert body.head == :block "Syntax error: Expecting block of definitions!"
    prob_names = []
    for prob in skiplinenums(body.args)
        @assert (prob.head == :macrocall && prob.args[1] == Symbol("@problem")) "Only problems allowed in problemset!"
        push!(prob_names, skiplinenums(prob.args)[2])
    end
    quote
        $(esc(body))
        $(esc(name)) = [$([:($(esc(prob))()) for prob in prob_names]...)]
    end
end

# Check with @macroexpand that this basically generates the same code as above for Problem4_1

@problem Problem4_2 begin
    swimming_pool_size_liters ~ 1000:10:2000
    pipe_inflow_liters_sec ~ 20:30
    pipe_outflow_liters_sec ~ 5:10
    @solution time_to_fill_min = swimming_pool_size_liters/(pipe_inflow_liters_sec-pipe_outflow_liters_sec)/60
end

Random.seed!(123)
@show column_names(Problem4_2())
@show problem_data(Problem4_2(), Random.default_rng())

@problemset MyProblems begin
    @problem Problem4_3 begin
        swimming_pool_size_liters ~ 1000:10:2000
        pipe_inflow_liters_sec ~ 20:30
        pipe_outflow_liters_sec ~ 5:10
        @solution time_to_fill_min = swimming_pool_size_liters/(pipe_inflow_liters_sec-pipe_outflow_liters_sec)/60
    end
    @problem Simple4 begin
        x ~ 1:3
        y ~ 2:5
        @solution xy = x + y
    end
end

Random.seed!(123)
@show column_names.(MyProblems)
@show problem_data.(MyProblems, Ref(Random.default_rng()))

Note that the syntax is quite strict and error handling is somewhat rough.
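For anyone wondering why the syntax is this strict, it helps to see what parse_body actually receives. A minimal sketch, using only quoted expressions, so no macro needs to be defined:

```julia
# A sampling line parses as a call to the binary operator ~
ex = :(x ~ 1:3)
sampling_head = ex.head      # :call
sampling_op   = ex.args[1]   # :~
sampling_var  = ex.args[2]   # :x
sampling_dist = ex.args[3]   # :(1:3)

# A quoted macro call is not expanded; it stays a :macrocall expression
m = :(@solution xy = x + y)
macro_head = m.head          # :macrocall
macro_name = m.args[1]       # Symbol("@solution")
macro_body = m.args[end]     # :(xy = x + y), an Expr with head :(=)
```

Anything that does not match one of these two shapes falls through to the error branch in parse_body.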

Mikhail_Kagalenko · April 11, 2024, 4:09pm

Thank you, that is helpful. Maybe I’ll use some ideas from you and DynamicPPL to cobble something together.

Mikhail_Kagalenko · April 25, 2024, 5:01pm

I have decided that a macro-based solution saves the greatest amount of drudge work, and so created an implementation that combines the ideas from your prototype and DynamicPPL’s compiler. If you find the time to contribute criticisms or suggestions, that will be most appreciated.
