Macro challenge: Factory for creating named tuples


#1

Named tuples are a great way to work with sets of parameters where the type can be inferred. Compared to using a named structure with Parameters.jl, the big thing missing seems to be creating them with default values.

So I wonder if there is a macro that can be written (and maybe added to Parameters.jl) which can simplify the boilerplate for this? First, I just wanted to get a sanity check that this programming pattern is the right one:

using NamedTuples, Distributions
myparameterfactory(;a=1, b = Normal(0,1), c= [1; 3]) = @NT(a=a, b=b, c=c) #Drop the @NT in v0.7

#To use
paramset1 = myparameterfactory(a=2)
paramset2 = myparameterfactory(b=Exponential(0.2))
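For reference, here is a minimal sketch of the same pattern on Julia 0.7 or later, where named tuples are built in and @NT is not needed. Plain values stand in for the Distributions.jl objects so that the snippet has no package dependencies; that substitution is an assumption of the sketch, not part of the original code.

```julia
# Keyword-argument "factory": each call returns a fresh named tuple,
# and any keyword that is not supplied falls back to its default.
myparameterfactory(; a = 1, b = 0.5, c = [1; 3]) = (a = a, b = b, c = c)

paramset1 = myparameterfactory(a = 2)          # override a only
paramset2 = myparameterfactory(b = "changed")  # the type of b is free to change
```

The field types of each result are inferred from whatever values end up in it, so paramset1 and paramset2 have different concrete types.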

Is “factory” the right terminology here, or is there something better?

Next: while the boilerplate here for the factory is relatively simple and idiot-proof, I am curious if there is a macro to do it automatically for an arbitrary number of parameters? i.e.

myparameterfactory = @NTfactory(a = 1, b = Normal(0,1))

I wasn’t sure how to write this sort of macro, whether it is worth it, and what it should be called. Any thoughts?


#2

It is not entirely clear to me what you want.
There is this:

these_strings = ("first", "second", "third")
("first", "second", "third")

these_names = map(Symbol, these_strings)
(:first, :second, :third)

prototypal_namedtuple = NamedTuple{these_names}
NamedTuple{(:first, :second, :third),T} where T<:Tuple

some_values = (1, 2, 3)
(1, 2, 3)

more_values = ("1st", "2nd", "3rd")
("1st", "2nd", "3rd")

a_namedtuple = prototypal_namedtuple(some_values)
(first = 1, second = 2, third = 3)

another_namedtuple = prototypal_namedtuple(more_values)
(first = "1st", second = "2nd", third = "3rd")

#3

Thanks. Sorry, no: I don’t want a prototype, as I want it to be lazy.


#4

It depends on what you want the named tuples to be used for. For instance, if you are passing them as arguments to a function, you could have them parsed into a struct in which each field takes a default value whenever the named tuple does not have that keyword.
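A rough sketch of that idea, assuming Julia 0.7 or later and using Base.@kwdef for the defaults. The struct name ModelSettings and its fields are hypothetical, chosen only for illustration.

```julia
# Hypothetical settings struct; Base.@kwdef generates a keyword constructor
# that uses the listed defaults for any field that is not supplied.
Base.@kwdef struct ModelSettings
    a::Int = 1
    b::Float64 = 0.5
end

# The caller passes a named tuple that sets only some of the fields.
overrides = (b = 2.0,)
settings = ModelSettings(; overrides...)   # a falls back to its default
```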


#5

For the named tuples themselves, think of a collection of parameters and model settings that are immutable for a given calculation from a model, but where I may explore different variations on the parameters themselves. I think it is a typical use case for exploratory work. The named tuples could be splatted directly into a function’s arguments, etc.

The key, though, is that you may want to do all sorts of variations of the parameters from a given baseline… but since some of the “parameters” may be expensive objects with the actual type changing (e.g. change from a Normal to an Exponential distribution) you need it to be lazily created.
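To illustrate what "lazily created" means here, a small sketch: a keyword default is only evaluated when the caller does not supply that keyword, so an expensive default object is never built if it is overridden. The names build_count, expensive_grid, and baseline are made up for this example, and native named tuples (Julia 0.7 or later) are assumed.

```julia
# Counter used only to observe how often the "expensive" default is built.
build_count = Ref(0)

function expensive_grid()
    build_count[] += 1
    return collect(0.0:0.25:1.0)
end

# The default for grid is an expression, evaluated per call and only if needed.
baseline(; a = 1, grid = expensive_grid()) = (a = a, grid = grid)

p1 = baseline(grid = [0.0, 1.0])  # default expression is not evaluated
p2 = baseline(a = 3)              # default expression is evaluated once
```

A prototype built up front would have to construct every default before any variation is known, which is the cost the factory avoids.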


#6

Here’s a macro. Probably more trouble than it’s worth.

macro NTfactory(args...)
    splits = map(args) do arg
        @match arg begin
            (a_ = b_) => (a, b)
            any_ => error("All arguments must be assignments")
        end
    end
    esc(:(
        (;$(map(splits) do pair
            Expr(:kw, pair[1], pair[2])
        end...),) -> 
        $NamedTuples.@NT($(map(splits) do pair
            Expr(:kw, pair[1], pair[1])
        end...))
    ))
end

#7

Thanks so much! What libraries do I need for this to work? On v0.6 I have Match.jl and NamedTuples.jl installed.

I did

using NamedTuples, Match
macro NTfactory(args...)
    splits = map(args) do arg
        @match arg begin
            (a_ = b_) => (a, b)
            any_ => error("All arguments must be assignments")
        end
    end
    esc(:(
        (;$(map(splits) do pair
            Expr(:kw, pair[1], pair[2])
        end...),) -> 
        $NamedTuples.@NT($(map(splits) do pair
            Expr(:kw, pair[1], pair[1])
        end...))
    ))
end
myfac = @NTfactory(A=1, B="test")

The last line gives, ERROR: UndefVarError: b_ not defined. Do I need other packages? Is Match the right one?


#8

Sorry, I should have mentioned: that’s @match from MacroTools.
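For anyone on Julia 0.7 or later, here is a dependency-free sketch of the same idea. It builds the expressions by hand instead of using MacroTools and returns a native named tuple instead of calling @NT. The macro name @ntfactory_sketch is hypothetical, and this is an illustration under those assumptions rather than the code that would go into a package.

```julia
# Hypothetical macro name; expands to `function (; a = 1, ...) (a = a, ...) end`.
macro ntfactory_sketch(args...)
    kws = Expr[]      # keyword parameters with defaults, e.g. a = 1
    fields = Expr[]   # named-tuple fields, e.g. a = a
    for arg in args
        # Accept `name = default`; both :(=) and :kw heads are checked defensively.
        ok = arg isa Expr && arg.head in (:(=), :kw) && arg.args[1] isa Symbol
        ok || error("All arguments must be assignments")
        name, default = arg.args
        push!(kws, Expr(:kw, name, default))
        push!(fields, Expr(:(=), name, name))
    end
    arglist = Expr(:tuple, Expr(:parameters, kws...))
    body = Expr(:block, Expr(:tuple, fields...))
    # esc so that the default expressions are evaluated in the caller's scope
    return esc(Expr(:function, arglist, body))
end

myfac = @ntfactory_sketch(a = 1, b = "test")
nt1 = myfac()           # all defaults
nt2 = myfac(a = 2.5)    # override a, keep the default for b
```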


#9

Beautiful! Exactly what I wanted, and I think it may be worth it for many problems.

Now for extra credit. Is there any way you could unpack these, in the same spirit as Parameters.jl has @unpack? The test would be

nt = @NT(a=1, b=2.0)
@unpack a = nt
@unpack a, b = nt

If you get these macros working (and they can work in v0.7), I think they belong in Parameters.jl as handy tools for those who want to work with named tuples instead of structs.


#10

Unpacking named tuples works fine with Parameters.jl, both in Julia 0.7 and with NamedTuples in 0.6:

julia> nt = (a=1, b=2)
(a = 1, b = 2)

julia> @unpack a = nt
(a = 1, b = 2)

julia> @unpack a, b = nt
(a = 1, b = 2)

This is because getfield works, see https://github.com/mauro3/Parameters.jl/blob/9460daa11870adfca90a10183150451ef8caca84/src/Parameters.jl#L568.
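As a quick check that does not need Parameters.jl, the sketch below uses getfield on a native named tuple directly. The last line relies on property destructuring, which I believe requires Julia 1.7 or later; treat that part as an assumption about your Julia version.

```julia
nt = (a = 1, b = 2.0)

# Field access by symbol works on named tuples just as it does on structs.
a = getfield(nt, :a)
b = getfield(nt, :b)

# Julia 1.7 and later also support destructuring by field name.
(; a, b) = nt
```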

Concerning your original proposal, I don’t see that much value. With your approach you need to define a factory function, but then why not just define the type, which is then the “factory” function? I.e.:

myparameterfactory = @NTfactory(a = 1, b = Normal(0,1))
# vs
@with_kw struct MyT
  a=1
  b = Normal(0,1)
end

If you are worried about keystrokes, then check out https://github.com/cstjean/QuickTypes.jl.


#11

Thanks so much for the response!

To motivate here (as my concern is not about keystrokes): I am examining ways to teach complete novice programmers how to pass around arguments for their models when writing code. The idea would be to be able to write the simplest possible code for them to understand, and serve as a baseline for their own modifications. In many of the underlying examples, you will use a baseline set of parameters and then vary them over different dimensions.

I am teaching economics and cannot dedicate time to teach a huge amount of computer science and programming, so I want to minimize the knowledge required at first.

For people who have never programmed before, there are several issues: (1) types are tricky; (2) generic types are really tricky; (3) boilerplate-style code is more difficult for them to understand than you would guess; and (4) boilerplate code that relies on the ordering of parameter names, etc. is extremely error-prone for novices.

With Parameters.jl, the boilerplate associated with creating the named parameter constructors is gone, which is great. As you say, I am not sure there is much direct benefit in using named tuples in this style,

myparameterfactory = @NTfactory(a = 1, b = Normal(0,1), c = [1.0 2.0])
# vs
@with_kw struct MyT
  a=1
  b = Normal(0,1)
  c = [1.0 2.0]
end

The problem I have with teaching them this pattern is that those two definitions are not really equivalent. The named tuple infers types, but the struct code is type unstable and would have terrible performance in many circumstances. So then I need to train them to write something like

@with_kw struct MyT{T1 <: Real, T2 <:ContinuousUnivariateDistribution, T3 <: AbstractVector{T4}} where T4
  a::T1 = 1
  b::T2 = Normal(0,1)
  c::T3 = [1.0 2.0]
end

The fact that I suspect I made several mistakes there tells me how difficult it is to teach to novice programmers. On the other hand,

myparameterfactory = @NTfactory(a = 1, b = Normal(0,1), c = [1.0 2.0])

completely uses type inference and I don’t need to teach them about abstract types, type covariance, generic programming, etc. earlier than I want to in a course.
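The difference in field types can be checked directly. The following is a sketch in plain Julia (1.1 or later for fieldtypes), with a plain number standing in for the Normal distribution so that no packages are needed; the struct names are hypothetical. It compares an untyped struct, a struct with unconstrained type parameters, and a named tuple.

```julia
# Untyped fields: every field is stored as Any.
struct UntypedParams
    a
    b
    c
end

# Unconstrained type parameters: field types are taken from the arguments.
struct ParametricParams{T1, T2, T3}
    a::T1
    b::T2
    c::T3
end

nt = (a = 1, b = 0.5, c = [1.0 2.0])

nt_types = fieldtypes(typeof(nt))
untyped_types = fieldtypes(UntypedParams)
parametric_types = fieldtypes(typeof(ParametricParams(1, 0.5, [1.0 2.0])))
```

Here nt_types and parametric_types hold the same concrete types, while untyped_types is all Any. Note that [1.0 2.0] is a 1×2 matrix rather than a vector, which is one of the details a hand-written type bound can get wrong.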


#12

As for the @NTfactory macro, I am not 100% sure about it, but these things are not about keystrokes. The goal is to remove syntactic noise and help novices program defensively. My general worry is that novice programmers following a programming pattern will introduce bugs when copying and modifying code. As an example,

p = @NT(a=1, b=2)
#Unpack manually
a, b = p.a, p.b
#vs
@unpack a, b = p

The reason I like the second one is that if we add in a new parameter there is no way they can have a tough-to-spot bug there by accidentally putting things in the wrong order. Not to mention that the @unpack clearly tells the intent of the action. This is the same reason I like @with_kw so much.

For the @NTfactory macro, the number of bugs that can be introduced by the manual pattern is much smaller, since things don’t need to be in the right order. To compare,

myparameterfactory(;a = 1, b = Normal(0,1), c=[1 2]) = @NT(a=a, b=b, c=c)
myparameterfactory = @NTfactory(a = 1, b = Normal(0,1), c=[1 2])

I think the intent of the @NTfactory is clearer here, but it is certainly not as obvious an improvement as @with_kw.

I think that novices can fill in the construction pattern above without too much confusion (i.e., the main thing they could do wrong is forget to add in a new d=d in the definition). So, @mauro3, I will let you decide if you think introducing that macro (which is kind of a named-tuple analogue to @with_kw) is worth it. If so, I will prepare a PR with tests and docs for it for Parameters.jl. If not, then I think I will tell people to use the construction pattern above.


#13

That’s a good point! I didn’t think of that.

So, maybe adding it would be good. Named tuples can be thought of a little like anonymous types (i.e. functions vs anonymous functions). Maybe that could be a good way to think about it and ponder a good syntax.


#14

Yes, that is exactly the way I am thinking about them. Type inferred, immutable anonymous structures. With the @unpack notation (which I assume ends up overhead free?) it is pretty powerful already.

Is “factory” the right terminology here for the constructor?


#15

How about:

julia> f = @with_kw (a=5, b=7, c=9)
(a=5, b=7, c=9)
julia> f(a=1)
(a=1, b=7, c=9)

?


#16

Done. No-brainer on the name. Should I try to get a PR together? @bramtayl Can we use that code in Parameters.jl? (or @mauro3, would you prefer to rewrite it with your existing macro frameworks?)

I am not much of a macro programmer, but can help get the tests and docs written for it.


#17

Yeah, a PR would be good. Bramtayl’s code should be good. But if you get stuck I can also look at it.

Edit: as far as I’m concerned, it could be a Julia 0.7 only feature, but it is also OK if you want to support 0.6 too. I’d rather not have the NamedTuples dependency, so maybe that could be “optional”.


#18

@mauro3 I looked at the macro and wasn’t sure how to make a version which does different things in v0.7. I also am not sure how to create an optional dependency?

I would love it to work for both versions, if at all possible, so I can have students use it right away. If I can get the code, tests, and docs set up to work in v0.6, could you modify things to add in the optional dependency and v0.7 version?


#19

I think in 0.6 you could translate: Para1 = @with_kw (a = 5, b = 6, c = 7) with an appropriate invocation of @NT. As this invocation of @NT only happens at macro-expansion time, there is no need to have the NamedTuples dependency in Parameters, but it would need to be loaded in the module in which the Para1 = @with_kw (a = 5, b = 6, c = 7) appears.

The switch can be done with a condition like https://github.com/mauro3/Parameters.jl/blob/9460daa11870adfca90a10183150451ef8caca84/src/Parameters.jl#L196 (although the version will need to be adjusted slightly).

Anyway, by all means submit a WIP PR and I can take it from there.


#20

@bramtayl @mauro3 I have run into a problem with the macro here. Some sort of hygiene thing? Here is the one you gave me (renamed)

using Parameters, NamedTuples, MacroTools

macro with_kw(args...)
    splits = map(args) do arg
        @match arg begin
            (a_ = b_) => (a, b)
            any_ => error("All arguments must be assignments")
        end
    end
    esc(:(
        (;$(map(splits) do pair
            Expr(:kw, pair[1], pair[2])
        end...),) -> 
        $NamedTuples.@NT($(map(splits) do pair
            Expr(:kw, pair[1], pair[1])
        end...))
    ))
end

Now, note the following works fine

w = 10
@NT(a=1, b="test", w = w)

First, no problem on this

w2 = 10
argfact = @with_kw(a=1, b="test", w = w2)
argfact(a=2)

But this fails

w = 10
@NT(a=1, b="test", w = w)
argfact = @with_kw(a=1, b="test", w = w)