How would you write a beautiful DSL?

rvignolo · December 13, 2019, 6:38pm

Please jump to this post below if you do not want to read my implementation and want to code it yourself!

Hi, community,

I am trying to write a beautiful DSL so that the user feels she is writing mathematical equations as on a piece of paper. So far, I have two approaches in mind and I want to discuss them with you. Please, stay tuned.

I have seen that sometimes there is a model defined that is given as input for macros. For example, take a look at the JuMP.jl package. Following that approach and using a really simple example where two scalar variables are defined (more complex examples include the definition of vectors, matrices, functions, etc), I could do something like this:

model = Model()
@var(model, x, 1.)        # similar to @var model x = 1.
@var(model, y, x + 1.)    # similar to @var model y = x + 1.

where Model is described as:

const Value = Union{Number, AbstractArray{T} where T<:Number}
const ValueOrNothing = Union{Value, Nothing}

mutable struct Variable
    name  :: Symbol
    value :: ValueOrNothing
end

struct Model
    vars :: Dict{Symbol, Variable}
end

Let’s take a look at the source code of @var:

using Base.Meta: quot

const GeneralExpr = Union{Symbol,Expr,Float64,Int64}

"""
Performs a variable definition or assignment following a pattern of the type
{ Symbol | Expr } = { Literal | Symbol | Expr }, e.g.:
x      = 1.                # Symbol = Literal
x      = y                 # Symbol = Symbol
x      = y + z             # Symbol = Expr
v      = w                 # Symbol = Symbol
v      = [i for i in 1:3]  # Symbol = Expr
v[1]   = 1.                # Expr   = Literal
v[1]   = a                 # Expr   = Symbol
v[1:3] = [1, 2, 3]         # Expr   = Expr
v[i:j] = [z for z in i:j]  # Expr   = Expr
where `y`, `z`, `w`, `a`, `i` and `j` are previously defined variables.
"""
macro var(model::Symbol, lhs::Union{Symbol,Expr}, rhs::GeneralExpr)

    # checks
    lhs isa Expr &&
    lhs.head != :ref &&
    error("unexpected left hand side '$lhs' for assignment.")

    # get the variable name if assignment is of type v[i] = x
    name = lhs isa Symbol ? lhs : lhs.args[1]

    return quote

        # checks if provided model is of Model type
        _valid_model($(esc(model)), $(quot(model)))

        # check if variable already exists in model. If it does not, create it.
        if getobject(vardict($(esc(model))), $(quot(name))) === nothing
            define_variable($(esc(model)), $(quot(name)))
        end

        # build a function Expr (explained below!)
        assignment_ex = build_assignment(
            $(quot(lhs)),
            $(quot(rhs)),
            $(esc(model)),
        )

        # generate a function that performs the assignment
        instruction_assignment = $__module__.eval(assignment_ex)

        # perform the assignment
        instruction_assignment()
    end
end

If we expand the first macro call @var(model, x, 1.) we get:

quote
    umc._valid_model(model, :model)
    if umc.getobject(umc.vardict(model), :x) == umc.nothing
        umc.define_variable(model, :x)
    end
    octopus = umc.build_assignment(:x, 1.0, model)
    dotterel = (Main).eval(octopus)
    dotterel()
end

where octopus is actually an expression for a function that performs the assignment of a variable:

quote
    octopus::Model = model -> begin
        x = (octopus.vars[:x]).value
        (octopus.vars[:x]).value = 1.0
    end
end

This expression is evaluated into a function, dotterel, which is then called. Since the function takes the model as a default argument of type Model, no argument is needed in the call dotterel(). However, this argument is not const, and it is a struct holding a dictionary of Variable elements whose value field is not type-stable. Could someone help me with this?

For the second macro call, @var(model, y, x + 1.) we get:

octopus::Model = model -> begin
    y = (octopus.vars[:y]).value
    x = (octopus.vars[:x]).value
    (octopus.vars[:y]).value = x + 1.0
end

So we can use the value of x to perform the assignment.

The other approach would be completely different and given by:

@input begin
    @var x = 1.       # or just x = 1.
    @var y = x + 1.   # or just y = x + 1.
end

Let’s leave this approach for later!

Thank you very much!

doomphoenix-qxz · December 13, 2019, 6:54pm

Plug for ModelingToolkit.jl, which seems to do something similar to your first option (I’m not really an expert in this though!).

rvignolo · December 14, 2019, 12:08am

I am aware of ModelingToolkit.jl. Although it is a nice package, it is not what I am looking for. Thanks!

anon92994695 · December 14, 2019, 12:19am

Would you mind amending your original post for us lay people who have no idea what a DSL is?

rvignolo · December 14, 2019, 12:42am

Yes of course!

DSL stands for Domain-Specific Language. In Julia, it usually means using macros so that an input file describing a problem or an instruction can be written in a better or simpler way than with plain Julia syntax. This also means that the user does not need to know how to code in Julia; she only needs to know the syntax of the macro calls.

On the other hand, if you do not want to read what I coded, the question to answer is the following:

How would you implement the following macros?

model = Model()
@var model x = 1.
@var model y = 1. + x
@var model n = 10
@var model v = zeros(n)
@var model v[1] = 1.0
@var model i = 5
@var model v[i] = 234. + x
@var model a = pi + sqrt(2.)
@var model b = 1.0 + x
@fun model f(x) = a * x + b  # where a and b are the variables above and if they change, the function must change as well.
@var model s = f(1.0)
@var model s = f(a)
# etc...

or maybe, the other way around would be:

@input begin
    @var x = 1.
    @var y = 1. + x
    @var n = 10
    @var v = zeros(n)
    @var v[1] = 1.0
    @var i = 5
    @var v[i] = 234. + x
    @var a = pi + sqrt(2.)
    @var b = 1.0 + x
    @fun f(x) = a * x + b  # where a and b are the variables above (but not x), and if they change, the function must change as well.
    @fun h(t) = f(t) + t
    @var s1 = f(1.0)
    @var s2 = h(a)
    # etc...
end

It is quite a challenge!

rvignolo · December 14, 2019, 12:43am

Also, the solution must be fast!

mohamed82008 · December 14, 2019, 1:23am

Use of eval in the macro is bad. If the user does @var x = 1, you can add whatever model book-keeping code you want to the expression output by the macro, followed by x = 1 to run the actual code the user gave you. This will do the same thing as what you seem to be trying to do above but without eval. It’s not clear what you want to do with your DSL though. For instance, you may not need @var on each line. You can allow the user to define multiple variables in a begin .. end block. The first thing is to figure out what you are trying to do though. Also Internals & Design doesn’t seem like the correct category for this question. It seems like a usage question.
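A minimal sketch of that suggestion, assuming a stripped-down, hypothetical Model that only stores the latest value of each variable in a Dict (not the Model/Variable structs from the opening post). The macro emits the user's assignment unchanged, followed by one book-keeping line, and never calls eval:

```julia
# Hypothetical minimal model: records the latest value of each variable.
struct Model
    vars::Dict{Symbol,Any}
end
Model() = Model(Dict{Symbol,Any}())

# `@var model x = 1.0` expands to the user's assignment plus one
# book-keeping line. No `eval` is involved.
macro var(model, ex)
    (ex isa Expr && ex.head == :(=)) ||
        error("expected an assignment, got '$ex'")
    lhs = ex.args[1]
    # For `v[i] = ...` the variable name is the indexed object `v`.
    name = lhs isa Symbol ? lhs : lhs.args[1]
    return esc(quote
        $ex                                          # run the user's code as written
        $(model).vars[$(QuoteNode(name))] = $name    # record the current value
        $name
    end)
end

model = Model()
@var model x = 1.0
@var model y = x + 1.0      # `x` is an ordinary variable in the caller's scope
@var model v = zeros(3)
@var model v[1] = y         # indexed assignment, recorded under `:v`
```

Because the whole expression is escaped, x, y and v are ordinary variables in the calling scope, so later lines can use them directly. Whether the model should store values, expressions, or both depends on what the DSL is meant to do, which this sketch leaves open.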

rvignolo · December 14, 2019, 1:45am

Hi! First of all, thanks for your reply!

Use of eval in the macro is bad.

This is not necessarily true, as many developers in Julia say.

This will do the same thing as what you seem to be trying to do above but without eval.

What about the case where your variables depend on other variables? Or defining a function. Or a vector in specific indexes?

For instance, you may not need @var on each line. You can allow the user to define multiple variables in a begin .. end block.

Yes, I have already coded it as @vars, but it just turns the begin...end block into @var macro calls. So I just want to focus on @var.

The first thing is to figure out what you are trying to do though.

Okay, that is fair. But it seems that explaining the whole problem will take a while. In short, I want to define scalar variables, vectors, matrices, functions, etc., to build expressions that describe the drift and diffusion of stochastic processes to simulate with DifferentialEquations.jl. But I will expand on that later.

Maybe someone will come up with a good idea given all the provided information.

Thank you!

mohamed82008 · December 14, 2019, 2:03am

Well it’s bad most of the time and definitely not required in your example above as far as I can tell.

Well let’s consider an example:

@var x = 1
@var y = [i*x for i in 1:10]

The second line uses x, but x was already defined in the current scope because the expansion of @var x = 1 ran x = 1. So when the second line runs, y = [i*x for i in 1:10] will just work and do the right thing. Depending on what you want to allow the user to write, you can find cases where simply running the code the user gives you doesn’t work, but that’s why you need to first identify what you want the user to be able to write, and what the desired behavior is.

I would advise against starting with metaprogramming. Start with functions and define your functional API. Then when you cannot make your API any prettier with functions, consider macros to do specific tasks. That way you know exactly what the macro is supposed to do.
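As a rough illustration of "functions first", here is a sketch of what a plain functional API could look like before any macro is written. The names Model, setvar! and getvar are hypothetical and chosen only for this example:

```julia
# Hypothetical functional API: everything the DSL does is first
# expressible as ordinary function calls on a model object.
struct Model
    vars::Dict{Symbol,Any}
end
Model() = Model(Dict{Symbol,Any}())

# Store a value under a name and return the value.
setvar!(m::Model, name::Symbol, value) = (m.vars[name] = value)

# Look up a previously stored value (throws KeyError if missing).
getvar(m::Model, name::Symbol) = m.vars[name]

m = Model()
setvar!(m, :x, 1.0)
setvar!(m, :y, getvar(m, :x) + 1.0)
setvar!(m, :v, [i * getvar(m, :x) for i in 1:3])
```

Once calls like these feel too verbose, a macro only has to rewrite `@var m y = x + 1.0` into the corresponding function calls, so its job is narrow and easy to specify.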

oxinabox · December 14, 2019, 2:05am

I don’t think I have ever heard a Julia developer say anything close to “using eval in macros is ok”.
There is a chance I might have heard: “We use a macro in this eval as a hack around …” but even then I am not sure.

With that said, this is not strictly speaking using eval in a macro, in the normal sense.
It is using eval in the expression the macro returns.
Which is kinda less bad.

But with that said also:
the whole expression that the macro returns in the opening post could use some work.
I think without too many changes, it could be written to use neither quot nor eval, and that would be much clearer.

so I am kinda disappointed in JuMP here.
If one looks at the actual source, one can kinda see why it is that way.

github.com

jump-dev/JuMP.jl/blob/04735d2648c5fbcf388b38f7f7d860fb474b772b/src/macros.jl

#  Copyright 2017, Iain Dunning, Joey Huchette, Miles Lubin, and contributors
#  This Source Code Form is subject to the terms of the Mozilla Public
#  License, v. 2.0. If a copy of the MPL was not distributed with this
#  file, You can obtain one at http://mozilla.org/MPL/2.0/.

using Base.Meta

_is_sum(s::Symbol) = (s == :sum) || (s == :∑) || (s == :Σ)
_is_prod(s::Symbol) = (s == :prod) || (s == :∏)

function _error_curly(x)
    Base.error("The curly syntax (sum{},prod{},norm2{}) is no longer supported. Expression: $x.")
end

include("parse_expr.jl")

"""
    _add_kw_args(call, kw_args)

Add the keyword arguments `kw_args` to the function call expression `call`,

but it could definitely do with some extra design thought / refactoring.
and I would not directly draw inspiration from that.

odow · December 14, 2019, 2:09am

We agree that the JuMP macros are sorely in need of a rewrite. They are old and outdated, and have been through many Julia versions without modification. It’s just a question of developer time and priorities.

rvignolo · December 14, 2019, 2:15am

Thank you for your reply!

In this case, wouldn’t I be using global variables to define other global variables? I mean, if the macro returns these expressions, wouldn’t it be the same as writing in the global scope:

x = 1
y = [i*x for i in 1:10]

Or the macro returns an expression which runs in a different scope? I have no idea.

On the other hand, how would you define a function depending on these variables but without using them in the global scope?

I am probably wrong, but I am learning at this point!

Thanks!

rvignolo · December 14, 2019, 2:19am

This is correct.

Maybe, if you have time you could help me a little bit. Thanks in advance!

mohamed82008 · December 14, 2019, 2:29am

Well they will be in global scope if your macro is not called inside a function. If you esc the expression output by the macro, it will be run in the caller’s scope. So:

function f()
    @var x = 1
    @var y = [...]
end

will define x and y as local variables in f (see this for more on esc). This is why macros are nicer than eval: they don’t have to run in global scope, and they don’t hurt type inference, which eval does.

No problems

mohamed82008 · December 14, 2019, 2:32am

If this is a closure inside another function, it will be fast unless you hit some nasty compiler bug. For example:

function f()
    @var x = 1
    function g()
        @var y = x + 1
    end
end

is fine.

rvignolo · December 14, 2019, 2:35am

Okay, perfect!

The macro I believe you are proposing is the following:

macro var(expr)
    return :($(esc(expr)))
end

On the other hand, is there any other way than just wrapping everything in a function call to define a different scope?

Regarding functions, you said that a closure inside another function will be fast, which is what I was going to ask, so great. Just to clarify, is this fast:

function f()
    x = 1
    f = t -> x * t
    # f(t) = x * t
end

? I would expect to get that code from doing:

function setscope()
    @var x = 1
    @fun f(t) = x * t
end

mohamed82008 · December 14, 2019, 2:51am

You can selectively esc parts of your expression to evaluate as-is in the calling scope. Read the hygiene section in the docs for what happens to non-esced names, some are resolved within the macro definition module rather than the calling scope, and some are renamed to avoid conflicts with other variables in the calling scope.

Should be yes.
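A small sketch of the hygiene behaviour described above, using two hypothetical macros that differ only in whether the returned expression is escaped. Without esc, an assigned name is treated as private to the macro and renamed, so the caller's variable is left untouched:

```julia
# Returns the expression unescaped: hygiene renames assigned variables.
macro var_hygienic(ex)
    return ex
end

# Returns the expression escaped: it runs as-is in the caller's scope.
macro var(ex)
    return esc(ex)
end

function with_esc()
    @var x = 1              # defines the local `x` of with_esc
    return x
end

function without_esc()
    x = 0
    @var_hygienic x = 1     # assigns a renamed, macro-private variable
    return x                # the caller's `x` is unchanged
end

with_esc()      # returns 1
without_esc()   # returns 0
```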

mohamed82008 · December 14, 2019, 2:56am

If you don’t want to worry too much about hygiene, just esc the whole thing but make sure that when you use “temporary” variables in the macro’s output expression, that these variables’ names are generated using gensym, e.g. xtemp = gensym(:x), then you can return esc(:($xtemp = 1; x = $(xtemp)^2)), where x here refers to the caller’s x but xtemp is a temporary variable that is not used again.
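The gensym pattern above can be sketched as a complete macro. The macro name @square_into_x is made up for this example; the point is only that the temporary gets a unique generated name while x stays the caller's x:

```julia
macro square_into_x(val)
    xtemp = gensym(:x)          # unique name that cannot clash with the caller's names
    return esc(quote
        $xtemp = $val           # temporary holds the input value
        x = $(xtemp)^2          # `x` here is the caller's variable
    end)
end

function gensym_demo()
    x = 0
    @square_into_x 3
    return x
end

gensym_demo()   # returns 9
```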

rvignolo · December 14, 2019, 3:20am

I just realized that all I wanted to do was to avoid using the global scope, and I can do that by wrapping everything in a function. Also, since closures do not kill performance, I can use them freely. For example:

macro var(expr)
    return :($(esc(expr)))
end

macro fun(expr)
    return :($(esc(expr)))
end

function setscope1()
    x = 1
    f(t) = x * t

    x, f(10)
end

function setscope2()
    @var x = 1
    @fun f(t) = x * t

    x, f(10)
end

function setscope3()
    x = 1
    f(x,t) = x * t

    x, f(x, 10)
end

give:

@btime [setscope1() for i in 1:10000];
  13.287 μs (2 allocations: 156.33 KiB)
@btime [setscope2() for i in 1:10000];
  13.522 μs (2 allocations: 156.33 KiB)
@btime [setscope3() for i in 1:10000];
  13.739 μs (2 allocations: 156.33 KiB)

So there is no point in using those macros, but I am still not sure how to avoid wrapping everything inside a function… I don’t like it. The alternative would be to use const for each variable, but I don’t like that either.

I am aware of hygiene so I believe I can handle this subject if needed.

I am really glad you helped me!
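One option that was not discussed in the thread, offered here only as a sketch: a let block also introduces a new local scope without defining a named function. No performance claim is made for it; the example only shows that the variables and the closure stay local to the block:

```julia
# `x` and `f` are local to the `let` block; only the final value escapes.
result = let
    x = 1
    f(t) = x * t    # local closure over the local `x`
    (x, f(10))
end
```

Here result holds the tuple (1, 10), while x and f are not defined as globals by this block.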

mohamed82008 · December 14, 2019, 3:42am

Good luck!
