A tiny DSL for network filtering using generated functions


#1

I know I’m not the only one here who struggles to understand generated functions. I spent an entire day implementing this simple DSL for filtering network connections and finally got it working. I’m sharing it here partly to solicit suggestions for making my code more elegant, and partly to add to the library of examples that demonstrate the use of generated functions.

Network connections in my model are described by a simple type called Link.

immutable Link
    from::Symbol
    to::Symbol
    cargo::Symbol
end

I also have a constant called links of type Vector{Link} that contains all allowed connections in my model. The idea for making a DSL came up when I wrote a line like this for the eleventy-first time:

[whatever(L) for L in links if L.from == :a && L.to in [:b,:c] && L.cargo != :d]

I figured, wouldn’t it be cool if I could write that line like this instead?

[whatever(L) for L in link(:a, [:b,:c], !:d)]

I also wanted to be able to use * as a wildcard to allow anything in that field. Here are some more examples that describe what I wanted to do:

# dummy data to play with
const links = [ Link(:a,:b,:x); Link(:a,:c,:x); Link(:a,:d,:x);
                Link(:a,:b,:y); Link(:a,:c,:y); Link(:b,:c,:x);
                Link(:b,:c,:y); Link(:b,:d,:y); Link(:c,:d,:x);
                Link(:d,:a,:y)]

# Usage
julia> link(:a, :c, :y)
1-element Array{Link,1}:
 Link(:a, :c, :y)

julia> link(:z, :z, :z)
0-element Array{Link,1}

julia> link(:a, [:b,:c], :x)
2-element Array{Link,1}:
 Link(:a, :b, :x)
 Link(:a, :c, :x)

julia> link(!:a, [:b,:c], :x)
1-element Array{Link,1}:
 Link(:b, :c, :x)

julia> link(*, ![:b,:c], *)
4-element Array{Link,1}:
 Link(:a, :d, :x)
 Link(:b, :d, :y)
 Link(:c, :d, :x)
 Link(:d, :a, :y)

I would be using the link() filter in inner loops. That means it would need to be type stable, despite the mixed arguments of Symbols, Vectors, Functions and weird negations. Here is the complete implementation I came up with.

# Make wrappers for negated symbols and vectors.
# (must indicate negation in the type since generated
# functions can't inspect argument values)
struct NegSymbol
    s::Symbol
end

struct NegVector
    v::Vector{Symbol}
end

# Extend Base.! to allow negating a Symbol or Vector
Base.:!(x::Symbol) = NegSymbol(x)
Base.:!(x::Vector{Symbol}) = NegVector(x)

# Build an array comprehension with conditions defined by
# generated functions. It expands to something like this:
# [L for L in links if L.from == x[1] && L.to != x[2] && L.cargo in x[3]]
link(x::Vararg{Any,3}) = Link[L for L in links if conditions(L,x...)]

# It would have been nice to put the entire array comprehension
# inside the generated functions, but due to a bug/limitation in v0.6
# comprehensions and closures don't work in generated functions.

# build the expressions containing the conditions
# (the if-clause of the comprehension)
@generated function conditions(L::Link, x::Vararg{Any,3})
    conditions = :()
    firstcondition = true
    for (i, arg) in enumerate(x)
        # a wildcard means that this condition can be skipped
        arg <: Val{*} && continue
        if arg <: Symbol
            cond = :(getfield(L,$i) == x[$i])
        elseif arg <: NegSymbol
            cond = :(getfield(L,$i) != x[$i].s)
        elseif arg <: Vector
            cond = :(getfield(L,$i) in x[$i])
        elseif arg <: NegVector
            cond = :(!in(getfield(L,$i), x[$i].v))
        else
            error("Arguments must be Symbols, Vectors of Symbols or wildcards (asterisks).")
        end
        conditions = firstcondition ? :($cond) : :($conditions && $cond)
        firstcondition = false
    end
    firstcondition ? :(true) : :($conditions)
end

# these are also needed to get dispatch on Val{*}
link(f::Function, x, y) = link(Val{f}(), x, y)
link(x, f::Function, y) = link(x, Val{f}(), y)
link(x, y, f::Function) = link(x, y, Val{f}())
link(f1::Function, f2::Function, x) = link(Val{f1}(), Val{f2}(), x)
link(f1::Function, x, f2::Function) = link(Val{f1}(), x, Val{f2}())
link(x, f1::Function, f2::Function) = link(x, Val{f1}(), Val{f2}())
link(f1::Function, f2::Function, f3::Function) = link(Val{f1}(), Val{f2}(), Val{f3}())

So, any suggestions for how this code can be simplified or otherwise improved? Is there a way of doing this completely without generated functions or macros?

I guess I could have used generated functions to do the Val{*} dispatch as well, but in the end it was easier to just write out those 7 lines even if they are fugly.


#2

My understanding is that you are basically defining a predicate using a DSL.

Iterators.filter is already lazy, so if you used that, you would get non-copying performance (not that I think it matters for small vectors, but I don’t know your actual use case). So the only question is whether you need a DSL for constructing predicates. Personally, I find

L -> L.from == :a && L.to ∈ [:b,:c] && L.cargo ≠ :d

readable enough, so that I would not use a DSL for this, especially one that involves “type piracy” (defining methods for function/type combinations I did not define).

Finally, I am not sure you need generated functions, closures should do. Or did you find a performance difference between the two?