Is this design pattern of `::Val{}` dispatching evil?

I’d like some feedback on a seemingly functional design involving recursion and a whole lot of ::Val{} overloads. I think that my time in the Julia community has conditioned me to shy away from such dispatching and probable compiler inference issues but this sort-of seems to work.

Some quick context: I’m refactoring Latexify.jl in order to both enable features that were previously hard to implement and to make it way more extensible for users and package developers. The previous code used to latexify expressions by essentially recursively latexifying the leaves of the expression tree. Find the bottoms of the expression → turn into strings → go up one level → go again. One key part there was the conversion of the leaves to strings which was done in a function with just a whole bunch of if elseif elseif .... elseifs. Like if + then convert an operation to a string with join(args, " + "), if / then do something else. You get the point - a massive conditional.

Now, for the potential replacement (or an MWE of it):

## Some utils that are handy here.
value(::Val{T}) where T = T
ValUnion(x) = Union{typeof.((Val.(x)))...}


## what to do when the recursion reaches the bottom
func(x) = string(x)
## if expression, keep going
func(ex::Expr) = func(ex.head, ex.args[1], ex.args[2:end])
## if symbols -> create Val{} for overloading.
func(head::Symbol, operation::Symbol, args) = func(Val{head}(), Val{operation}(), args)

### Specify how to latexify each operation - or groups thereof.
## Note that these recurse by calling `func` on `args`
func(::Val{:call}, op::Val{:-}, args) = "$(func(args[1])) - $(func(args[2]))"
func(::Val{:call}, ::Val{:/}, args) = "\\frac{$(func(args[1]))}{$(func(args[2]))}"
func(::Val{:call}, op::ValUnion(:+, :*), args) = join(func.(args), " $(value(op)) ")


### try it out
ex = :(a + b * c / (d + c))
func(ex)
## returns  "a + \\frac{b * c}{d + c}" - yay!

In a massive conditional, it would be hard for anyone but me to add new behaviour, but here one can just do

func(::Val{:call}, ::Val{:textcolor}, args) = "\\textcolor{$(args[1])}{$(func(args[2]))}"

and boom! - you can color parts of your expression

julia> ex = :(a + textcolor(red, b * c) / (d + c))
julia> func(ex)
"a + \\frac{\\textcolor{red}{b * c}}{d + c}"

The ValUnion function allows me to, for example, have a list of all trigonometric functions that should all be treated just about the same and to do that in a single overload.

My real prototype is a bit more involved and has more features, but here you can see the basic design pattern: define a beginning and an end to the recursion. Then allow Val{} overloading for any step in between.

I’ll end up with maybe 100 different func methods (I have to find a reasonable name for that one).

My initial benchmarking seems to say that performance is similar to the massive conditional I used before. I suspect that it’s the string operations that takes the most time, not method look-up, but I have not dug deep yet. It’s currently on the order of 100 microseconds - not a huge deal since latexify should never be a performance bottleneck.

So, the question is pretty much: Is this design pattern fine in my case and is it fine in general for allowing more extensible alternatives to big conditionals? Or, as I seem to have been conditioned to think, is it evil?

3 Likes

Using dispatch for this feels perhaps a bit non-flexible. For example, right now you have had to extract out the :call symbol into its own argument so that one just dispatches on the name of the call. More general would be that one match on an arbitrary expression, however, this doesn’t work going with the proposed method because you can only have one function for each type. You want to do something here on runtime values but you try to lift that into the type domain. That doesn’t seem like a good idea because you either have to restrict the flexibility of the matching functions (like is done here), or you have to keep adding more and more functions to overload. Whenever you have a design where the number of methods scale pretty much without bounds I think that is usually an indication that some other design is preferred.

Here, if all you want to do is to store a callback based on a symbol you could just have a dictionary Dict{Symbol, Function} where users callbacks end up. If you want to allow for the more flexible Expr matching you can just have a Vector{Function} where each function in there returns the string if it matches or nothing if it doesn’t matches. You can of course pepper higher-level API on top of that to make it easier for users:

module Latexify

export latexify, latexify_add_matcher

const MATCHING_FUNCTIONS = Function[]

add_matcher(f) = push!(MATCHING_FUNCTIONS, f)

_check_call_match(e, op::Symbol) = e is a Expr && e.head === :call && e.args[1] === op

# User would write:
#   text_color_latexify(e) = "\\textcolor{$(args[1])}{$(latexify(args[2]))}"
#   add_call_matcher(text_color_latexify, :textcolor)
function add_call_matcher(f::Function op::Symbol)
    push!(MATCHING_FUNCTIONS, e -> _check_call_match(e, op) ? f(e) : nothing
end

function latexify(e::Expr)
    for f in MATCHING_FUNCTIONS
        if f(e) !== nothing
            break
        end
    end
end

end
5 Likes

Expr2Latex.jl (deprecated) was written making extremely heavy use of dispatching on Val{...}.
It even does a fair bit of multiple dispatch on 2 Val’s, like you are talking about.

I would say it is clean and suitable for this purpose.
It’s not super fast. If statements are an order of magnitude or so faster than dynamic dispatch.
But it doesn’t have to be fast.
It’s display code. Just need to hit 1ms or so and humans can’t tell at all.

1 Like

Thanks a lot for the reply - interesting suggestion!

My MWE might have been a touch too minimal since it fails to convey one rather important feature that’s needed. The latexification of one operation might depend on both the operation above in the expression tree and the expressions below. Like does a + b need pathentheses? It depends on what came above in the tree c * (a + b). In my prototype, I pass along the information regarding the current operation when I recurse to the next one and the downstream operation lives inside of args. I then have all the info I need.

Your suggestion does, however, make the user extensibility more sane. Relying on overloading makes for piracy and Latexify.jl’s behaviour might depend on what packages the user have loaded. A variant of your solution might give me better control over such matters.

Also, I’ll need to dispatch/whatever on the expression head too. In my example that would look like

func(::Val{:=}, lhs_val, rhs_array) = "$(func(value(lhs_val))) = $(func(first(rhs_array)))"

All this might still be feasible to get nice with a list/dict to do the lookup from though.

As a motivating example, let’s say you want to latexify all variables (symbols) called eps to \epsilon . There is no clean way to do this using dispatch unless you go ahead and introduce yet another overloading function func_symbol(v::Val{T}) where {T <: Symbol} = string(T) . Letting the functions just take the runtime values you would just add

e -> e === :eps ? "\\epsilon" : nothing

to the list of matching function. And if someone else wants to add

e -> e === :lambda ? "\\lambda" : nothing

then that is totally fine.

Basically, using dispatch, you have to a priori figure out all the possible ways a user would want to extend things and make specific functions for that that can be overloaded.

You just need to have a function that looks like

match_equals(e) = (e isa Expr && e.head === :=) ? 
    "$(latexify(value(e.args[1]))) = $(latexify(first(e.args[2])))" :
    nothing

in your set of matching functions.

3 Likes

Oooh, that’s a nice consequence! I currently do unicode to latex conversion using string match/replace and a massive dict. This would be a lot nicer!

Edit: aahh, I see that this might still not be applicable since it won’t be readily available in the expression tree.

I think I need to make another prototype based on your suggestions @kristoffer.carlsson. I don’t think that I have fully groked all the details/implication here but that should clarify itself when I actually start coding it up! Thanks!

Just to add a little to the discussion, I would not recommend having a global Dict like the one suggested by @kristoffer.carlsson, but instead have a global Dict this way:

const DEFAULT_MATCHER = ... # only used internally

defaults() = copy(DEFAULT_MATCHER) # API way of getting the default
export defaults

# latexify receives a parameter with the config
function latexify(e :: Expr, default = DEFAULT_MATCHER)
    ...
end

So the user can have their own matcher object (obtained by means of calling defaults() and adding their own functions) or even create one from scratch. If the matcher object is more rich in meta-information the user may decide programmatically what to discard or keep from the defaults before adding their own functions. Also, if someone uses more than one dependency that uses Latexify for printing, the customizations of each dependency creates will be local to them, unless the user explicitly request the configuration it is using.

1 Like

Thanks for the input! I agree that there should be some separation between default and user-supplied functionality. I’m not sure that this precise set-up would work since I’d not have a dict but rather a list of functions (allows for matching of multiple properties). Inspecting and modifying this list might be a bit messy. An alternative is to allow the user to supply their own object that just has higher priority than the default.

I’ll have to think on this for a bit, but it could indeed be rather cool to allow a complete separation of the algoritm and the object that maps code to latex. One could then split the package into like LatexAlgs and LatexConfigs, allowing for others to make their own (maybe better, or just more specific to a field) maps and creating a LatexBetterConfigs.jl package from that.

In case this is still relevant, I just wrote a small package called ValSplit.jl that allows users to dispatch on Val-typed arguments without incurring the overhead of dynamic dispatch!

In contrast to using constant global dictionary, it’s both faster and more extensible. (Unlike a global dictionary though, it will prevent users from modifying what function a particular Symbol maps to, if the corresponding method is already defined in the base package.)

2 Likes