Approximating non-standard evaluation like in R, e.g. for plot labels

Sometimes I miss R’s functionality where expressions passed into plotting functions are used as label names. The non-standard evaluation part of it I find pretty confusing actually, i.e. that the expressions are not run before being passed into the functions but after. But being aware of what the user typed to offer them a bit more convenience can be really nice.

I had an idea to approximate this behavior a while ago using a type and a macro:

struct CodeValue{V}
    code::Expr
    value::V
end

macro ~(e)
    qn = QuoteNode(e)
    :(CodeValue($qn, $(esc(e))))
end

So you could dispatch on the CodeValue type for plotting recipes, in Plots for example, and use the eagerly computed value as you normally would, but the expression for creating labels. Maybe there are other use cases besides this one?

It would look like this:

plot(@~ sin(1:10))

And then the plot could have the ylabel sin(1:10) automatically applied, unless overridden by the user.

I’m just on the fence if I find this useful enough or if the additional burden of typing @~ diminishes the usefulness. Maybe you have some ideas or thoughts about this, if you have used R before you’ll know what I mean.

3 Likes

The nonstandard evaluation in R is one of the absolute WORST things about R.

This is exactly it… you don’t know what is just symbol and what is actual code.

Julia’s clear separation with the macro language is far far superior.

I don’t have a strong opinion about your specific proposal, but I have a general strong opinion that we shouldn’t make Julia into something as confusing as R. As far as I’m concerned ggplot(df) + geom_point(aes(x,y)) while it looks innocuous enough, is utter brain poison when it comes to understanding the semantics. I still don’t have a clue what aes actually does and / or how it differs from aes_ and aes_str.

4 Likes

If you’re interested, I wrote a note describing how I think about this problem these days: https://github.com/johnmyleswhite/julia_tutorials/blob/master/Rewriting%20Expressions%20to%20Tuple%20Functions.ipynb

I also have a version of R’s curve as a macro that I’ll publish soon.

1 Like

Here’s how I’d implement something like R’s curve function: https://github.com/johnmyleswhite/julia_tutorials/blob/master/Curve%20Macro.ipynb

This is very much like OP’s approach, except it generates an anonymous function rather than tries to immediately evaluate the expression passed to the macro.

1 Like

I mostly agree with @dlakelan: R’s non-standard evaluation is a very convoluted and brittle construct with very limited payoff.

The original idea was that expressions like cos(x^2) would have a dual role: as a value and at the same time encode the math expression \cos(x^2). The macro solution of @johnmyleswhite can provide this functionality, but given that

  1. not all math expressions map to code,
  2. using short variable names and expressive plot labels is common practice (eg g vs “growth rate (%, year)”)

I think it is just better to provide explicit labels all the time.

3 Likes

R’s non-standard evaluation is a very convoluted and brittle construct with very limited payoff.

I think it is just better to provide explicit labels all the time.

Yeah that was my feeling, too. I haven’t really used R except for a bit of statistics and plotting, as I found the syntax and general flexibility a bit lacking. And I explicitly do not like the non-standard evaluation as it relates to the order of execution. What I did like was the possibility of giving more detailed feedback to the user in case something goes wrong because you have access to their expressions. So if you have a function

function dothis(x)
    # something happening here
end

and it errors because of x == sin(y) * cos(z), you can output sin(y) * cos(z) is an invalid input instead of x is invalid. Which may or may not be useful given the kind of function dothis is.

I would suggest to do the R-bashing in an R-forum, where people who actually know lazy evaluation and other features of R are present. In a sense, R is lisp with a different syntax, and … julia is lisp with yet another syntax, so the differences might not be too large, if you dig a bit.

I am not sure where you get the impression that anyone is bashing R here. @jules asked about implementing a feature in Julia; the fact that it is familiar from R is incidental here.

I think it is the opposite: the deeper you dig, the more apparent the differences become. But, again, that’s a tangent to this discussion.

1 Like

I think that Julia encourages users to look at the stack trace to figure these things out, but I understand that it’s not what you are looking for here.

I agree with your argument about automatic axis labels. I’d point out that one big challenge with grabbing expressions as arguments via macros is that the expressions have already gone through a lossy pass from concrete syntax tree to abstract syntax tree, so they don’t faithfully reproduce the user’s input.

julia> String(Symbol(:(x + y))) == String(Symbol(:(x+y)))
true

I think the value of something like @curve is just that it removes the need to think about making a function. In simple cases this isn’t enough of a gain, but if you’re writing 20 anonymous functions in a row (as you might to define a complex query against a stream of tuples), the value builds up.

2 Likes

and it errors because of x == sin(y) * cos(z), you can output sin(y) * cos(z) is an invalid input instead of x is invalid . Which may or may not be useful given the kind of function dothis is.

FWIW, I think this is the best argument for using macros for defining tests:

julia> @test sin(1.0) * cos(1.0) == 0.0
Test Failed at REPL[4]:1
  Expression: sin(1.0) * cos(1.0) == 0.0
   Evaluated: 0.4546487134128409 == 0.0
ERROR: There was an error during testing
1 Like

Sorry to digress, but in R, say, dplyr you do say df %>% select(x, y). It would be nice, though not necessary to have say df |> select(:x, :y) rather than df |> @select(:x, :y). In R, users of packages don’t need to be exposed to meta-programming concepts like macros or NSE to use a useful data analysis library. Maybe this exists in Julia but excuse my naivety. I am a statistician (used R for over 15 years) who, given a little bit of exploration, thinks Julia has a lot to offer as it solves most of the quirks in some of the languages we use for our work. From my simple exploration of Julia, it’s closer to R than python and introduces a whole lot of possibilities that traditional R programmers and users would be more or less immediately excited about - a lot. Please be kind to some of us who are trying to enter new exciting territories.

NSE is meta-programming, so R is exposing statisticians to it, but in a manner much spookier and more mysterious than Julia?

2 Likes

In R, say dplyr (somewhat of a DSL analogous to SQL or LINQ), you use select as a normal function rather than a macro and the magic (lazy evaluation/capture of arguments/optimizations is hidden from the user and pushed to the developer) is deferred - arguments are promises. The user expresses what they want by providing expressions directly and the developer of the package captures the intent and reacts to it in code. Most R users are not programmers - medical doctors, demographers, sociologists, economists, public officials etc. From where I sit, other than this, Julia has better abstractions for anyone trying to do statistics and/or data science than what is out there (especially the type system and multiple dispatch) - the issue is the scarcity of libraries given the head start the rest of the other languages have, but Julia is closing the gap and given its performance, there are a lot of people waiting at the lobby. Either they are re-educated or else they will try to implement whatever the want based on what they are used to and/or find simple to reason about

Most R users don’t even know what NSE is, which I think is intentional. They just know how to express themselves and the DSLs efficiently handle the mechanics without the user even knowing that it’s used under the hood - They slowly slide into “programmer” mode over time. I am grateful for your responses. I will still keep on pushing on this as despite R’s quirkiness and my profound love for Julia’s design, there is a reason why R/MatLab are still on the map 30+ years later despite their “limitations”.

It is quite honestly far easier to learn what a macro is and then reap the huge benefit of completely sane semantics than to pretend metaprogramming doesn’t exist and then hit the wall where you are trying to figure out how to do something but it’s all mysterious voodoo under the hood. IMHO one of Julia’s enormous benefits is that it does not have nonstandard evaluation

Once you’ve spent an afternoon or two learning about macros by reading a couple articles and watching a YouTube lecture, you won’t go back (unfortunately I don’t offhand have suggestions for those articles and lectures)

2 Likes