RFC: _ as ans

bramtayl · February 27, 2017, 1:06am

There’s been a lot of discussion lately of using _ as a stand-in for the result of the previous evaluation. For example:

a
_ + 1
_ + 2

would get lowered to

gensym1 = a
gensym2 = gensym1 + 1
gensym3 = gensym2 + 2

I’ve got something close to this up and working in ChainRecursive.jl in a surprisingly small amount of code. There are only two remaining issues:

1) Base syntax that is parsed as a block but would need a special-case opt-out:

type a
    b::Int
    c::Int
end

does not mean

type a
    gensym1 = b::Int
    gensym2 = c::Int
end

and

(args...; kwargs...) -> f(args..., kwargs)

does not mean

(begin 
    gensym1 = args...
    gensym2 = kwargs...
end) -> f(args..., kwargs)

2) Macros that use block syntax for things that aren’t true blocks. For example:

MacroTools.@match e begin
    a_ + b_ => :($b + $a)
    a_ => a
end

does not mean

MacroTools.@match e begin
    gensym1 = a_ + b_ => :($b + $a)
    gensym2 = a_ => a
end

Part of this can be solved in a macro. Issue 1) can be solved by special-casing. Issue 2), however, cannot, because of the possibility of user-defined macros. Therefore, I think that for this to gain widespread usability, it needs to be implemented in lowering, which I gather occurs after macro expansion. I think this is the same way dot vectorization was implemented. I’d like to implement a PR, but I was hoping that anyone who is excited about this could either give me some advice to get started or collaborate. I don’t know Lisp, so it would be an uphill climb.
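For readers who want to see the macro-level part of the idea, here is a minimal sketch. It is not ChainRecursive.jl’s actual implementation; the names `@last_value` and `replace_underscore` are hypothetical, and it makes no attempt to handle the two issues listed above (it rewrites every `_` it finds, including ones on the left-hand side of an assignment or inside nested blocks).

```julia
# Hypothetical helper: replace every bare `_` inside `ex` with the symbol `prev`.
replace_underscore(ex, prev) = ex                       # literals, line nodes, etc.
replace_underscore(ex::Symbol, prev) = ex === :_ ? prev : ex
replace_underscore(ex::Expr, prev) =
    Expr(ex.head, (replace_underscore(a, prev) for a in ex.args)...)

# Hypothetical macro: bind each statement of a block to a fresh gensym and
# make `_` in the next statement refer to that gensym.
macro last_value(block)
    (block isa Expr && block.head === :block) ||
        error("@last_value expects a begin ... end block")
    out = Expr(:block)
    prev = nothing
    for stmt in block.args
        if stmt isa LineNumberNode
            push!(out.args, stmt)          # keep line info untouched
            continue
        end
        # The first statement has no previous value, so it is left as-is.
        body = prev === nothing ? stmt : replace_underscore(stmt, prev)
        cur = gensym("prev")
        push!(out.args, :($cur = $body))
        prev = cur
    end
    esc(out)
end

a = 1
result = @last_value begin
    a
    _ + 1
    _ + 2
end
```

Because the rewrite happens at macro-expansion time, the block is replaced by a chain of gensym assignments before lowering ever sees a bare `_`.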

Tamas_Papp · February 27, 2017, 10:19am

I was not aware of this discussion, can you link it?

I tend to use _ as a stand-in for “values I don’t need”, e.g.

a, _, c = returns_three_values(...)
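A self-contained version of that pattern, as a small sketch; `returns_three_values` is a hypothetical stand-in defined here only so the snippet runs:

```julia
# Hypothetical function returning three values as a tuple.
returns_three_values() = (1, 2, 3)

# The middle value is assigned to `_`, signalling that it is not needed.
a, _, c = returns_three_values()
```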

StefanKarpinski · February 27, 2017, 5:32pm

The way forward to make that official has been paved by deprecating the use of _ as an r-value in 0.6. It will likely be an official “discard this value” name in 1.0 when used as an l-value. That does not conflict with the proposed use here as an automatic “last value” binding, which would be handy for chaining computations together.

EDIT: To clarify, there has not been that much discussion of _ as a previous value binding, most of the discussion has been about _ as a discarded value binding.

bramtayl · February 27, 2017, 6:09pm

So it’s possible that if _ is used as an l-value it would mean “discard this value”, but if used as an r-value it would mean “last value”?

bramtayl · February 27, 2017, 6:24pm

In fact, these use cases are entirely consistent. For example,

plus_neighbor(i) = i, i + 1

begin 
    1
    _, b = plus_neighbor(_)
    b + 1
end

would presumably get lowered to

plus_neighbor(i) = i, i + 1

begin 
    gensym1 = 1
    gensym2 = (gensym1, b) = plus_neighbor(gensym1)
    gensym3 = b + 1
end

And in fact this already works in ChainRecursive:

using ChainRecursive

@chain begin
    1
    _, b = plus_neighbor(_)
    b + 1
end

stevengj · February 27, 2017, 7:25pm

As Stefan says, there actually hasn’t been much (if any) discussion of this. The discussion has been for using _ for discarded l-values.

I’m very skeptical of this proposal for _ as an r-value denoting the result of the previous expression. What problem does this solve? Can you give an example of code that would be made significantly clearer by using _ to denote the result of a previous expression?

bramtayl · February 27, 2017, 7:34pm

Hmm. Well, I’ve seen it mentioned a couple places, maybe I was exaggerating a bit. I’m not sure I want to rehash the discussion of the pros/cons of chaining; I was more looking for advice on how to implement one version of it.

Tamas_Papp · February 27, 2017, 7:41pm

Assigning a symbol different meaning depending on context would lead to subtle bugs IMO. Generally, I don’t think that using values of previous expressions via some special syntax is good practice; I recognize that ans is useful in the REPL (in case the computation is expensive and I forgot to assign the result to something, but this almost never happens), but I would prefer explicitly assigning and using values.

However, in case you think this is useful, it would be great to see an example of a language which does something similar. I have to admit that I can’t think of any.

bramtayl · February 27, 2017, 7:58pm

In fact, I tried to show above that the semantics of _ as “discard this value” are a strict subset of the semantics of _ as “last value”. Chaining is wildly popular in R and used extensively in the Hadleyverse.

yuyichao · February 27, 2017, 8:07pm

No it’s not. What are you going to do with

_, a = (1, 2)
b = _

Should _ be the last value, or the discarded value?

What about

c = _, a = (1, 2)
b = _

Discarding a value has the property that you cannot misuse it, since it’ll be an error (currently a warning) if you do. A magic “last value” does not have that property and is much more error prone.

bramtayl · February 27, 2017, 9:14pm

Hmm, I think the behavior is pretty consistent.

_, a = (1, 2)
b = _

would go to

gensym1 = nothing
gensym2 = (gensym1, a) = (1, 2)
gensym3 = b = gensym2

c = _, a = (1, 2)
b = _

would go to

gensym1 = nothing
gensym2 = c = (gensym1, a) = (1, 2)
gensym3 = b = gensym2

yuyichao · February 27, 2017, 9:39pm

This is exactly the problem. It’s perfectly valid to assume that it’s

(gensym1, a) = (1, 2)
c = gensym1

instead.

bramtayl · February 27, 2017, 10:00pm

Hmm. I think the first behavior seems more intuitive.

yuyichao · February 27, 2017, 10:26pm

The second one (I assume this means the one in my post above) is much more consistent with using it as a normal variable.

kristoffer.carlsson · February 27, 2017, 11:24pm

Isn’t _ as an rvalue deprecated?

yuyichao · February 27, 2017, 11:55pm

The deprecation does free it up for being used as result of previous statement when used as rvalue which is what @bramtayl has been arguing for. The case I posted is just for showing that having it both as discarded value and value of previous statement are in principle non-conflicting but can be quite confusing.

bramtayl · February 28, 2017, 12:11am

I can see your point, it does seem confusing. I think the potential for a user to use _ in the line directly after an assignment is small. If users are assigning names to something, they’ll likely be using these names instead.

StefanKarpinski · February 28, 2017, 2:03pm

But the point is that allowing _ as both an r-value and an l-value, but one which behaves completely differently from other variables, is wildly confusing and adds a ton of corner cases to the language.

Tamas_Papp · February 28, 2017, 2:15pm

If you are thinking of the %>% operator in R, that is quite different from what you are suggesting, and equivalent or similar functionality is already provided by |> and some packages, in particular see

bramtayl · February 28, 2017, 3:01pm

As I showed above, there are only two corner cases in Base, each of which is caused by Julia using block syntax for lists/arguments. In fact, I was kicking around another proposal to add a new “arguments” block syntax/AST node which would allow you to pass arguments to a function in block form. This alternate proposal would allow macros to work correctly and remove the need for chaining behavior in Base.
