There’s been a lot of discussion lately of using _ as a stand-in for the result of the previous evaluation, for example:
a
_ + 1
_ + 2
would get lowered to
gensym1 = a
gensym2 = gensym1 + 1
gensym3 = gensym2 + 2
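A minimal sketch of how a macro could perform this rewrite, assuming a plain begin block. The names `@chain_sketch` and `replace_underscore` are hypothetical and chosen for illustration only; this is not the ChainRecursive.jl implementation.

```julia
# Hypothetical helper: replace every `_` in an expression with `prev`.
# Non-symbol leaves (numbers, strings, line nodes) are returned unchanged.
replace_underscore(ex, prev) = ex
replace_underscore(s::Symbol, prev) = s === :_ ? prev : s
function replace_underscore(ex::Expr, prev)
    Expr(ex.head, map(arg -> replace_underscore(arg, prev), ex.args)...)
end

# Hypothetical macro: bind each statement of a block to a fresh gensym and
# substitute the previous gensym for `_` in the following statement.
macro chain_sketch(block)
    block isa Expr && block.head === :block || error("expected a begin ... end block")
    out = Any[]
    prev = nothing
    for stmt in block.args
        if stmt isa LineNumberNode
            push!(out, stmt)   # keep line information untouched
            continue
        end
        cur = gensym("chain")
        rhs = prev === nothing ? stmt : replace_underscore(stmt, prev)
        push!(out, :($cur = $rhs))
        prev = cur
    end
    esc(Expr(:block, out...))
end

a = 1
result = @chain_sketch begin
    a
    _ + 1
    _ + 2
end
```

The sketch substitutes blindly, including inside nested blocks and quoted expressions, so it inherits the opt-out problems described in this post.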
I’ve got something close to this up and working in ChainRecursive.jl in a surprisingly small amount of code. There are only two remaining issues:
- Base syntax that is parsed as a block but would need a special-case opt-out:
type a
b::Int
c::Int
end
does not mean
type a
gensym1 = b::Int
gensym2 = c::Int
end
and
(args...; kwargs...) -> f(args..., kwargs)
does not mean
(begin
gensym1 = args...
gensym2 = kwargs...
end) -> f(args..., kwargs)
- Macros that use block syntax for things that aren’t true blocks. For example:
MacroTools.@match e begin
a_ + b_ => :($b + $a)
a_ => a
end
does not mean
MacroTools.@match e begin
gensym1 = a_ + b_ => :($b + $a)
gensym2 = a_ => a
end
Part of this can be solved in a macro. The first issue can be solved by special-casing. The second, however, cannot, because of the possibility of user-defined macros. Therefore, I think for this to gain widespread usability, it needs to be implemented in lowering, which I gather occurs after macro expansion. I think this is the same way dot vectorization was implemented. I’d like to put together a PR, but I was hoping that anyone excited about this could either give me some advice to get started or collaborate. I don’t know Lisp, so it would be an uphill climb.
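To illustrate why user-defined macros are a problem for a macro-based approach, here is a small sketch. An outer macro receives any inner macro call as an unexpanded `:macrocall` expression, so it cannot tell whether the inner `begin ... end` is a true block. The names `@statement_heads` and `@some_user_macro` are hypothetical.

```julia
# Hypothetical macro: report the head of each statement in the block it receives.
macro statement_heads(block)
    heads = Symbol[stmt.head for stmt in block.args if stmt isa Expr]
    QuoteNode(heads)
end

# `@some_user_macro` is deliberately left undefined. It is never expanded here,
# because the outer macro sees it only as syntax and then discards it.
heads = @statement_heads begin
    x = 1
    @some_user_macro e begin
        a_ => a
    end
end
```

The second entry of `heads` is `:macrocall`: at this stage the outer macro has no information about how the inner macro will interpret its block argument.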
I was not aware of this discussion, can you link it?
I tend to use _ as a stand-in for “values I don’t need”, e.g.
a, _, c = returns_three_values(...)
The way forward to make that official has been paved by deprecating the use of _ as an r-value in 0.6. It will likely be an official “discard this value” name in 1.0 when used as an l-value. That does not conflict with the proposed use here as an automatic “last value” binding, which would be handy for chaining computations together.
EDIT: To clarify, there has not been that much discussion of _ as a previous value binding; most of the discussion has been about _ as a discarded value binding.
So it’s possible that if _ is used as an l-value it would mean “discard this value”, but if used as an r-value it would mean “last value”?
In fact, these use cases are entirely consistent. For example,
plus_neighbor(i) = i, i + 1
begin
1
_, b = plus_neighbor(_)
b + 1
end
would presumably get lowered to
plus_neighbor(i) = i, i + 1
begin
gensym1 = 1
gensym2 = (gensym1, b) = plus_neighbor(gensym1)
gensym3 = b + 1
end
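As a quick check, the hand-lowered form is already valid Julia today. In this sketch the names `g1`, `g2`, `g3` are hypothetical stand-ins for the generated symbols.

```julia
plus_neighbor(i) = (i, i + 1)

result = begin
    g1 = 1
    # Chained assignment: the destructuring assignment evaluates to the
    # right-hand side tuple, which is then bound to g2.
    g2 = (g1, b) = plus_neighbor(g1)
    g3 = b + 1
end
```

Here `g1` is overwritten by the first element of the tuple, which plays the role of the discarded value, while `g2` holds the whole tuple as the “last value”.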
And in fact this already works in ChainRecursive
using ChainRecursive
@chain begin
1
_, b = plus_neighbor(_)
b + 1
end
As Stefan says, there actually hasn’t been much (if any) discussion of this. The discussion has been for using _ for discarded l-values.
I’m very skeptical of this proposal for _ as an r-value denoting the result of the previous expression. What problem does this solve? Can you give an example of code that would be made significantly clearer by using _ to denote the result of a previous expression?
Hmm. Well, I’ve seen it mentioned a couple places, maybe I was exaggerating a bit. I’m not sure I want to rehash the discussion of the pros/cons of chaining; I was more looking for advice on how to implement one version of it.
Assigning a symbol different meaning depending on context would lead to subtle bugs IMO. Generally, I don’t think that using values of previous expressions via some special syntax is good practice; I recognize that ans is useful in the REPL (in case the computation is expensive and I forgot to assign the result to something, but this almost never happens), but I would prefer explicitly assigning and using values.
However, in case you think this is useful, it would be great to see an example of a language which does something similar. I have to admit that I can’t think of any.
In fact, I tried to show above that the semantics of _ as “discard this value” are a strict subset of the semantics of _ as “last value”. Chaining is wildly popular in R and used extensively in the Hadleyverse.
No it’s not. What are you going to do with
_, a = (1, 2)
b = _
Should _ be the last value, or the discarded value?
What about
c = _, a = (1, 2)
b = _
Discarding a value has the property that you cannot misuse it, since it’ll be an error (currently a warning) if you do. A magic “last value” does not have that property and is much more error prone.
Hmm, I think the behavior is pretty consistent.
_, a = (1, 2)
b = _
would go to
gensym1 = nothing
gensym2 = (gensym1, a) = (1, 2)
gensym3 = b = gensym2
and
c = _, a = (1, 2)
b = _
would go to
gensym1 = nothing
gensym2 = c = (gensym1, a) = (1, 2)
gensym3 = b = gensym2
This is exactly the problem. It’s perfectly valid to assume that it’s
(gensym1, a) = (1, 2)
c = gensym1
instead.
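The two readings can be written out in plain Julia to show that they give different results for `b`. This is only a sketch; `prev`, `d`, and `u` are hypothetical stand-ins for whatever names the lowering would generate.

```julia
# Reading 1 ("last value"): `_` on the right is the value of the previous statement.
prev = (d, a) = (1, 2)
b_last_value = prev      # the whole tuple returned by the previous statement

# Reading 2 ("ordinary variable"): `_` is simply the variable that was just assigned.
(u, a) = (1, 2)
b_variable = u           # only the first element of the tuple
```

Under the first reading `b` is `(1, 2)`; under the second it is `1`.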
Hmm. I think the first behavior seems more intuitive.
The second one (I assume this means the one in my post above) is much more consistent with using it as a normal variable.
Isn’t _ as an rvalue deprecated?
The deprecation does free it up to be used as the result of the previous statement when used as an rvalue, which is what @bramtayl has been arguing for. The case I posted is just for showing that having it both as a discarded value and as the value of the previous statement is in principle non-conflicting but can be quite confusing.
I can see your point, it does seem confusing. I think the potential for a user to use _ in the line directly after an assignment is small. If users are assigning names to something, they’ll likely be using these names instead.
But the point is that allowing _ as both an r-value and an l-value, but one which behaves completely differently from other variables, is wildly confusing and adds a ton of corner cases to the language.
If you are thinking of the %>% operator in R, that is quite different from what you are suggesting, and equivalent or similar functionality is already provided by |> and some packages, in particular see
Like I showed above, only two corner cases in Base, each of which is caused by Julia using block syntax for lists/arguments. In fact, I was kicking around another proposal to add a new “arguments” block syntax/AST node which would allow you to pass arguments to a function in block form. This alternate proposal would allow macros to work correctly and remove the need for chaining behavior in Base.