Proposal to extend the syntax of list comprehensions with a `let` clause

HanD · September 14, 2022, 5:21pm

Hi @StefanKarpinski, I’m glad you like the idea! I would be happy to create an issue for it; I assume you mean on GitHub, right? As @non-Jedi kindly pointed out in his reply, there already exists an issue with a similar suggestion (issue 32905), and he posted my suggestion there in a comment. Do you think it would help things move forward if I created a separate issue for this?

StefanKarpinski · September 14, 2022, 5:47pm

No, that’s quite sufficient! Sorry that I missed that.

aplavin · September 14, 2022, 9:22pm

It’s not specific to transducers; Base map + filter + map would make exactly the same number of calls.

Jollywatt · September 15, 2022, 4:33am

The more I think about this proposal, the better it seems. Here’s my silly attempt at illustrating how the syntax is natural, reiterating points above.

Currently, comprehension syntax is (essentially) of the form [<body> <for> (<for>|<if>)*]; i.e., after an initial for to disambiguate comprehension syntax, any number of fors or ifs may follow. Reading left to right corresponds to stepping one nesting level deeper. Illustrating with a contrived pseudo-Julia example:

[body for x in X if P(x) for y in Y, z in Z if Q(y, z)]
# is equivalent to...
for x in X
  if P(x)
    for y in Y, z in Z
      if Q(y, z)
        @yield body
      end
    end
  end
end

Of course, this glosses over details, including how the shape of the resulting array is chosen: the result is a vector except in the simplest case [body for x in X, y in Y, ...], where the shape is that of the Cartesian product of the iterators.
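To make the equivalence concrete, here is a minimal runnable sketch. The collections X, Y, Z and the predicates P and Q are made-up placeholders, and push! into a vector stands in for the pseudo-macro @yield. One detail the pseudo-code glosses over: inside a comprehension, `for y in Y, z in Z` iterates a product in which y varies fastest, so the loop nest below lists z first to reproduce the same element order.

```julia
# Placeholder data and predicates, chosen only for illustration.
X, Y, Z = 1:4, 1:2, 1:3
P(x) = isodd(x)
Q(y, z) = y < z

# Comprehension form: for / if / for / if, read left to right.
a = [(x, y, z) for x in X if P(x) for y in Y, z in Z if Q(y, z)]

# Hand-written loop nest; push! plays the role of @yield.
# z is the outer loop variable here so that y varies fastest,
# matching the product order used by the comprehension.
b = Tuple{Int,Int,Int}[]
for x in X
    if P(x)
        for z in Z, y in Y
            if Q(y, z)
                push!(b, (x, y, z))
            end
        end
    end
end

a == b  # same elements in the same order
```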

@HanD’s proposal is simply to allow [<body> <for> (<for>|<if>|<let>)*] so that, for example:

[body for x in X let y = f(x) if P(x, y) for z in Z]
# is equivalent to...
for x in X
  let y = f(x)
    if P(x, y)
      for z in Z
        @yield body
      end
    end
  end
end

The proposed behaviour can be simulated by rewriting let x = X as for x in Ref(X) or for x in (X,), except possibly for the special case mentioned above, where it might be reasonable to expect

julia> [x12 for x1 in 1:2, x2 in 1:3 let x12 = (x1, x2)]
2×3 Array{Tuple{Int64,Int64},2}:
 (1, 1)  (1, 2)  (1, 3)
 (2, 1)  (2, 2)  (2, 3)

instead of

julia> [x12 for x1 in 1:2, x2 in 1:3 for x12 = Ref((x1, x2))]
6-element Array{Tuple{Int64,Int64},1}:
 (1, 1)
 (2, 1)
 (1, 2)
 (2, 2)
 (1, 3)
 (2, 3)
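The simulation can be tried in current Julia. The following is only a sketch: f and P are hypothetical helpers, and the one-element tuple `(f(x),)` stands in for the proposed `let y = f(x)`. It also shows the shape caveat: as soon as a second for clause appears, the result is flattened into a vector.

```julia
f(x) = x^2          # hypothetical helper
P(x, y) = y > 3     # hypothetical predicate
X = 1:4

# Stand-in for the proposed `[(x, y) for x in X let y = f(x) if P(x, y)]`:
# iterate over a one-element tuple to bind y.
v = [(x, y) for x in X for y in (f(x),) if P(x, y)]

# A single multi-iterator `for` keeps the product shape ...
m = [(x1, x2) for x1 in 1:2, x2 in 1:3]

# ... but adding a second `for` clause flattens the result.
w = [x12 for x1 in 1:2, x2 in 1:3 for x12 in ((x1, x2),)]

size(m), size(w)
```

A real let clause could, in principle, preserve the 2×3 shape, which is the one place where the simulation and the proposal might differ.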

Tamas_Papp · September 15, 2022, 8:22am

As noted in the linked issue, you can still use list comprehension syntax, e.g., just nest it:

[2z for z in (abs(x) for x in -20:20) if z > 10]

Variants of this can filter on f(x) while computing g(x), e.g.,

[g(z.x) for z in ((x = x, y = f(x)) for x in itr) if z.y]

While I understand that syntactic sugar can be convenient, note that at this moment the list comprehension only has support for a particular combination that is lowered to Base.Iterators.Generator and Base.Iterators.Filter in a specified order.
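The lowering can be inspected directly. This is a sketch that peeks at the fields `iter` and `itr`, which are implementation details of Base and not a documented API, so they may change between Julia versions.

```julia
gen = (2z for z in (abs(x) for x in -20:20) if z > 10)

# The generator expression is a Generator wrapping a Filter,
# which in turn wraps the nested generator.
is_generator = gen isa Base.Generator
has_filter = gen.iter isa Base.Iterators.Filter
has_inner_generator = gen.iter.itr isa Base.Generator

# The same pipeline built by hand from the iterator types.
manual = Base.Generator(z -> 2z,
    Iterators.filter(z -> z > 10, Base.Generator(abs, -20:20)))

collect(manual) == collect(gen)
```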

There are a ton of other things you can do to iterators, and it is unclear why the language should special-case this additional combination and not the others. Instead of extending list comprehensions, I think it is better to explore alternatives which allow building up complex combinations of operators on iterators.

Jollywatt · September 15, 2022, 9:03am

@Tamas_Papp has a good point which makes the proposal moot.

A “let statement” in a comprehension is always achievable with a nested generator:

[body for x in X let y = f(x) if P(x, y)]
# is equivalent to...
[body for (x, y) in ((x, f(x)) for x in X) if P(x, y)]

Elegant!
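A runnable sketch of the nested-generator form, with a hypothetical f and P and a call counter added only to confirm that f is evaluated once per element:

```julia
calls = Ref(0)                           # counts evaluations of f
f(x) = (calls[] += 1; x^2 - 10)          # hypothetical helper
P(x, y) = y > 0 && iseven(x)             # hypothetical predicate
X = 1:8

# Nested generator: the inner generator pairs each x with y = f(x),
# the outer comprehension filters and builds the body from both.
nested = [x + y for (x, y) in ((x, f(x)) for x in X) if P(x, y)]
calls_nested = calls[]

# Explicit loop with the same meaning, for comparison.
looped = Int[]
for x in X
    y = f(x)
    if P(x, y)
        push!(looped, x + y)
    end
end

nested == looped
```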

Tamas_Papp · September 15, 2022, 9:45am

It would be great to add this and similar advanced examples to the manual though.

adienes · September 15, 2022, 12:28pm

The first formulation is more readable though—perhaps it should be added anyway?

HanD · September 15, 2022, 12:37pm

While the nested generator does the job, I personally find it a lot more difficult to parse than the suggested syntax. More parentheses, more for keywords, and more x’s (which, by the way, denote different variables even though they hold the same value).

StefanKarpinski · September 16, 2022, 1:23pm

There are certainly other ways to express this, but they’re less clear and more contrived, and one of my favorite arguments applies: this is currently a syntax error, and what else would it mean? The only downside to adding syntax is that more of it adds syntactic (but not semantic) complexity to the language, which one does want to limit.

StevenWhitaker · September 16, 2022, 5:27pm

Would the proposed syntax be unclear in this example?

[2abs_x
 for x in -20:20
 let abs_x = abs(x)
 if x > 10]

I could see someone thinking that abs_x would only be assigned if the condition x > 10 holds (and thus an UndefVarError would be thrown when x <= 10 holds). So my question is: is it clear that the if clause limits the values that are included in the comprehension, now that the if is not immediately after the for? Or should this example be rewritten as

[2abs_x
 for x in -20:20
 if x > 10
 let abs_x = abs(x)]

? (Though I’m not convinced this is necessarily any more clear if the first case isn’t clear, and this ordering wouldn’t make sense for the example in the OP.)

jar1 · September 16, 2022, 6:14pm

It is currently a syntax error, but there are other ways of expanding the syntax that could be more beneficial.

Comprehension syntax has a lot of theory behind it which has been exploited in Haskell. Any expansion of that syntax should be consistent with the lessons from Haskell’s version to ensure that features can be added in the future.

Generalized list comprehensions

Monad comprehensions

davidavdav · September 18, 2022, 8:18am

Hello,

I think in Atlas a comprehension syntax is used that simply extends the normal loops, by yielding the last evaluated statement. In Julia this might look like

result = for x in -20:20
  abs_x = abs(x)
  if abs_x > 10
    2abs_x
  end
end
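In current Julia a for loop evaluates to nothing, so the snippet above is hypothetical syntax. A minimal sketch of what it would compute, written with an explicit accumulator:

```julia
# Explicit accumulator standing in for the hypothetical
# "loop yields its last evaluated statement" behaviour.
result = Int[]
for x in -20:20
    abs_x = abs(x)
    if abs_x > 10
        push!(result, 2abs_x)
    end
end

length(result)
```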

tbeason · September 18, 2022, 1:06pm

Agreed, this could be nice to have.

But I also agree that if your comprehension is so complicated that it requires line-splitting, you should probably start looking for a different tool.

HanD · September 19, 2022, 6:40am

I see. The thing is that this syntax has an implicit, hidden filter for nothing values in it, i.e., for each item where the condition fails, an implicit nothing is returned. You can write the same in Julia using the do block and an explicit filter, though:

result = map(-20:20) do x
  abs_x = abs(x)
  if abs_x > 10
    2abs_x
  end
end |> v -> filter!(!isnothing, v)

This is perfectly valid Julia code, but again, it is less concise than I’d prefer.

HanD · September 19, 2022, 6:50am

Interesting point. For me, at least, it would be clear, as I would parse the terms from left to right, as usual.

If the let appeared before the if, then I’d know the value declared there is already defined in the condition.

If, on the other hand, the if appeared before the let (your second example), I’d know that the expression on the right-hand side of the let clause is only evaluated for those values which satisfy the condition. I’d also know that the declared value is only accessible in the comprehension body.

And the body of the comprehension is the last to be evaluated. So I’d assume that it is evaluated only for those values which satisfy all conditions (in case there are multiple if’s), and that it can use the values declared by all let clauses.
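This left-to-right reading can be checked today by simulating each let with a one-element tuple. The sketch below is an approximation of the proposed semantics, not the proposal itself; counted_abs is a hypothetical wrapper that counts how often the right-hand side is evaluated.

```julia
calls = Ref(0)
counted_abs(x) = (calls[] += 1; abs(x))   # hypothetical counting wrapper

# "let" before "if": the binding is computed for every x.
calls[] = 0
a = [2abs_x for x in -20:20 for abs_x in (counted_abs(x),) if x > 10]
calls_let_first = calls[]

# "if" before "let": the binding is computed only for x that pass the filter.
calls[] = 0
b = [2abs_x for x in -20:20 if x > 10 for abs_x in (counted_abs(x),)]
calls_if_first = calls[]

(a == b, calls_let_first, calls_if_first)
```

Both orderings produce the same array; only the number of evaluations of the bound expression differs.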

But if the order of evaluation is unclear to you, perhaps it can be unclear to others as well.

GunnarFarneback · September 19, 2022, 7:59am

My opinion is different: unless the comprehension is very simple, you should line-split on the for and a possible if. I would start looking for a different tool if one of those parts were too complicated to fit on a single line.

davidavdav · September 19, 2022, 9:30am

Yes, you’re absolutely right, so the syntax addition would need to filter out iterations that do not give a result, just like the current comprehension [ ... for .. if predicate ] filters on the predicate. I think the idea would be that the for ... ; statement end form more naturally supports multi-line code, whereas the [ ... ] comprehension syntax reads better for one-line expressions.

Of course, if you wanted to collect nothings in the proposed syntax, that would probably not work.

As for the map() do syntax, I personally haven’t used that often enough to immediately see what is going on, but that is my personal problem. With an additional definition mapfilt(f::Function, x) = map(f, x) |> v -> filter!(!isnothing, v) your solution reduces to the somewhat easier to read mapfilt(-20:20) do x, but I see that the return type still includes a union with Nothing.
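For reference, a runnable sketch of the mapfilt helper defined above, together with one possible way to narrow the element type afterwards. The final map(identity, ...) step is an addition for illustration and not part of the original suggestion.

```julia
mapfilt(f::Function, x) = map(f, x) |> v -> filter!(!isnothing, v)

result = mapfilt(-20:20) do x
    abs_x = abs(x)
    if abs_x > 10
        2abs_x          # otherwise the block evaluates to nothing
    end
end

# The element type is still a Union that includes Nothing ...
still_union = Nothing <: eltype(result)

# ... re-collecting the values lets Julia pick a narrower element type.
narrowed = map(identity, result)
eltype(narrowed)
```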

StevenWhitaker · September 19, 2022, 6:14pm

How you explained it is how I would understand the proposed syntax as well. I was just trying to come up with a situation that someone might find confusing. I personally think the proposed syntax is quite intuitive.