Hi @StefanKarpinski, I’m glad you like the idea! I would be happy to create an issue for it; I assume you mean on GitHub, right? As @non-Jedi kindly pointed out in his reply, there is already an issue with a similar suggestion (issue 32905), and he posted my suggestion there in a comment. Do you think it would help things move forward if I created a separate issue for this?
No, that’s quite sufficient! Sorry that I missed that.
It’s not specific to transducers; Base map + filter + map would do exactly the same number of calls.
The more I think about this proposal, the better it seems. Here’s my silly attempt at illustrating how the syntax is natural, reiterating points above.
Currently, comprehension syntax is (essentially) of the form [<body> <for> (<for>|<if>)+]; i.e., after an initial for to disambiguate comprehension syntax, any number of fors or ifs may follow. Proceeding left-to-right is like stepping deeper. Illustrating with a contrived pseudo-Julia example:
[body for x in X if P(x) for y in Y, z in Z if Q(y, z)]
# is equivalent to...
for x in X
    if P(x)
        for y in Y, z in Z
            if Q(y, z)
                @yield body
            end
        end
    end
end
Of course, this glosses over details, including how the shape of the resulting array is chosen: a vector, except for the simplest case [body for x in X, y in Y, ...], where the shape is the same as that of the Cartesian product of the iterators.
@HanD’s proposal is simply to allow [<body> <for> (<for>|<if>|<let>)+] so that, for example:
[body for x in X let y = f(x) if P(x, y) for z in Z]
# is equivalent to...
for x in X
    let y = f(x)
        if P(x, y)
            for z in Z
                @yield body
            end
        end
    end
end
The proposed behaviour can be simulated by rewriting let x = X as for x in Ref(X) or for x in (X,), except possibly for the special case mentioned above, where it might be reasonable to expect
julia> [x12 for x1 in 1:2, x2 in 1:3 let x12 = (x1, x2)]
2×3 Array{Tuple{Int64,Int64},2}:
(1, 1) (1, 2) (1, 3)
(2, 1) (2, 2) (2, 3)
instead of
julia> [x12 for x1 in 1:2, x2 in 1:3 for x12 = Ref((x1, x2))]
6-element Array{Tuple{Int64,Int64},1}:
(1, 1)
(2, 1)
(1, 2)
(2, 2)
(1, 3)
(2, 3)
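For anyone who wants to check the shape difference in current Julia, here is a minimal sketch. It uses the one-element tuple variant of the trick rather than Ref; the variable names are only for illustration.

```julia
# Plain product comprehension: keeps the 2×3 shape of the Cartesian product.
matrix = [(x1, x2) for x1 in 1:2, x2 in 1:3]

# Simulating `let x12 = (x1, x2)` with an extra `for` over a one-element tuple.
# The extra `for` makes this a flattened comprehension, so the result is a vector.
flat = [x12 for x1 in 1:2, x2 in 1:3 for x12 in ((x1, x2),)]

println(size(matrix))  # shape of the product
println(size(flat))    # flattened length
```

The elements are the same and come out in column-major order; only the shape is lost.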
As noted in the linked issue, you can still use list comprehension syntax, e.g., just nest it:
[2z for z in (abs(x) for x in -20:20) if z > 10]
Variants of this can filter on f(x) while computing g(x), e.g.,
[g(z.x) for z in ((x = x, y = f(x)) for x in itr) if z.y]
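Both nested forms can be tried as-is. In the sketch below, f, g, and itr are stand-ins I made up (f is a predicate, g computes the value); they are not from the linked issue.

```julia
# First form: compute abs(x) once in the inner generator, filter and use it outside.
doubled = [2z for z in (abs(x) for x in -20:20) if z > 10]

# Second form: carry both x and f(x) along in a named tuple.
f(x) = isodd(x)   # hypothetical predicate to filter on
g(x) = x^2        # hypothetical function to compute
itr = 1:6         # hypothetical input

squares = [g(z.x) for z in ((x = x, y = f(x)) for x in itr) if z.y]

println(doubled)
println(squares)
```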
While I understand that syntactic sugar can be convenient, note that at this moment the list comprehension only has support for a particular combination that is lowered to Base.Iterators.Generator and Base.Iterators.Filter in a specified order.
There are a ton of other things you can do to iterators, and it is unclear why the language should special-case this additional combination and not the others. Instead of extending list comprehensions, I think it is better to explore alternatives which allow building up complex combinations of operators on iterators.
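To make the lowering point concrete, here is a rough sketch of what a filtered comprehension corresponds to, written by hand with Base.Generator and Iterators.filter. This is an illustration of the idea, not the exact lowered code, and the last line just shows that other iterator operations compose in the same style.

```julia
itr = 1:10

# Comprehension with a filter...
comp = [x^2 for x in itr if iseven(x)]

# ...is roughly a Generator wrapped around a Filter, then collected.
manual = collect(Base.Generator(x -> x^2, Iterators.filter(iseven, itr)))

# Other combinations (here: map, then filter, then take) have no comprehension
# syntax, but can be built from the same pieces.
taken = collect(Iterators.take(Iterators.filter(iseven, Base.Generator(x -> x^2, itr)), 2))

println(comp == manual)
println(taken)
```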
@Tamas_Papp has a good point which makes the proposal moot.
A “let statement” in a comprehension is always achievable with a nested generator:
[body for x in X let y = f(x) if P(x, y)]
# is equivalent to...
[body for (x, y) in ((x, f(x)) for x in X) if P(x, y)]
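A quick sanity check of that equivalence, with made-up X, f, P, and body (all hypothetical), comparing the nested generator against the explicit loop the proposed syntax is meant to stand for:

```julia
X = 1:6
f(x) = x % 3            # hypothetical intermediate value
P(x, y) = y != 0        # hypothetical predicate using both x and y
body(x, y) = 10x + y    # hypothetical comprehension body

# Nested-generator formulation, valid in current Julia.
nested = [body(x, y) for (x, y) in ((x, f(x)) for x in X) if P(x, y)]

# Explicit loop corresponding to `for x in X let y = f(x) if P(x, y)`.
looped = Int[]
for x in X
    let y = f(x)
        if P(x, y)
            push!(looped, body(x, y))
        end
    end
end

println(nested)
println(nested == looped)
```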
Elegant!
It would be great to add this and similar advanced examples to the manual though.
The first formulation is more readable though—perhaps it should be added anyway?
While the nested generator does the job, I personally find it a lot more difficult to parse than the suggested syntax. More parentheses, more for keywords, more x’s (which denote multiple variables, btw., even if they have the same value).
There are certainly other ways to express this, but they’re less clear and more contrived, and one of my favorite arguments applies: this is currently a syntax error, and what else would it mean? The only downside to adding syntax is that more of it adds syntactic (but not semantic) complexity to the language, which one does want to limit.
Would the proposed syntax be unclear in this example?
[2abs_x
for x in -20:20
let abs_x = abs(x)
if x > 10]
I could see someone thinking that abs_x would only be assigned if the condition x > 10 holds (and thus an UndefVarError would be thrown when x <= 10 holds). So my question is: is it clear that the if clause limits the values that are included in the comprehension, now that the if is not immediately after the for? Or should this example be rewritten as
[2abs_x
for x in -20:20
if x > 10
let abs_x = abs(x)]
? (Though I’m not convinced this is necessarily any more clear if the first case isn’t clear, and this ordering wouldn’t make sense for the example in the OP.)
It is currently a syntax error, but there are other ways of expanding the syntax that could be more beneficial.
Comprehension syntax has a lot of theory behind it which has been exploited in Haskell. Any expansion of that syntax should be consistent with the lessons from Haskell’s version to ensure that features can be added in the future.
Hello,
I think in Atlas a comprehension syntax is used that simply extends the normal loops, by yielding the last evaluated statement. In Julia this might look like
result = for x in -20:20
    abs_x = abs(x)
    if abs_x > 10
        2abs_x
    end
end
Agreed, this could be nice to have:
But also agree that if your comprehension is so complicated that it requires line-splitting, probably you should start looking for a different tool.
I see. The thing is that this syntax has an implicit, hidden filter for nothing values in it, i.e., for each item where the condition fails, an implicit nothing is returned. You can write the same in Julia using the do block and an explicit filter, though:
result = map(-20:20) do x
    abs_x = abs(x)
    if abs_x > 10
        2abs_x
    end
end |> v -> filter!(!isnothing, v)
This is perfectly valid Julia code, but again, it is less concise than I’d prefer.
Interesting point. For me, at least, it would be clear, as I would parse the terms from left to right, as usual.
If the let appeared before the if, then I’d know the value declared there is already defined in the condition.
If, on the other hand, the if appeared before the let (your second example), I’d know that the expression on the right-hand side of the let clause is only evaluated for those values which satisfy the condition. I’d also know that the declared value is only accessible in the comprehension body.
And the body of the comprehension syntax is the last to be evaluated. So I’d assume that it is evaluated only for those values which satisfy all conditions (in case there are multiple ifs), and can use the values declared by all let clauses.
But if the order of evaluation is unclear to you, perhaps it can be unclear to others as well.
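The left-to-right reading can be tried out today by standing in for let with a for over a one-element tuple, as suggested earlier in the thread. In this sketch, counted_abs and the calls counter are hypothetical helpers that just record how often the right-hand side is evaluated under each ordering.

```julia
calls = Ref(0)
counted_abs(x) = (calls[] += 1; abs(x))  # abs that counts its invocations

# "let before if": the right-hand side runs for every x, then the filter applies.
let_then_if = [2abs_x for x in -20:20 for abs_x in (counted_abs(x),) if x > 10]
calls_let_first = calls[]

# "if before let": the right-hand side runs only for x that pass the filter.
calls[] = 0
if_then_let = [2abs_x for x in -20:20 if x > 10 for abs_x in (counted_abs(x),)]
calls_if_first = calls[]

println((calls_let_first, calls_if_first))
println(let_then_if == if_then_let)
```

Both orderings give the same elements here; they differ only in how much work is done before filtering.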
My opinion is different: unless the comprehension is very simple, you should line-split on the for and a possible if. I would start looking for a different tool if one of those parts were too complicated to fit on a single line.
Yes, you’re absolutely right, so the syntax addition would need to filter statements that do not give a result, just like the current comprehension [ ... for .. if predicate ] filters for the predicate. I think the idea would be that the for ... ; statement end form more naturally supports multi-line code, whereas the [ ... ] comprehension syntax reads better for one-line expressions.
Of course, if you wanted to collect nothings in the proposed syntax, that would then probably not work.
As for the map() do syntax, I personally haven’t used that often enough to immediately see what is going on, but that is my personal problem. With an additional definition mapfilt(f::Function, x) = map(f, x) |> v -> filter!(!isnothing, v), your solution reduces to the somewhat easier to read mapfilt(-20:20) do x, but I see that the return type still includes a union with Nothing.
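Here is a small sketch of that mapfilt helper, showing the element type that comes out. The final identity. broadcast is my own addition: it is one common way to get a vector with a tighter element type, not something proposed above.

```julia
# Helper from the post: map, then drop the `nothing` results in place.
mapfilt(f::Function, x) = map(f, x) |> v -> filter!(!isnothing, v)

result = mapfilt(-20:20) do x
    abs_x = abs(x)
    if abs_x > 10
        2abs_x
    end            # falls through to `nothing` when the condition fails
end

println(eltype(result))    # still a Union with Nothing

# Assumption: re-collecting via broadcast narrows the element type
# to what the remaining values actually are.
narrowed = identity.(result)
println(eltype(narrowed))
```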
How you explained it is how I would understand the proposed syntax as well. I was just trying to come up with a situation that someone might find confusing. I personally think the proposed syntax is quite intuitive.