Syntax Surprises

I always maintained that that was a mistake. Obvious code is good code. And vice versa.

I wanted to see how JuliaSyntax.jl parses these statements.

julia> using JuliaSyntax: JuliaSyntax, SyntaxNode, GreenNode

julia> JuliaSyntax.parse(SyntaxNode, "@show 1-2+3")
((toplevel (macrocall @show (call-i (call-i 1 - 2) + 3))), JuliaSyntax.Diagnostic[], 12)

julia> JuliaSyntax.parse(SyntaxNode, "@show 1 -2+3")
((toplevel (macrocall @show 1 (call-i -2 + 3))), JuliaSyntax.Diagnostic[], 13)

julia> JuliaSyntax.parse(SyntaxNode, "@show 1 -2 +3")
((toplevel (macrocall @show 1 -2 3)), JuliaSyntax.Diagnostic[], 14)

julia> JuliaSyntax.parse(SyntaxNode, "@show 1- 2+ 3")
((toplevel (macrocall @show (call-i (call-i 1 - 2) + 3))), JuliaSyntax.Diagnostic[], 14)

julia> JuliaSyntax.parse(SyntaxNode, "[<(5) cos]")
((toplevel (hcat (call < 5) cos)), JuliaSyntax.Diagnostic[], 11)

julia> JuliaSyntax.parse(SyntaxNode, "[cos <(5)]")
((toplevel (vect (call-i cos < 5))), JuliaSyntax.Diagnostic[], 11)

julia> JuliaSyntax.parse(SyntaxNode, "2 > 1 ? :Hello : :World")
((toplevel (? (call-i 2 > 1) (quote Hello) (quote World))), JuliaSyntax.Diagnostic[], 24)

julia> JuliaSyntax.parse(SyntaxNode, "if 2 > 1  s=:Hello  else  s=:World  end")
((toplevel (if (call-i 2 > 1) (block (= s (quote Hello))) (block (= s (quote World))))), JuliaSyntax.Diagnostic[], 40)

julia> JuliaSyntax.parse(SyntaxNode, raw"s = if 2 > 1  :Hello  else  :World  end")
((toplevel (= s (if (call-i 2 > (call-i 1 : Hello)) (block) (block (quote World))))), JuliaSyntax.Diagnostic[], 40)

julia> JuliaSyntax.parse(SyntaxNode, "f() = (local a, b = 1, 2; a+b)")
((toplevel (= (call f) (block (local (= (tuple a b) (tuple 1 2))) (call-i a + b)))), JuliaSyntax.Diagnostic[], 31)

julia> JuliaSyntax.parse(SyntaxNode, "f() = (a, b = 1, 2; a+b)")
((toplevel (= (call f) (tuple a (= b 1) 2 (parameters (call-i a + b))))), JuliaSyntax.Diagnostic[], 25)

julia> JuliaSyntax.parse(SyntaxNode, raw"""for x=1:3  print("$x ")  end""")
((toplevel (for (= x (call-i 1 : 3)) (block (call print (string x " "))))), JuliaSyntax.Diagnostic[], 29)

julia> JuliaSyntax.parse(SyntaxNode, raw"""for x=(1,2,3)  print("$x ")  end""")
((toplevel (for (= x (tuple 1 2 3)) (block (call print (string x " "))))), JuliaSyntax.Diagnostic[], 33)

julia> JuliaSyntax.parse(SyntaxNode, raw"""for x=1:3  (print("$x "))  end""")
((toplevel (for (= x (call-i 1 : 3)) (block (call print (string x " "))))), JuliaSyntax.Diagnostic[], 31)

julia> JuliaSyntax.parse(SyntaxNode, raw"""for x=(1,2,3)  (print("$x "))  end""")
((toplevel (for (= x (call (tuple 1 2 3) (error-t) (call print (string x " ")))) (block))), JuliaSyntax.Diagnostic[JuliaSyntax.Diagnostic(14, 15, :error, "whitespace is not allowed here")], 35)

julia> JuliaSyntax.parse(SyntaxNode, raw"(4)(4)")
((toplevel (call-i 4 * 4)), JuliaSyntax.Diagnostic[], 7)

julia> JuliaSyntax.parse(SyntaxNode, raw"(4)(2*2)")
((toplevel (call-i 4 * (call-i 2 * 2))), JuliaSyntax.Diagnostic[], 9)

julia> JuliaSyntax.parse(SyntaxNode, raw"(2*2)(4)")
((toplevel (call (call-i 2 * 2) 4)), JuliaSyntax.Diagnostic[], 9)

julia> JuliaSyntax.parse(SyntaxNode, raw"+(1, 2, 3)")
((toplevel (call + 1 2 3)), JuliaSyntax.Diagnostic[], 11)

julia> JuliaSyntax.parse(SyntaxNode, raw"-(1, 2, 3)")
((toplevel (call - 1 2 3)), JuliaSyntax.Diagnostic[], 11)

julia> JuliaSyntax.parse(SyntaxNode, raw"if 1 > 2  A=[1 2; 3 4]  else A=[4 3; 2 1] end")
((toplevel (if (call-i 1 > 2) (block (= A (vcat (row 1 2) (row 3 4)))) (block (= A (vcat (row 4 3) (row 2 1)))))), JuliaSyntax.Diagnostic[], 46)


julia> JuliaSyntax.parse(SyntaxNode, raw"A = if 1 > 2  [1 2; 3 4]  else [4 3; 2 1] end")
((toplevel (= A (if (call-i 1 > (typed_vcat 2 (error-t) (row 1 2) (row 3 4))) (block) (block (vcat (row 4 3) (row 2 1)))))), JuliaSyntax.Diagnostic[JuliaSyntax.Diagnostic(13, 14, :error, "whitespace is not allowed here")], 46)

I absolutely do find this notation useful and attractive. It’s certainly not about coding speed at all, but about readability. I especially prefer

(2x + 4)/3a

over

(2*x + 4)/(3*a) 

The more complicated the expression, the greater the utility of literal coefficients and unicode identifiers.

Would anyone ever write 4(a) or 4(4), except as an experiment? Allowing or disallowing it wouldn’t bother me, no one would use it anyway.

I think the meaning of 4a is obvious.

From an a to an e is just a few keys on the keyboard. And it isn’t just e:

julia> f = 2
2

julia> 3f-3
3.00000e-03

julia> 3f - 3
3

Another little inconsistency:

julia> e,E,f,F = 3,3,3,3
(3, 3, 3, 3)

julia> 3e-3, 3f-3, 3E-3, 3F-3
(0.003, 0.003f0, 0.003, 6)

'e' is case insensitive, while 'f' is case sensitive.
(Aside: I like 2a == 2*a, but the Float32 literal syntax can change.)
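
A small sketch of the point above, assuming nothing beyond Base Julia. The variable names on the left are made up for illustration; the literals and results mirror the REPL output already shown in this thread.

```julia
# Define variables whose names collide with the exponent markers.
e, E, f, F = 3, 3, 3, 3

lit_e  = 3e-3    # Float64 literal; the variable `e` is not used
lit_E  = 3E-3    # Float64 literal; upper-case E is also an exponent marker
lit_f  = 3f-3    # Float32 literal; the variable `f` is not used
expr_F = 3F-3    # not a literal: parsed as 3*F - 3
expr_f = 3f - 3  # whitespace breaks the literal: parsed as 3*f - 3

println((typeof(lit_e), typeof(lit_E), typeof(lit_f)))
println((expr_F, expr_f))
```

The only difference between `lit_f` and `expr_f` is the whitespace around the minus sign, which is why this one is easy to hit by accident.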

Why not 2(x+2)/3a?

This is kinda cursed ngl

Interestingly, the Discourse syntax highlighting [incorrectly] shows 3F-3 as a numeric literal, but my VSCode syntax highlighting [correctly] doesn’t.

(also, I should pick different colors in VSCode; the color of numeric literals is too close for comfort to variables.)

To be wild here, what if we did away with 2e9 notation instead? It’s not a great syntax. It’s not obvious at all and people often expect it to produce an integer. You could make 2×10^9 a literal syntax instead, which is much more obvious and could produce an integer. If we disallowed 2e9 and required 2.0e9 for the float literal and also disallowed float literal juxtaposition then there wouldn’t be any more syntax ambiguities.
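
For reference, a minimal sketch of how things stand today in Base Julia: `2e9` is always a Float64, and an integer of that size has to be spelled some other way. The variable names are only for illustration.

```julia
sci_literal     = 2e9            # Float64, even though it reads like a whole number
as_product      = 2 * 10^9       # integer arithmetic, result is an Int
with_separators = 2_000_000_000  # integer literal with digit separators

println(typeof(sci_literal))     # Float64
println(typeof(as_product))      # Int64 on a 64-bit machine
println(sci_literal == as_product == with_separators)
```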

Since you are a newcomer here, I should probably point you to this post: PSA: Julia is not at that stage of development anymore

:smile: I mean as a 2.0 thing, of course. But yeah, maybe too disruptive. Of course literal syntax changes are the easiest thing to automatically update. We could also have syntax changes be module-locally opt-in pre 2.0 and opt-out post 2.0.

I like this. I’d opt-in just for the peace of mind.

I have accidentally used it hoping to declare a large integer more often than I have actually wanted a float.

This gives me mild pause, as I wouldn’t be able to write 0.25x or 2.5y, but I think it’s worth it to maintain the ability to write 1.602e-19. Thumbs up from me, for whatever that’s worth.

I am now deceased.

Edit:

I’m not sure how difficult it’d be, but for float literal juxtaposition I wouldn’t be opposed to parentheses-wrapped float literals. Namely, 2.5y might no longer work, but (2.5)y could still be allowed.

Currently this works: 1.602e-19n, but it really feels like it should be wrapped in parentheses, e.g. (1.602e-19)n.
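
To make the forms being compared concrete, here is a short sketch of what current Julia accepts. The values of `n` and `x` are arbitrary; only the spellings matter.

```julia
n = 3
x = 8

bare_float   = 0.25x          # float literal juxtaposed with an identifier
bare_sci     = 1.602e-19n     # exponent literal juxtaposed with an identifier
wrapped_sci  = (1.602e-19)n   # parenthesized coefficient, also accepted today
explicit_sci = 1.602e-19 * n  # explicit multiplication

println(bare_float)
println(bare_sci == wrapped_sci == explicit_sci)
```

All three spellings of the product give the same value, so the parenthesized form would be a drop-in replacement if bare float juxtaposition were ever disallowed.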

This is an impressive array of syntax oddities you’ve got here.

  1. Yes, we have syntactically significant whitespace for macros and array construction / concat
  2. This is unfortunate. The former is allowed because the <(5) is the function call you want and the whitespace rules allow spaces for separators in hcat. The latter is parsed this way because < is considered a binary operator and the () are ignored as a possible function call. This could possibly be fixed. Whitespace sensitive parsing is a bit of a nightmare of special cases.
  3. The : acts as a range operator here, which has higher precedence than the > in the if. Nothing we can do about this I suspect.
  4. This is kind of a trick! The parser accepts the latter case, (a, b = 1, 2; a+b), as a “frankentuple” (see JuliaSyntax.jl/parser.jl at eb9fd054b454b158aa17c6ebe3f9d5fd584faf47 · JuliaLang/JuliaSyntax.jl · GitHub), and it’s actually a lowering error you see here. There are tricky edge cases in how commas and semicolons inside parens are disambiguated to give tuples vs named tuples vs frankentuples vs plain old blocks. The local works because it gobbles up the commas and makes the inside of the parens into a block, not a tuple.
  5. The error here is the same one you get for writing f(x) as (f) (x): the parser considers the trailing ( to be function call syntax, but it’s not allowed with whitespace before the (
  6. Yeah I personally don’t like juxtaposition as multiplication, but for better or worse we have it. I think we could potentially disallow forms with “excess” parentheses but careful testing would be required to check how much breakage it causes. I currently test JuliaSyntax over all of General so we have some tooling for this.
  7. Others have covered this one. It isn’t a matter of syntax
  8. This is basically similar to 5: precedence is not what you expect. The error you get is the same one you get for Int [1 2] vs Int[1 2], i.e. the parser is trying to parse a typed_hcat here.
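
The whitespace sensitivity from point 1 can be reproduced with Base alone. This is a rough sketch: `@m` is a made-up macro name (`Meta.parse` only parses, it never expands it), and `nargs` is a helper defined here for counting arguments, not an existing API.

```julia
# Array literals: a space before a unary minus starts a new column.
two_cols    = [1 -2]    # 1×2 Matrix holding 1 and -2
diff_spaced = [1 - 2]   # 1-element Vector holding the difference
diff_tight  = [1-2]     # same as above

# Macro calls without parentheses split their arguments the same way.
split_args = Meta.parse("@m 1 -2")    # arguments: 1 and -2
single_arg = Meta.parse("@m 1 - 2")   # one argument: 1 - 2

# A :macrocall Expr stores the macro name and a line-number slot first.
nargs(ex::Expr) = length(ex.args) - 2

println(size(two_cols), " ", diff_spaced, " ", diff_tight)
println(nargs(split_args), " vs ", nargs(single_arg))
```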

Meta comments about compatibility and syntax evolution

JuliaSyntax.jl aims to parse the vast majority of Julia syntax compatibly, with exceptions only when

  • We find what seems to be a bug in the reference parser
  • Fixing this bug is not disruptive over a large test suite (essentially the General registry)

So there are various obscure and somewhat unfortunate behaviors in the reference parser which we replicate to avoid breaking things.

In the future I feel we could deprecate or change certain obscure and problematic syntax on some kind of per-project or per-module opt-in basis, eventually leading up to making it the default in Julia 2.0. If it’s opt-in, old code continues working but the tricky part is driving adoption so that there’s some mechanism to roll those changes out and shake out bugs. Hmm. Maybe we could infer syntax variant based on the Julia version range declared in the project file? Then the roll over to Julia 2.0 would just be a special case of a general system for rolling minor syntax updates.

Even if there’s a system of syntax variants, it’s important that all packages which would otherwise work together continue to work together. No splitting the ecosystem :slight_smile:

Speaking of excess parentheses, there are various places where these are allowed, and I have very mixed feelings about that. An extreme example with juxtaposition

julia> (((((((6)))))))(((((((7)))))))
42

And a fun one with keyword argument syntax

julia> f(;kws...) = kws
f (generic function with 1 method)

julia> f(a=1)  # ok sure
pairs(::NamedTuple) with 1 entry:
  :a => 1

julia> f(((((((((a=1))))))))) # now what?
pairs(::NamedTuple) with 1 entry:
  :a => 1

Good point, but for this, 2*(x+2)/3a is just as nice. Outside parentheses it becomes weird, imo.

That’s a bit unfortunate.

Having an easy notation for this is really important, imo, and I do miss a similarly convenient notation for integers. Perhaps 2e9 could be an integer, and 2.0e9 would be the float. Pretty disruptive, though…

2.5*y is much nicer than (2.5)y, so the point of it is gone.

No no, as proposed above we might be required to have “excess” parentheses if we wish to have floating-point juxtaposition, in order to eliminate ambiguities.

We can construct something similar in Python, the flag-bearer for syntactic restraint, so I think the ability to have excess parens isn’t a big deal:

>>> def myfun(x):
...     return x
...
>>> myfun((((((((((((((((1))))))))))))))))
1

What if we take the inverse? 3a/2(x+2) versus 3a/(2*(x+2)) or 3a/2/(x+2)?

That defeats the purpose of the proposal, which would make it so that 2e9 unambiguously scales a variable e9 by a constant integer 2. Currently the mere presence of 2e9 (and 2e+9 and 2e-9) as a numeric literal causes some pretty sketchy ambiguity.

So by the proposal:

2e9 == (2)e9 == 2(e9) == (2)(e9) == 2 * e9
2.0e9 == (2.0)10^9 == 2.0(10^9) == (2.0)(10^9) == 2.0 * 10^9
2×10^9 == (2)10^9 == 2(10^9) == (2)(10^9) == 2 * 10^9

I don’t know that it’s entirely gone though. I’d argue that (1.602e-19)n is clearer to read than either 1.602e-19*n or (1.602e-19)*n. Also, you may (or may not) have positive feelings towards the syntax of constant-scaled denominators as above. I think it’s worth it to eliminate the ambiguity.

Also the rule doesn’t have to extend to macros, so Unitful.jl 9.8u"m/s^2" could stay unchanged.

Yeah, I thought of that. But something/2(x+2) looks ambiguous, unlike something/2x. Even in hand-written math notation I would hesitate to use y/2(x+2), preferring instead y/(2(x+2)) or \frac{y}{2(x+2)}.
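
For anyone unsure how Julia actually reads these, a quick sketch: juxtaposition binds tighter than `/`, so the whole implied product ends up in the denominator. The values of `a`, `x` and `y` are arbitrary, and the `6 // 2(2 + 1)` case is the example given in the Julia manual.

```julia
a, x, y = 2, 3, 12

simple_den   = y/2x         # same as y/(2*x)
paren_den    = y/2(x+2)     # same as y/(2*(x+2)), not (y/2)*(x+2)
inverse_form = 3a/2(x+2)    # same as (3*a)/(2*(x+2))
manual_case  = 6 // 2(2 + 1)

println(simple_den == y/(2*x))
println(paren_den == y/(2*(x+2)))
println(inverse_form == (3*a)/(2*(x+2)))
println(manual_case)
```

An explicit `*` changes the grouping: `y/2*x` is `(y/2)*x`, which is why the two spellings are not interchangeable.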

Yes, but that would be terrible, since 2e9 (or 2.0e9) is extremely useful, even more than the juxtaposition syntax itself.

But 1.602e-19 * n is better than either. The point of the juxtaposition syntax is aesthetic, so it really only makes sense to use it when it improves legibility. The examples provided seem to be specifically designed to diminish legibility.

You are basically trying to come up with the worst ways to abuse notation as an argument to remove it, while most use it for the opposite purpose. I think it would be really sad if we removed something that can be very helpful because it can be used in a perverse way. There are endless opportunities for writing horrible, unreadable code; I don’t see why this (very nice) syntax is particularly problematic.

A simple rule could be:

Implicit multiplication syntax looks like this: [numeric literal][identifier OR function call]

This would only allow code like 4a, 4rand(), 4.0b, 4.0randn(4). No (4)a, no 4(a), no arbitrary expressions after the numeric literal like 4(a + b).
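
To make the rule concrete, here is a rough, purely hypothetical checker. `PROPOSED_RULE` and `fits_proposed_rule` are names invented for this sketch; nothing like this exists in Julia or JuliaSyntax.jl. It is an ASCII-only regex approximation, not a real parser, and it would also accept something like 4e3, which today is a float literal.

```julia
# Hypothetical sketch of the proposed rule:
# [numeric literal][identifier], optionally followed by call parentheses.
const PROPOSED_RULE = r"^\d+(\.\d+)?[A-Za-z_]\w*(\(.*\))?$"

fits_proposed_rule(src::AbstractString) = occursin(PROPOSED_RULE, src)

for src in ["4a", "4rand()", "4.0b", "4.0randn(4)", "(4)a", "4(a)", "4(a + b)"]
    verdict = fits_proposed_rule(src) ? "allowed" : "rejected"
    println(rpad(src, 12), verdict)
end
```

The first four inputs come out as allowed and the last three as rejected, matching the examples listed above.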

Surprised no one has suggested something Unicode yet. So here goes:

julia> ⏨(i) = i < 0 ? 10.0^i : 10^i
⏨ (generic function with 1 method)

julia> 2⏨(5)
200000

For some reason the code snippet font displays ⏨ as ⏨ in superscript. The keyboard input description could be \tento.

This isn’t a serious suggestion, but does reduce ambiguity.

Now, this is what I call non-obvious syntax.

julia> x=3
3

julia> a=2
2

julia> (2x + 4)/3a
1.66667e+00

julia> (2x + 4)/3*a
6.66667e+00