RFC: Non-standard string literals should be quotable

While working on tree-sitter-julia, I’ve found some inconsistencies in the way Julia treats non-standard string literals versus similar syntactic forms of the language.

In Julia, you can quote identifiers, literals and even macro calls:

julia> :symbol
:symbol

julia> :"my string"
"my string"

julia> :@mac a b
:(#= REPL[3]:1 =# @mac a b)

julia> :@mac(a, b)
:(#= REPL[4]:1 =# @mac a b)

However, you cannot quote non-standard string literals:

julia> :r"foo+"
ERROR: syntax: extra token """ after end of expression
Stacktrace:
 [1] top-level scope
   @ none:1
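To check which of these forms a given Julia version accepts without typing each one into the REPL, something like the following sketch may help. The helper name `parses` is hypothetical, not part of Julia; it relies only on `Meta.parse` with `raise=false`, which returns an `:error` or `:incomplete` expression instead of throwing. The result for the last input depends on the parser in use, so no output is claimed here.

```julia
# Hypothetical helper: true when `src` parses as one complete expression.
function parses(src::AbstractString)
    ex = Meta.parse(src; raise=false)
    # A failed parse comes back as Expr(:error, ...) or Expr(:incomplete, ...).
    return !Meta.isexpr(ex, (:error, :incomplete))
end

# Compare a quoted symbol, the parenthesized form, and the form discussed here.
for src in [":symbol", ":(r\"foo+\"i)", ":r\"foo+\""]
    println(rpad(src, 16), " => ", parses(src))
end
```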

There doesn’t seem to be any practical reason to forbid this form, and it’s simpler for declarative parsers (like tree-sitter and lezer) to treat it as valid. I want to bring up this issue because I found a couple of examples that suggest it should be allowed.

  1. There’s only one scenario where allowing this quoting would be a breaking change. If you pass a “quoted non-standard string literal” as a macro call argument, you get:

    julia> :@mac 1 :r"foo+"
    :(#= REPL[5]:1 =# @mac 1 :r "foo+")
    

    I.e. it’s parsed as two separate arguments, a symbol and a string. Strangely, this doesn’t happen in matrices:

    julia> [1 :r"foo+"]
    ERROR: syntax: expected "]" or separator in arguments to "[ ]"; got ":r""
    Stacktrace:
     [1] top-level scope
       @ none:1
    

    I always assumed that macro whitespace and array whitespace work the same way, but I could be wrong. If they do work the same, then this indicates one of them is wrong.

  2. The parser built-in var can be quoted:

    julia> :var"."
    :.
    
    julia> :var"#"
    Symbol("#")
    

    This probably works because var gets lowered before Julia realizes it shouldn’t allow it (again, there is no reason to disallow it). If the goal of var is to fake being a non-standard string literal, then it should behave like other non-standard string literals.

The expected behavior should be to parse the quotation as if there were parentheses surrounding the string:

(future julia)> :r"foo+"i == :(r"foo+"i)
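For reference, here is a small sketch of what the parenthesized form parses to today, which is the tree the proposal would also assign to the unparenthesized form. It uses only `Meta.parse`; the variable names `ex` and `inner` are just for illustration.

```julia
# Parse the parenthesized form, which the parser already accepts.
ex = Meta.parse(":(r\"foo+\"i)")

# The outer node is a quote wrapping a call to the @r_str macro.
inner = ex.args[1]
@assert ex.head == :quote
@assert inner.head == :macrocall
@assert inner.args[1] == Symbol("@r_str")

# args[2] is a LineNumberNode; the pattern and the flags follow it.
@assert inner.args[3] == "foo+"
@assert inner.args[4] == "i"
```

Under the proposal, `:r"foo+"i` would yield this same quoted macro call rather than a syntax error.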

These are edge cases of the flisp parser implementation that I haven’t seen mentioned in the JuliaSyntax.jl Design Discussion. It’d be better to fix this, instead of letting it become another item on the list.

Although this is technically a breaking change in the macro-argument scenario described above, it is unlikely to be disruptive in practice, and it would make the language more consistent.

I’m opening this thread hoping to discuss this proposal. If this change were to be accepted, I’d be interested in implementing it myself in JuliaSyntax.jl, and I’d also need some help to implement it in the flisp parser.
