A question about parsing (Markdown) string literals

Hi everyone,

I’ve tried to generate a Markdown.MD object from a computed string and observed something odd (MWE):

julia> using Markdown

julia> md"foo\
       bar"
  foo
  bar

julia> md"foo\\\nbar"
  foo\\nbar

julia> test = "foo\\\nbar"
"foo\\\nbar"

julia> @md_str "foo\\\nbar"
  foo
  bar

julia> Markdown.parse(test)
  foo
  bar

julia> @md_str test
ERROR: LoadError: MethodError: no method matching parse(::Symbol; flavor::Symbol)

Closest candidates are:
  parse(::AbstractString; flavor)
   @ Markdown C:\Users\mail\.julia\juliaup\julia-1.10.4+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Markdown\src\Markdown.jl:31

Can anybody explain to me why the last call to the macro fails, but calling it on the string literal directly works?
Also, would it make sense to declare Markdown.parse as part of the public API for similar use cases?
I’ve stumbled across this while trying to get a line break when using

julia> using Markdown

julia> test = "foo\\\nbar"
"foo\\\nbar"

julia> md"$test"
"foo\\\nbar"

I understand that in some string literals having recursive parsing would be a bad choice (e.g., regular expressions), but I’m guessing in some cases, it might make sense to enable or disable it via an API. I couldn’t find a standardized way of handling this, though.
E.g., in HypertextLiteral.jl there is a difference between

htl"foobar"

and

@htl "foobar"

with the latter enabling recursive descent by default. Would something similar make sense for certain use cases in Stdlib as well?

Thanks,
Michael

Edit: Should this have been posted to Internals & Design?

Can anybody explain to me why the last call to the macro fails, but calling it on the string literal directly works?

The macro call @md_str test fails because macros don’t take values, they operate at the syntax level. In this case, the macro only “sees” the symbol test, it cannot see that the value of test is "foo\\\nbar".
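To make this concrete, here is a minimal sketch. The macro name `@argtype` is hypothetical, made up for this illustration; all it does is report the type of whatever the parser hands to the macro. For a string that is only known at runtime, the plain function call `Markdown.parse(test)` from the transcript above is the route that works.

```julia
using Markdown

# Hypothetical helper for illustration: report the type of the argument
# exactly as the macro receives it, before any evaluation happens.
macro argtype(x)
    return string(typeof(x))
end

test = "foo\\\nbar"

sym_case = @argtype test      # the macro gets the Symbol :test, not the String
lit_case = @argtype "foo"     # a literal arrives as an actual String

# A function call sees the runtime value, so it works for computed strings.
parsed = Markdown.parse(test)
```

Here `sym_case` is "Symbol" and `lit_case` is "String", which matches the `parse(::Symbol; flavor::Symbol)` in the error message.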

Also, would it make sense to declare Markdown.parse as part of the public API for similar use cases?

I don’t know. I do find it weird that the Markdown docs explain Julia’s bespoke markdown dialect, but they don’t actually document anything about the API of the package. Like… there’s not even a reference to @md_str :face_with_raised_eyebrow:

I didn’t understand that last part tho. Do you have an example? I’m pretty sure HypertextLiteral was written so @htl_str behaved like @htl.

The macro call @md_str test fails because macros don’t take values, they operate at the syntax level. In this case, the macro only “sees” the symbol test, it cannot see that the value of test is "foo\\\nbar".

Thanks, that makes sense.

I don’t know. I do find it weird that the Markdown docs explain Julia’s bespoke markdown dialect, but they don’t actually document anything about the API of the package. Like… there’s not even a reference to @md_str :face_with_raised_eyebrow:

This has been fixed in 1.11 by yours truly :sunglasses: If I can find the time, I’ll try to improve it further still, but no promises at this point.

I didn’t understand that last part tho. Do you have an example? I’m pretty sure HypertextLiteral was written so @htl_str behaved like @htl.

No, and this is by design. @htl allows nesting, i.e., recursive calls (see here). I guess I don’t need fully recursive parsing in Markdown, but it would be nice to control whether special characters are parsed or taken literally, i.e., the difference between

julia> using Markdown

julia> test = "foo\\\nbar"
"foo\\\nbar"

julia> md"$test"
"foo\\\nbar"

and

julia> md"$test"
  foo
  bar

Do you see what I mean?

Ahh, ok. I get it. I thought you were asking a completely different thing. Yeah, there’s a couple of differences between string literals and non-standard string literals.

Non-standard string literals don’t parse escape sequences (with some exceptions)

julia> function f(s)
          return (s, length(s))
      end;

julia> macro m_str(s)
          return (s, length(s))
      end;

julia> f("\n")
("\n", 1)

julia> m"\n" # not escaped
("\\n", 2)

julia> f("\"")
("\"", 1)

julia> m"\"" # double quotes are escaped
("\"", 1)

julia> m"\\ \\"
("\\\\ \\", 4)

In the last example the first \\ is not an escape sequence, but the second one is. Otherwise we wouldn’t have a way to represent a backslash at the end of the string.

Non-standard string literals don’t interpolate. Ever.

HypertextLiteral’s @htl_str appears to interpolate, but this is a hack. What the macro does is manually scan the string, find the $, split the string, try to parse the “interpolations” with Meta.parse, and concatenate the result.

This is also why you cannot write nested strings. If you have m"$(", then that’s it. That’s the whole string! You cannot enable or disable this behavior. That would require modifying the julia parser at runtime.
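The scan-and-parse trick can be sketched roughly as follows. This is not HypertextLiteral's actual implementation; the macro name `@fake_str` is hypothetical, and the sketch ignores edge cases such as a trailing `$` or escaped dollar signs. It only assumes that `Meta.parse(str, start; greedy=false)` returns one parsed expression plus the index where parsing stopped.

```julia
# Hypothetical macro that fakes interpolation inside a non-standard string literal.
macro fake_str(s)
    parts = Any[]
    i = firstindex(s)
    while i <= lastindex(s)
        j = findnext('$', s, i)
        if j === nothing
            push!(parts, s[i:end])           # no more `$`: keep the rest verbatim
            break
        end
        j > i && push!(parts, s[i:prevind(s, j)])   # literal text before the `$`
        # Parse a single Julia expression starting right after the `$`;
        # the second return value is the index at which to resume scanning.
        ex, i = Meta.parse(s, nextind(s, j); greedy=false)
        push!(parts, esc(ex))                # evaluate it in the caller's scope
    end
    # Concatenate literal pieces and evaluated expressions at runtime.
    return Expr(:call, GlobalRef(Base, :string), parts...)
end

x = 3
result = fake"<$(x + 1)>"
plain = fake"no dollars here"
```

The macro still receives one flat string; everything that looks like interpolation is reconstructed by hand after the parser has already decided where the literal ends.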

The thing that made non-standard string literals click for me was seeing that @raw_str is in fact a no-op.
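A small sketch of that idea: the hypothetical macro `@asis_str` below just returns its argument, and for these inputs it gives the same result as `raw"..."`.

```julia
# Hypothetical macro: hands back the string exactly as the parser delivered it.
macro asis_str(s)
    return s
end

same = asis"a\nb $x" == raw"a\nb $x"   # neither `\n` nor `$x` is processed
n = length(raw"\n")                    # backslash and 'n' are two separate characters
```

Any escape handling or interpolation that a string macro seems to perform is therefore work done by the macro itself on this raw text.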


Going back to your example, since @md_str interpolations are fake, we’d need to know what function they’re using to interpolate the values. Maybe it’s intentional that interpolations cannot break lines or maybe it’s a bug.

edit: Thanks for updating the docs!