Non-standard string literals create global-like objects!?

I am very confused by the way non-standard string literals work when called inside of other functions. See this example with regex (but it happens with any non-standard string literal):

r"a" === r"a" # false
Regex("a") === Regex("a") # false
r"a" == Regex("a") # true, because it is ==, not ===

f() = r"a" 
g() = Regex("a")

f() === f() # true !?
g() === g() # false

Why does it matter whether I type r"a" or Regex("a")? What explains why r"a" === r"a" is false while f() === f() is true? I would have expected === to return false for all of these examples.

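In the former case (the string macro), the regex is compiled once, when the macro is expanded, and the resulting object is embedded in the code of f, so every call to f returns that same object. In the latter case, the regex is constructed at runtime on every call. Two separate r"a" literals are two separate expansions, hence two distinct objects.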
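A minimal sketch of how one might check this, reusing f and g from the question. The variable name expanded is only for illustration, and the first check relies on the r"..." macro returning a ready-made Regex rather than a call expression:

```julia
# Expand the literal without evaluating it: the result is already a Regex
# object, not an expression that calls the Regex constructor.
expanded = @macroexpand r"a"
expanded isa Regex

f() = r"a"          # one expansion; the same object is embedded in f's body
g() = Regex("a")    # the constructor call runs on every invocation

f() === f()         # true: both calls return the embedded object
g() === g()         # false: each call allocates a new Regex
r"a" === r"a"       # false: two literals, two expansions, two objects
```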


I was using the regex just as a convenient built-in example. I see how this behavior is valuable for regex and similar situations, but it is causing problems for my use case. Can one make a non-standard string literal that does not perform such caching? After all, these are just macros, and I expected them to behave more like:

macro H() :(Regex("a")) end
h() = @H
h() === h() # false

I imagine plenty of people think of non-standard string literals as a convenient way to make a small DSL, a way to convert something represented as a string into a more convenient data structure, and not as a tool for caching. Is that simply the wrong way to view them? Even the documentation seems to describe something closer to my use case than to caching: Strings · The Julia Language

There are situations when you want to construct a string or use string semantics, but the behavior of the standard string construct is not quite what is needed. For these kinds of situations, Julia provides non-standard string literals.

I don't think this is specific to string literals.

macro m1()
    :(Regex("abc"))
end

# creates new regex every time
f1() = @m1()

macro m2()
    Regex("abc")
end

# creates regex only once
f2() = @m2()

Thank you! This explains my confusion perfectly. At the root of it was not recognizing that a macro’s return value is treated differently depending on whether it is an Expr or anything else (I was not really appreciating that a macro might be written to return a non-Expr).
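For anyone landing here with the same question, a rough sketch of how the same distinction might carry over to custom string literals. The Token type and the cached"..." and fresh"..." literals are hypothetical names made up for this illustration:

```julia
# Hypothetical mutable type, so that === compares object identity.
mutable struct Token
    text::String
end

# Returns an object: it is built once, during macro expansion,
# and embedded in the surrounding code.
macro cached_str(s)
    Token(s)
end

# Returns an expression: the constructor call is deferred and runs
# each time the surrounding code executes.
macro fresh_str(s)
    :(Token($s))
end

make_cached() = cached"a"
make_fresh() = fresh"a"

make_cached() === make_cached()   # true
make_fresh() === make_fresh()     # false
```

So a string literal that does not cache seems to be just a matter of returning an expression from the macro instead of the finished object.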

And it's not just the return value: an object such as Regex("abc") can be spliced anywhere into the returned expression. For example, :( occursin($(Regex("abc")), $expr) ) would create the regex once, during macro expansion, and put the resulting object into the expression.
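A small sketch of that idea as a complete macro. The names @has_abc and contains_abc are made up for this example, and the esc call is an addition so that the argument is evaluated in the caller's scope:

```julia
# The Regex is built while the macro expands and spliced into the
# returned expression as an object; only the occursin call is deferred.
macro has_abc(expr)
    :(occursin($(Regex("abc")), $(esc(expr))))
end

contains_abc(s) = @has_abc s

contains_abc("xxabcxx")   # true
contains_abc("xyz")       # false

# Inspecting the expansion: one of the call's arguments is a Regex object.
ex = @macroexpand @has_abc "xyz"
any(arg -> arg isa Regex, ex.args)
```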


This isn't unique to non-expression returns. It just depends on whether you defer the evaluation or do it inside the macro. E.g., you could write:

macro m3()
    s = "abc"
    quote
        r1 = $(Regex(s))
        r2 = Regex($s)
    end
end

r1 is constructed during macro expansion, whereas r2 is constructed at runtime.
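One way to observe the difference is a variant of the macro that returns both values. This is only a sketch; the names @m3_pair and pair are not from the posts above:

```julia
macro m3_pair()
    s = "abc"
    quote
        r1 = $(Regex(s))   # object built once, while the macro expands
        r2 = Regex($s)     # constructor call, runs whenever this code runs
        (r1, r2)
    end
end

pair() = @m3_pair

a1, a2 = pair()
b1, b2 = pair()

a1 === b1   # true: the same embedded object comes back on every call
a2 === b2   # false: a new Regex is constructed on each call
```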
