Strange regex error (bug?)

julia> r"^\s*text"
r"^\s*text"

julia> Regex("^\s*text")
ERROR: syntax: invalid escape sequence

Julia 1.7.3 on Win10

And BTW what does the following error message mean?

julia> r"^[:space:]*text"
ERROR: LoadError: PCRE compilation error: POSIX named classes are supported only within a class at offset 1

julia> Regex("^\\s*text")
r"^\s*text"

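What @jling is saying is that in an ordinary string literal the backslash itself must be escaped, hence Regex("^\\s*text"). The r"..." form needs no such escaping: r"^\s*text" already works, whereas r"^\\s*text" would ask for a literal backslash.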

Note also that, instead of escaping (as regular strings require), you can use a raw string literal, like raw"^\s*text"

That gives you a string, not a Regex, so you would still need Regex(raw"^\s*text") to make it so. That can help if you would otherwise need lots of escaping.
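
To make the escaping rules concrete, here is a minimal sketch comparing the spellings discussed so far. The variable names are just for illustration, and the comparison goes through the `pattern` field of `Regex`, which holds the pattern text.

```julia
# Three spellings of the same pattern: optional leading whitespace, then "text".
a = Regex("^\\s*text")      # ordinary string: the backslash must be escaped
b = Regex(raw"^\s*text")    # raw string: the backslash is kept as written
c = r"^\s*text"             # regex literal: the backslash is also kept as written

# All three should hold the same pattern text.
println(a.pattern == b.pattern == c.pattern)

# Doubling the backslash inside r"..." asks for a literal backslash instead.
d = r"^\\s*text"
println(occursin(c, "  text about"))   # subject: whitespace, then "text"
println(occursin(d, "  text about"))   # subject has no literal backslash
println(occursin(d, "\\text"))         # subject starts with a literal backslash
```

So the raw"..." route is mostly useful when the pattern is built as a plain string first and only later turned into a Regex.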

This got me thinking: since the r and raw string literals are implemented by macros (@r_str and @raw_str), can you compose macros, or at least string macros?

r∘raw"^\s*text" doesn’t work (and maybe it shouldn’t…), nor does (r∘raw)"^\s*text". I’m ok with not having the option, since it’s likely very rare to do…

but it got me thinking, should this have been the default for Julia’s regexes, or at least also be provided?

An r_raw string macro (or should it be named raw_r?) could be made, but I’m a bit stumped as to how. r is implemented like this:

macro r_str(pattern, flags...) Regex(pattern, flags...) end

and raw by:

macro raw_str(s); s; end
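
For what it's worth, here is a rough sketch of what such a combined macro could look like. The name `rawre` is hypothetical (nothing by that name exists in Base). As far as I can tell, a string macro is handed the literal's text without backslash processing, so the body ends up the same as r_str, which suggests r"..." already behaves like a raw string in this respect.

```julia
# Hypothetical macro name for illustration; it is not defined in Base.
# A string macro receives the literal's text without backslash processing,
# so the body can be the same as the r_str definition quoted above.
macro rawre_str(pattern, flags...)
    Regex(pattern, flags...)
end

re = rawre"^\s*text"
println(re.pattern == r"^\s*text".pattern)   # same pattern text as the r literal
println(occursin(re, "  text about"))

# Flags written after the closing quote arrive as a second string argument.
re_i = rawre"^\s*TEXT"i
println(occursin(re_i, "  text about"))
```

The macro has to be defined at top level before the first rawre"..." literal is parsed, as in a script or at the REPL.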

Note that PCRE (Perl Compatible Regular Expressions) is the regex library Julia uses, so the error comes from it and means something is wrong with your regex. As far as I can tell, the offset is PCRE’s own and counts from 0, so “offset 1” would point at the [ right after the ^.

I didn’t read carefully, but it seems your fix may be as simple as r"^[[:space:]]*text": a POSIX named class such as [:space:] is only valid inside a bracket expression, hence the doubled brackets.

Yes, that works 🙂


julia> rg = r"^[[:space:]]*text"
r"^[[:space:]]*text"

julia> occursin(rg, "  text about")
true
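
Building on that fix, here is a small sketch of how POSIX named classes fit inside a bracket expression. The test strings are made up for illustration. The failing pattern is built with Regex(...) rather than an r"..." literal so that the compilation error happens at run time, where it can be caught.

```julia
# POSIX named classes are written inside a bracket expression.
println(occursin(r"^[[:space:]]*text", "  text about"))

# Several named classes can share one bracket expression.
println(occursin(r"^[[:space:][:digit:]]*text", " 12 text"))

# Without the outer brackets the pattern fails to compile, as in the
# error message quoted earlier in the thread.
compiled = try
    Regex("^[:space:]*text")
    true
catch
    false
end
println(compiled)
```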