Escaped sequences in `SubstitutionString`s

Alex_Tantos · August 24, 2022, 9:38am

Hi!

I tried to create a substitution string with Base.SubstitutionString() that contains an escaped sequence (\s), but, as expected, it does not work, since the relevant section in the documentation says that:

SubstitutionString(substr)

Stores the given string `substr` as a `SubstitutionString`, for use in regular expression substitutions.

So, the question is whether anyone knows how to use an escaped sequence (e.g. a space character) in a substitution string that would replace another escaped sequence (e.g. a newline one).

Here is an example to use:

julia> str ="Whether an array is ordered can be defined either on construction via the ordered argument, or at any time via the ordered! function. The levels function returns 
all the levels of CategoricalArray, and the levels! function can be used to set the levels and their order. "

julia> str = replace(str, r"\n" => Base.SubstitutionString("\s"))
ERROR: syntax: invalid escape sequence
Stacktrace:
 [1] top-level scope
   @ none:1

Thanks!

Alex

ludwig-austermann · August 24, 2022, 10:46am

I don’t think \s is a valid escape sequence in a Julia string literal at all, which should be the problem in your example. If you want a space character, use the string " " or, even better, the character ' '.

Note that your example can be simplified to replace(str, '\n' => ' '), as no regex is needed.
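A minimal sketch of that simplification, using a short made-up string in place of the original `str` (the variable names `plain` and `viaregex` are only for illustration). It also shows that a regex on the left works fine with an ordinary string on the right:

```julia
# Hypothetical short input with an embedded newline, standing in for `str`.
str = "first line \nsecond line"

# Plain character replacement: no regex and no SubstitutionString involved.
plain = replace(str, '\n' => ' ')

# A Regex on the left still accepts an ordinary string on the right.
viaregex = replace(str, r"\n" => " ")

println(plain)
println(plain == viaregex)
```

Both calls should give the same result here; the character-pair form is simply the lighter option when nothing needs to be pattern-matched.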

Alex_Tantos · August 24, 2022, 11:41am

Thanks for the response and the option of not even using Base.SubstitutionString() in the first place.

Also, \s can be used as a valid escaped sequence in a regular expression, in general, and in the exact same use of replace():

julia> replace(str, r"\s" => '|')
"Whether|an|array|is|ordered|can|be|defined|either|on|construction|via|the|ordered|argument,|or|at|any|time|via|the|ordered!|function.|The|levels|function|returns||all|the|levels|of|CategoricalArray,|and|the|levels!|function|can|be|used|to|set|the|levels|and|their|order.|"

I guess, then, that the escape sequences allowed in a regular expression differ from the ones allowed on the right-hand side.

Alex

digital_carver · August 24, 2022, 11:52am

Yeah, the one on the left is a Regex, which can contain all the usual regular expression escape sequences. \s on the right wouldn’t make sense because it means whitespace in general, not a specific character (e.g. a tab also matches \s). So if it appeared in the SubstitutionString, it would be ambiguous which character you actually want there (a space or a tab).

The only escape sequences in the substitution string that make sense and are interpreted are references to captures: numbered ones like \1, \2, etc. (and the corresponding named \g<name> form), which refer to parts of the match that the regex on the left has captured.
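As a rough illustration of those capture references, here is a small sketch using the `s"..."` string literal, which builds a `SubstitutionString`. The input strings and the group names `y`, `m`, `d` are invented for the example:

```julia
# s"..." constructs a SubstitutionString, so \1 and \2 refer to capture groups.
swapped = replace("hello world", r"(\w+) (\w+)" => s"\2 \1")

# Named groups on the left can be referenced with \g<name> on the right.
date_re = r"(?<y>\d+)-(?<m>\d+)-(?<d>\d+)"
reordered = replace("2022-08-24", date_re => s"\g<d>.\g<m>.\g<y>")

println(swapped)
println(reordered)
```

Literal characters such as the space or the dots are written as themselves in the substitution string, which is why a plain " " is all that is needed to insert a space.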

Alex_Tantos · August 24, 2022, 11:58am

That clears all the fog! Thanks!
