Replacing strings at specific position indices, as with `str_sub()` in `stringr`

I’m a bit skeptical that this is what you would want in a realistic application. How is the data being generated such that you know specific grapheme indices to replace?

Most commonly, string indices are obtained by searching or iterating over a string, in which case you want the actual string index (a code-unit offset) and not a character or grapheme index, and this is what string slicing and substrings already work with. See also my comment in another thread, “Substring function?” (#27, by stevengj):

The basic argument is that a variable-width encoding with non-consecutive codepoint indices is a good tradeoff to make (memory efficiency + speed, at the cost of less-intuitive indexing) because “give me the m-th codepoint” or “give me the substring from codepoints m to n” is extremely uncommon in (correct) string-handling code, as opposed to “give me the substring at opaque indices I found in a previous search/loop”.

and the surrounding discussion.
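To make the “opaque indices from a previous search” pattern concrete, here is a minimal sketch. The example string and variable names are made up for illustration; the functions used (`findfirst`, `prevind`, `nextind`, `isvalid`) are from Julia's `Base`.

```julia
# "naïve café", written with escapes so the encoding is unambiguous:
# 'ï' (U+00EF) and 'é' (U+00E9) each take 2 bytes in UTF-8.
s = "na\u00efve caf\u00e9"

length(s)        # 10 characters (codepoints)
ncodeunits(s)    # 12 bytes, so valid indices are not consecutive
isvalid(s, 4)    # false: index 4 is in the middle of 'ï'

# A search returns indices into the string; treat them as opaque.
r = findfirst("caf\u00e9", s)

# The same indices go straight back into slicing.
found = s[r]

# Replace the match by splicing around it, stepping with prevind/nextind
# instead of doing arithmetic on the indices.
t = s[1:prevind(s, first(r))] * "tea" * s[nextind(s, last(r)):end]
```

For this particular task `replace(s, "caf\u00e9" => "tea")` would be simpler; the splice is spelled out only to show that indices found by a search can be reused without ever counting characters.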

“Give me the m:n-th user-perceived ‘characters’ (i.e. graphemes)” is something that people commonly ask for in their first steps of using strings in Julia, which is why I added a `graphemes(s, m:n)` function, but when you go further you typically find that this is not what is needed at all.
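For completeness, a small sketch of what `graphemes(s, m:n)` does, assuming Julia 1.9 or later (where, as far as I recall, the range method was added) and the `Unicode` standard library. The example string is hypothetical.

```julia
using Unicode

# "café" spelled with a combining acute accent (e + U+0301), then " ok".
s = "cafe\u0301 ok"

length(s)               # 8 codepoints
length(graphemes(s))    # 7 user-perceived characters

# The 4th grapheme is 'e' plus its combining accent: two codepoints.
g = graphemes(s, 4:4)

# Graphemes 1 through 4 give the whole accented word as a SubString.
word = graphemes(s, 1:4)
```

Note that this has to scan the string from the start to count graphemes, which is part of why indices obtained from a search are usually the better tool.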
