StringIndex idea (Julia 2.0)

I disagree with your premise that random codepoint indexing is “working with strings properly”. String algorithms that involve random access should be operating on codeunits (= bytes in UTF-8), not codepoints (Chars), and codeunits already provide O(1) access.
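As a minimal sketch of what I mean by operating on codeunits (the example string and variable names are just illustrative assumptions, not from any real code base):

```julia
s = "αβγ abc"            # illustrative: three 2-byte characters, a space, three ASCII characters

cu = codeunits(s)        # byte view of the string, no copy is made
n = ncodeunits(s)        # number of bytes
nchars = length(s)       # number of characters; this one has to scan the string

# O(1) random access to any byte, for example the middle one
mid = cu[n ÷ 2]

# Byte-level scan for an ASCII delimiter. This is safe in UTF-8 because
# ASCII bytes never occur inside a multi-byte character.
space_at = findfirst(==(UInt8(' ')), cu)
```

The index found by the byte-level scan is also a valid string index, so `s[space_at]` gives back the space character without any codepoint counting.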

So far, in all the discussions of strings in Julia, to my recollection there has not been a single example of a practical string algorithm that requires random-access codepoint indexing (as opposed to a “pointer” to a previously traversed location, as in a search result).
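The “pointer to a previously traversed location” pattern looks roughly like this. It is only a sketch with a made-up example string; the point is that the search result is already a pair of codeunit indices, so nothing ever needs to ask for "the n-th character":

```julia
s = "naïve café"                  # illustrative string containing multi-byte characters

r = findfirst("café", s)          # range of byte indices, or nothing if there is no match

if r !== nothing
    hit = s[r]                                # the matched substring
    before = s[1:prevind(s, first(r))]        # everything up to the match
    after = s[nextind(s, last(r)):end]        # everything following the match
end
```

Slicing with the returned indices is a constant-time index lookup plus the copy of the substring itself, regardless of how many non-ASCII characters precede the match.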

The discussions in this thread about making the current string (codeunit) indices opaque and adding other index types have mostly been about making things more intuitive for new users and preventing bugs caused by misuse of s[i+1] instead of s[nextind(s, i)] and similar, not about an algorithmic need for fast random character indexing.
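For anyone who has not hit the s[i+1] bug yet, here is a small sketch of it (the two-character string is just an assumed example):

```julia
s = "αβ"                      # two characters, each 2 bytes long in UTF-8

i = firstindex(s)             # index of 'α'
j = nextind(s, i)             # index of the next character, skipping the continuation byte
second = s[j]                 # 'β'

# i + 1 points into the middle of 'α', so s[i + 1] would throw a StringIndexError
valid = isvalid(s, i + 1)

# Iterating over valid indices avoids manual index arithmetic altogether
chars = [s[k] for k in eachindex(s)]
```

The code with i + 1 happens to work on ASCII-only input, which is why this kind of bug tends to survive testing.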

(AFAIK, no mainstream programming language other than Python 3 tries to guarantee O(1) random codepoint access for its default string type, and Python 3’s idiosyncratic approach comes with a lot of disadvantages.)
