String indices: byte indexing feels wrong

This has been considered carefully and discussed many times. See e.g. the thread "Substring function?" (post #27 by stevengj).

For a more extensive discussion, see e.g. http://utf8everywhere.org/ … there are strong reasons to prefer the variable-width UTF-8 encoding for general-purpose string handling, and many modern languages have made the same choice (e.g. Swift, Go). It trades off efficiency on an operation you hardly ever need (“give me the n-th character”) for many other advantages.
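To make the tradeoff concrete, here is a minimal sketch of what byte indexing looks like in practice. The sample string and the variable names are just illustrative; the functions (`length`, `ncodeunits`, `eachindex`, `nextind`, `isvalid`) are from Julia's Base.

```julia
s = "αβγ abc"                    # α, β, γ each take 2 bytes in UTF-8

nchars = length(s)               # number of characters (needs a scan of the string)
nbytes = ncodeunits(s)           # number of bytes (code units)

# Byte offsets at which a character starts; only these are valid indices.
valid = collect(eachindex(s))

first_char  = s[1]               # 'α'
second_char = s[nextind(s, 1)]   # step from index 1 to the next valid index

# s[2] would throw a StringIndexError, because byte 2 is in the middle of 'α'.
mid_ok = isvalid(s, 2)

# The rarely needed "n-th character" is still available; it just costs a scan.
third_char = s[nextind(s, 0, 3)]
```

In most code you never compute indices like this by hand: you get them back from iteration or search functions and pass them straight to indexing.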

But it is jarring in the first few days of using strings: the experience for new users is the most unfortunate tradeoff of a variable-width encoding! Common operations such as looping over strings, searching strings, and extracting slices of strings based on patterns are all easy once you get used to them. If there is a particular practical operation you are not sure how to do on Julia strings, please feel free to ask (after searching the manual/web).
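As a rough illustration of those everyday operations, the sketch below loops over a string, searches it, and extracts slices from the results. The sample string and variable names are made up for the example; the point is that the indices returned by search functions are already valid byte indices, so no character counting is needed.

```julia
s = "αβγ: 42 → δεζ"

# Looping: iterating a string yields characters, never partial bytes.
chars = Char[]
for c in s
    push!(chars, c)
end

# Looping with indices: pairs(s) yields (byte index, character).
starts = [i for (i, c) in pairs(s)]

# Searching: findfirst returns a range of byte indices that can be used directly.
r = findfirst("→", s)
arrow = s[r]

# Slicing relative to a match: step past it with nextind instead of adding 1.
tail = strip(s[nextind(s, last(r)):end])

# Extracting by pattern: a regex match carries the matched substring.
m = match(r"\d+", s)
number = parse(Int, m.match)
```

Helpers such as `first(s, n)`, `last(s, n)`, and `chop` cover many of the remaining cases where you would otherwise reach for "the n-th character".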
