Is there a way to get the “\dot” unicode character to be replaced?
julia> replace("ẋ′", "ẋ"=>"dx", "′"=>"_p")
"ẋ_p"
I can’t reproduce your example:
julia> replace("ẋ′", "ẋ"=>"dx", "′"=>"_p")
"dx_p"
However, I have a good guess for what happened on your computer.
I’m guessing that you are having problems due to differences in Unicode normalization. The difficulty is that there are two “canonically equivalent” ways to express "ẋ" that consist of different sequences of characters. You can use the single character U+1E8B 'ẋ', or you can use an ordinary ASCII 'x' followed by U+0307 “combining dot above”:
julia> import Unicode
julia> s1 = Unicode.normalize("ẋ", :NFC) # NFC normalization gives the 1-char version
"ẋ"
julia> s2 = Unicode.normalize("ẋ", :NFD) # NFD normalization gives the 2-char version
"ẋ"
julia> s1 == s2
false
julia> collect(s1)
1-element Vector{Char}:
'ẋ': Unicode U+1E8B (category Ll: Letter, lowercase)
julia> collect(s2)
2-element Vector{Char}:
'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)
'̇': Unicode U+0307 (category Mn: Mark, nonspacing)
Probably you are using the NFC version in one place and an NFD version in another. Either be consistent in how you enter "ẋ" or explicitly call Unicode.normalize before doing the replace call.
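For illustration, here is a minimal sketch of the "normalize first" approach. The helper name replace_nfc is hypothetical (it is not part of Julia), and the escapes are used only to make it unambiguous which form of the character is in each string. It assumes Julia 1.7 or later, where replace accepts several pairs at once.

```julia
import Unicode

# Shorthand for NFC normalization (composed form, e.g. U+1E8B for "ẋ")
nfc(s) = Unicode.normalize(s, :NFC)

# Hypothetical helper: normalize the input string and every pattern to NFC,
# then call the ordinary `replace` on the normalized values.
replace_nfc(s, pairs...) =
    replace(nfc(s), (nfc(first(p)) => last(p) for p in pairs)...)

# Decomposed input: 'x' + U+0307 (combining dot above) + U+2032 (prime)
decomposed = "x\u0307\u2032"

# The pattern "\u1e8b" is the single-character (NFC) form of "ẋ"
replace(decomposed, "\u1e8b" => "dx", "\u2032" => "_p")      # dot is not replaced
replace_nfc(decomposed, "\u1e8b" => "dx", "\u2032" => "_p")  # "dx_p"
```

Normalizing the patterns as well as the input means it does not matter which form was typed on either side.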
I’m guessing that the reason your example worked for me is that, as @mbauman commented in another thread, some browsers automatically normalize Unicode when you paste into their text-entry box to post on Discourse.
PS. Note that this is not in any way specific to Julia. The same issue of multiple representations for the “same” string appears in any language supporting Unicode text.
Thanks! I knew I would learn something new with this question