Hello,
I’ve been experimenting with overloading getproperty for some structs in a personal project of mine. I noticed the following throws an error (using getfield instead of getproperty in this MWE). Does anyone know why? I type the u-bar in the REPL by typing u\bar then pressing tab.
struct MyStruct
ū::Vector{Float64}
end
tmp = MyStruct(rand(5))
tmp.ū # no error
getfield(tmp, :ū) # no error
getfield(tmp, Symbol("ū")) # throws error, type MyStruct has no field ū
So it seems that :ū == Symbol("ū") is false.
This has to do with Unicode normalization and might be a bug? Specifically,
julia> codeunits(String(:ū))
2-element Base.CodeUnits{UInt8, String}:
0xc5
0xab
julia> codeunits("ū")
3-element Base.CodeUnits{UInt8, String}:
0x75
0xcc
0x84
Julia normalizes symbols written with the :symbol syntax, but apparently doesn’t do so when you call Symbol(::String):
julia> codeunits(String(Symbol("ū")))
3-element Base.CodeUnits{UInt8, String}:
0x75
0xcc
0x84
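If the goal is to make the two agree, one option is to normalize the string to NFC before building the symbol. The following is a minimal sketch, assuming the standard-library Unicode module and assuming NFC matches what the parser does for this character (the parser applies a few extra character mappings beyond NFC that are not covered here). Escapes are used so that the two spellings are unambiguous regardless of how this post is copied.

```julia
using Unicode

decomposed = "u\u0304"   # 'u' followed by U+0304 COMBINING MACRON (3 bytes in UTF-8)
composed = Unicode.normalize(decomposed, :NFC)   # single code point U+016B (2 bytes)

# The raw string and its NFC form produce different symbols...
Symbol(decomposed) == Symbol(composed)    # false
# ...while the normalized one matches the precomposed character.
Symbol(composed) == Symbol("\u016b")      # true
```

The byte counts line up with the codeunits output above: 0x75 0xcc 0x84 for the decomposed form and 0xc5 0xab for the composed one.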
That’s not a bug. Symbol(::String) intentionally allows you to make a symbol out of any string as-is (as long as it does not contain '\0'), and is intentionally not restricted to valid Julia identifiers. See the discussion in julia#5462 (at which point in time the constructor was called symbol(::String)).
If you want to ensure a valid Julia identifier, do Meta.parse or, better yet, use the :symbol syntax.
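Applied to the MWE from the original post, the Meta.parse route might look like the sketch below. It assumes the string holds a single plain identifier. For other input, Meta.parse can return something other than a Symbol (an Expr, a number, and so on), so real code would want to check the result.

```julia
struct MyStruct
    ū::Vector{Float64}   # the parser stores this field name in normalized form
end

tmp = MyStruct([1.0, 2.0, 3.0])

raw = "u\u0304"           # decomposed spelling, e.g. coming from user input
name = Meta.parse(raw)    # parsing normalizes the identifier and returns a Symbol

name isa Symbol           # true
getfield(tmp, name)       # no error: same symbol as the field name
```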
That being said, I think there may be a bug in Symbol printing stemming from a bug in Base.isidentifier, which does not check normalization:
julia> "e\u0301" # e with acute accent, not NFC normalized
"é"
julia> Symbol("e\u0301") == :é # correct: :é is normalized
false
julia> Base.isidentifier(Symbol("e\u0301")) # incorrect: should check normalization
true
julia> Symbol("e\u0301") # incorrect display: should check normalization
:é
See JuliaLang/julia issue #52641, "Base.isidentifier(::Symbol) should check normalization".
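In the meantime, a stricter check can be sketched by hand. Note that is_nfc_identifier is a hypothetical helper name, not something in Base, and it only tests plain NFC, which is an approximation of the parser's normalization rather than an exact match.

```julia
using Unicode

# Hypothetical helper: valid identifier syntax AND already in NFC form.
is_nfc_identifier(s::AbstractString) =
    Base.isidentifier(s) && Unicode.normalize(s, :NFC) == s
is_nfc_identifier(s::Symbol) = is_nfc_identifier(String(s))

is_nfc_identifier("e\u0301")   # false: decomposed e + combining acute accent
is_nfc_identifier("\u00e9")    # true: precomposed é
```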