Check Unicode character class

Hello, how can I check a Char’s Unicode character class? I need it for a lexer for a Unicode-enabled language.
I tracked down that Char’s show method uses Unicode.category_abbrev, but trying to import that function gives me an error and there doesn’t seem to be documentation for it anywhere.

The Unicode module is not exported, so you need to specify Base.Unicode.category_abbrev.
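
A minimal sketch of how the fully qualified call might be used in a lexer. The helper names is_ident_start and is_digit_char are hypothetical and not from this thread, and the choice of categories is only an illustration. Note that Base.Unicode.category_abbrev is an internal function, so this assumes a Julia version where it exists as described above.

```julia
# Base.Unicode is a submodule of Base that is not exported, so the function
# is reached through its fully qualified name.
# category_abbrev returns the two-letter Unicode general category as a String.

# Hypothetical helper: letters, letter numbers, or an underscore may start an identifier.
is_ident_start(c::Char) =
    (Base.Unicode.category_abbrev(c) in ("Lu", "Ll", "Lt", "Lm", "Lo", "Nl")) || c == '_'

# Hypothetical helper: decimal digits in any script.
is_digit_char(c::Char) = Base.Unicode.category_abbrev(c) == "Nd"

# Print the category of a few sample characters.
for c in ('A', 'λ', '7', '_', '+')
    println(repr(c), " => ", Base.Unicode.category_abbrev(c))
end
```

Comparing against the abbreviation strings keeps the sketch short; which categories a real lexer should accept depends on the language being lexed.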

Wow, that’s weird: it doesn’t work even if I explicitly import the Unicode module, but it does work when I also specify the Base part like you said. Is there a reason for this behaviour? Is the module Unicode somehow different from the pre-imported Base.Unicode?

Unicode is just a submodule of Base. The reason the show method doesn’t specify it with Base.Unicode is because the code is all in the module Base, so it has access to all its submodules automatically. If you don’t want to specify Base.Unicode every time, you can also put using Base.Unicode at the top of your code.
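
As a sketch of that alternative, one way to avoid repeating the Base.Unicode prefix is to name the function explicitly in the using statement. This form brings a single name into scope whether or not the module exports it. It is an assumption on my part that this variant suits your code; the thread itself only mentions using Base.Unicode.

```julia
# Bring one unexported name from Base.Unicode into the current scope.
using Base.Unicode: category_abbrev

# The function can now be called without the module prefix.
category_abbrev('a')   # "Ll" (lowercase letter)
category_abbrev(' ')   # "Zs" (space separator)
```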

Yes, I just thought that importing a module simply brings it into scope, so I’m bewildered as to why this happens:

You want import .Unicode.category_abbrev. There is a stdlib called Unicode as well, so a plain import Unicode loads that package instead of Base.Unicode.
