Lowercase and uppercase do not handle Mathematical Letters as expected

I have recently noticed that

lowercase('𝐴')
# '𝐴': Unicode U+1D434 (category Lu: Letter, uppercase)

and

uppercase('𝑎')
# '𝑎': Unicode U+1D44E (category Ll: Letter, lowercase)

do not behave as I would have expected: both calls return their argument unchanged, even though 𝐴 and 𝑎 have the same(-ish) relation as A and a, one being the capital form, the other the small form.
This seems to be the case for all characters in the Unicode block “Mathematical Alphanumeric Symbols” that have both lower-case and upper-case (Small and Capital) forms, as well as some characters where the corresponding lower-case or upper-case form is in the Unicode block “Letterlike Symbols” (for example ‘ℛ’ and ‘𝓇’).
Is this behaviour intended?

Yes, this is intentional. The name of that character in Unicode is MATHEMATICAL ITALIC CAPITAL A, and if you search the Case Folding Dataset, you will see that it isn’t included.
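You can check this from the REPL. A minimal sketch, with `A_math` and `a_math` as made-up variable names and the characters written as escapes so they survive copy-paste:

```julia
# Code points taken from the output shown in the question.
A_math = '\U1D434'   # MATHEMATICAL ITALIC CAPITAL A
a_math = '\U1D44E'   # MATHEMATICAL ITALIC SMALL A

# No case mapping is defined for these characters,
# so both functions return their argument unchanged.
lowercase(A_math) == A_math   # true
uppercase(a_math) == a_math   # true

# Ordinary Latin letters do have a mapping, for comparison.
lowercase('A') == 'a'         # true
```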

To guess at the reasoning, a mathematical variable 𝐴 and a variable 𝑎 always refer to different things, so transforming the former into the latter would be undesirable.
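If you do want that mapping in your own code, you can opt in with a small helper. This is only a sketch, not anything Base provides: `math_italic_lowercase` is a hypothetical name, it covers just the italic capitals U+1D434 to U+1D44D, and it assumes the small italic letters follow 26 code points later. The one gap is italic small h, which lives in Letterlike Symbols as U+210E.

```julia
# Hypothetical helper, not part of Base or the Unicode stdlib.
const ITALIC_CAPITAL_A = 0x1D434   # MATHEMATICAL ITALIC CAPITAL A
const ITALIC_SMALL_H_SLOT = 0x1D455 # reserved slot where italic small h would be
const PLANCK_CONSTANT = '\u210e'    # italic small h, in Letterlike Symbols

function math_italic_lowercase(c::Char)
    cp = UInt32(c)  # code point of the character
    if ITALIC_CAPITAL_A <= cp <= ITALIC_CAPITAL_A + 25
        lower = cp + 26  # small letters start right after the 26 capitals
        return lower == ITALIC_SMALL_H_SLOT ? PLANCK_CONSTANT : Char(lower)
    end
    return lowercase(c)  # everything else keeps the standard behaviour
end

math_italic_lowercase('\U1D434') == '\U1D44E'  # italic A to italic a
```

The other styles in the block (bold, script, fraktur, and so on) would need their own ranges and their own exceptions, which is part of why a generic mapping is not as obvious as it first looks.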

Thanks for the reply! Reasoning from the perspective of 𝐴 and 𝑎 as mathematical variables makes a lot of sense.
