Why doesn't islowercase work on String?

Why doesn’t islowercase work on String? It only works on characters.

Hmm, that is a little unexpected. I guess it’s because a String could have a mix of upper- and lowercase characters, so you have to check all the characters like this:

julia> all(x->islowercase(x), "hello")
true

Conceptually, it’s like asking why iseven only works on a number and not on an array.

You can simply write:

all(islowercase, "hello")

Oops, I’m so used to writing more complicated conditionals that I default to anonymous functions. Good catch.

Let’s say the string is "Hello". Most of that string is in lowercase, but one letter isn’t. So there’s no binary answer to the question islowercase("Hello").

Intuitively, what we’re thinking of when we ask islowercase("Hello") is is_all_lowercase("Hello"). Since all already exists, it makes sense to write all(islowercase, "Hello") instead: it composes existing functionality and isn’t any longer or more complicated than a special function would be.


I think islowercase working on String would be reasonable, and that its meaning is clear. Not least because the function lowercase does work on strings:

julia> lowercase("HelLo")
"hello"

It is a bit odd that the above works, but islowercase(lowercase("HelLo")) errors.


lowercase("HelLo") is not ambiguous, while islowercase("HelLo") is, a little. It would be reasonable to expect it to return one value per character, something like (false,true,true,false,true).

Once you’ve seen it, the following syntax is quite natural:

julia> all(islowercase, "Hello")
false

julia> any(islowercase, "Hello")
true

I disagree, that would be very surprising, and something one might expect from islowercase.(). The issomething functions always return a scalar Bool.

IMO, lowercase(str) and islowercase(str) seem like a natural pair, while all(islowercase, str) seems to be more naturally paired with map(lowercase, str), in that both functions would need mapping over a string.

It also corresponds to how we talk about it in ordinary language: there is no ambiguity about the concepts ‘a lowercase string’, ‘an uppercase string’, and ‘a mixed-case string’.
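For anyone who does want the per-character answer mentioned above, here is a minimal sketch of the broadcast form. Note that broadcasting treats a String as a single scalar, so the string has to be collected into characters first; the variable names are just for illustration.

```julia
# Broadcasting over a String treats the whole string as one scalar value,
# so collect it into a Vector{Char} before applying the dot syntax.
chars = collect("HelLo")
flags = islowercase.(chars)   # one Bool per character

# The scalar answers are reductions over the same per-character data.
all_lower = all(flags)        # same result as all(islowercase, "HelLo")
any_lower = any(flags)        # same result as any(islowercase, "HelLo")

println(flags)
println((all_lower, any_lower))
```

So the element-wise result and the scalar Bool are both one short expression away, without a String method for islowercase.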


Issue here

Would such a definition have any contraindications?

julia> islowercase("Hello")
ERROR: MethodError: no method matching islowercase(::String)
Closest candidates are:
  islowercase(::AbstractChar) at strings/unicode.jl:324
Stacktrace:
 [1] top-level scope
   @ c:\Users\sprmn\.julia\v1.8\string2.jl:19

julia> import Unicode.islowercase

julia> islowercase(s::String) = s==lowercase(s)
islowercase (generic function with 2 methods)

julia> islowercase("Hello")
false

That issue is now closed.

I disagree about what feels natural, so I am gonna fork Julia, rebrand it Julie and implement islowercase(::String) :grin:


What definition do you want? Consider:

julia> all(islowercase, "élan")
false

julia> all(islowercase, "élan")
true

(Hint: run collect on these two strings.)
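In case the two strings render identically on your screen, here is a sketch that spells them out with explicit escapes. It assumes the difference is the usual one between a decomposed and a precomposed "é"; that is my reading of the hint, not something the post states outright.

```julia
using Unicode

# Assumption: one string uses 'e' + a combining accent, the other a single code point.
decomposed  = "e\u0301lan"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9lan"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# The two strings look the same but contain different characters.
println(collect(decomposed))
println(collect(precomposed))

# The combining mark is not a lowercase letter, so the results differ.
println(all(islowercase, decomposed))
println(all(islowercase, precomposed))

# Normalizing to NFC composes the accent and makes the two strings equal.
println(Unicode.normalize(decomposed, :NFC) == precomposed)
```

So a per-character definition of islowercase(::String) would give different answers for strings that are canonically equivalent, unless it normalized first.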

Consider:

julia> Base.islowercase(s::AbstractString) = s==lowercase(s)

julia> islowercase("1")
true

julia> islowercase('1')
false

which seems inconsistent.
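To make the "what definition do you want?" question concrete, here is a rough sketch comparing three candidate definitions on a few inputs. All three function names are hypothetical (none exist in Base), and the third only approximates the rule Python's str.islower uses (no uppercase characters and at least one lowercase one), ignoring titlecase characters.

```julia
# Hypothetical helper names, for illustration only.
every_char_lower(s::AbstractString)       = all(islowercase, s)
unchanged_by_lowercase(s::AbstractString) = s == lowercase(s)
python_like(s::AbstractString)            = any(islowercase, s) && !any(isuppercase, s)

# Print the three answers side by side for a few edge cases.
for s in ("hello", "Hello", "hello world", "1", "")
    results = (every_char_lower(s), unchanged_by_lowercase(s), python_like(s))
    println(repr(s), " => ", results)
end
```

The three agree on "hello" and "Hello" but disagree on "hello world", "1" and the empty string, which is exactly where a built-in String method would have to pick a side.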


I see.
But this inconsistency seems to derive (also) from the fact that the following functions, which are used in the definitions of lowercase() and islowercase(), give these results:

julia> c2l=Char(ccall(:utf8proc_tolower, UInt32, (UInt32,), '1'))
'1': ASCII/Unicode U+0031 (category Nd: Number, decimal digit)

julia> Bool(ccall(:utf8proc_islower, Cint, (UInt32,), UInt32(c2l)))
false

If I’m not making a logical mistake: we have one function that transforms a character into its lowercase form, and another that, checking this result, says it is not lowercase.

From reading here and there, I understand that these topics depend on many “variables” that are not easy to reconcile in a simple way.

As far as I can tell, every widely available string library in every mainstream language does this: there is a lowercase-like function which converts characters to lowercase if possible (and otherwise leaves them alone), and an islower-like function that checks specifically whether a character is a lowercase letter.

(The islower predicate stems originally from the function of the same name in the C standard library.)

For example, in Python 3 (which doesn’t have a distinction between string and character types):

>>> "1".lower()
'1'

>>> "1".islower()
False

See also Ruby’s downcase: Ruby doesn’t provide an islower predicate, and instead the standard recommendation seems to be to write a regex. Or Swift’s string.lowercased() method and char.isLowercased property. Or the C# String.ToLower() and Char.IsLower methods. Or the Go ToLower(str) and IsLower(char) functions. Or …


Thanks for taking the time to clear up all of these things.
I had no doubt that the choice was well founded, even before learning that all the other languages have made the same choice for these two functions.
I am still curious about why.
To simplify, let’s imagine that at some point, when the lowercase() function was defined, it was decided, for many good reasons, that characters with no upper/lower counterpart should be left as they are, rather than raising an error or doing something else.
But leaving them as they are is compatible with two opposite interpretations: one that treats such characters as both upper and lower (this would avoid the inconsistency I pointed out, but who knows how many other problems it would bring with it); the other (the one actually taken) that treats these characters as neither upper nor lower.
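The two interpretations can be sketched side by side at the character level. The lenient_* functions below are hypothetical and only illustrate the "both upper and lower" reading; the built-in islowercase and isuppercase implement the "neither" reading.

```julia
# Hypothetical "lenient" predicates: a character counts as lowercase (uppercase)
# whenever converting it to lowercase (uppercase) leaves it unchanged.
lenient_islower(c::AbstractChar) = c == lowercase(c)
lenient_isupper(c::AbstractChar) = c == uppercase(c)

for c in ('a', 'A', '1')
    strict  = (islowercase(c), isuppercase(c))          # built-in predicates
    lenient = (lenient_islower(c), lenient_isupper(c))  # hypothetical alternative
    println(repr(c), ": strict = ", strict, ", lenient = ", lenient)
end
```

Under the lenient reading, '1' would be lowercase and uppercase at the same time, so "is lowercase" would no longer imply "is a letter"; under the built-in reading it is simply neither.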