Check for letters-only string

Orr_Orovan · June 9, 2020, 8:11pm

Python can check whether a string contains only letters with isalpha():

a_few = "abcdefghijk"
a_few.isalpha()
True
mixed = "abc45defg"
mixed.isalpha()
False

I have found Julia’s isletter(), but it is limited to checking a single character.
How can Julia check for a letters-only string?

nilshg · June 9, 2020, 8:31pm

You can apply isletter to all the characters in a string like so:

julia> a_few = "asfeaer"
"asfeaer"

julia> mixed = "adasd34asda"
"adasd34asda"

julia> all(isletter, a_few)
true

julia> all(isletter, mixed)
false

Orr_Orovan · June 9, 2020, 9:11pm

Thank you!!

Works well for what I was looking for.

tamasgal · June 9, 2020, 9:15pm

I am not sure how much you care about performance or Unicode support, but this checks whether the letters are a to z, and it’s around 4x faster than the more general isletter version:

julia> text = "a" ^ 1_000_000;

julia> @btime all(isletter, $text)
  6.489 ms (0 allocations: 0 bytes)
true

julia> function isatoz(text)
           all(c -> 0x61 <= UInt8(c) <= 0x7A, text)
       end
isatoz (generic function with 1 method)

julia> @btime isatoz($text)
  1.774 ms (0 allocations: 0 bytes)
true

Orr_Orovan · June 9, 2020, 9:44pm

Very interesting.

In the particular case I was working on, speed does not matter, but I like the transparency and precision of what you are suggesting here. I don’t know what is under the hood of isletter(). The function you define here leaves little room for doubt.

Thank you too!!

tamasgal · June 9, 2020, 10:00pm

You can try @edit isletter('a') in the Julia REPL; it should open your default editor with the source of that method. It’s very helpful to see what’s under the hood.

simeonschaub · June 9, 2020, 10:10pm

The difference is that isletter detects not only the letters a-z and A-Z, but also Unicode letters from other alphabets, for example accented letters, umlauts, or Greek letters. Naturally, this is quite a bit more complicated, but it’s important if you want to support languages other than English.
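
To make that difference concrete, here is a small sketch of how isletter treats a few sample characters. The helper name unicode_letters_only is made up for illustration and does not come from the thread.

```julia
# isletter is Unicode-aware: letters from any alphabet count, digits and punctuation do not.
samples = ['a', 'Z', 'é', 'ü', 'α', 'Ж', '5', '_', ' ']
for c in samples
    println(repr(c), " => ", isletter(c))
end

# Hypothetical helper: a string is "letters only" in the Unicode sense
# if every character passes isletter.
unicode_letters_only(s) = all(isletter, s)

println(unicode_letters_only("Grüße"))   # accented and non-English letters are accepted
println(unicode_letters_only("αβγ"))     # so are Greek letters
println(unicode_letters_only("abc45"))   # digits are rejected
```

Note that all returns true for an empty collection, so an empty string also passes this check.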

Orr_Orovan · June 9, 2020, 10:27pm

I tried what @tamasgal suggested, @edit isletter('a'), to see its structure. I did not know this was available.
I can see how isletter() will pick up letters that are not English, as you mention here. This distinction makes it unsuitable for the application I was working on. I should have defined a function like the one @tamasgal demonstrated instead, because the user input it handles must be English letters, and isletter() was used in a data validation step to verify that it is … an English letter.

Appreciate very much the valuable input from you all.
Thanks again!

simeonschaub · June 9, 2020, 10:43pm

Note that @tamasgal’s example only detects lowercase letters. To also detect uppercase letters, you need to check against 0x41 through 0x5A as well. It’s probably also easier to understand if you write this as 'a' <= c <= 'z' || 'A' <= c <= 'Z', so you don’t have to convert to UInt8 first and remember all the ASCII codes, since you can just compare Chars directly.
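
Putting that together, a minimal sketch could look like the following. The names is_ascii_letter and is_ascii_letters_only are hypothetical and chosen here only for illustration. Comparing Chars directly also sidesteps the UInt8 conversion, which, as far as I know, throws an InexactError for characters above U+00FF instead of returning false.

```julia
# Hypothetical helpers (names not from the thread).
# A character is an English letter if it falls in a-z or A-Z.
is_ascii_letter(c) = 'a' <= c <= 'z' || 'A' <= c <= 'Z'

# A string passes if every character is an English letter.
is_ascii_letters_only(s) = all(is_ascii_letter, s)

println(is_ascii_letters_only("abcXYZ"))     # both cases accepted
println(is_ascii_letters_only("abc45defg"))  # digits rejected
println(is_ascii_letters_only("Grüße"))      # non-English letters rejected
println(is_ascii_letters_only("a_b"))        # characters between 'Z' and 'a' rejected
```

As with any use of all, an empty string passes, so a validation step may want a separate isempty check.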

Orr_Orovan · June 10, 2020, 2:04am

Excellent point. I applied lowercase() to the user input to avoid this problem, but writing it the way you do removes this concern. Using the actual characters instead of their corresponding codes is clearer to read.
If I had to choose between ASCII codes and characters the next time I need this, I would look for a reason or circumstance that favors one over the other. I don’t know what that might be, if anything.
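
One circumstance where numeric codes might be the more natural choice is when the data is already handled as raw bytes. The sketch below assumes a plain String and uses hypothetical helper names. It checks the UTF-8 code units directly, which works because every byte of a non-ASCII character is 0x80 or higher and so falls outside both ranges.

```julia
# Hypothetical helpers (names not from the thread).
# 0x41-0x5A is 'A'-'Z' and 0x61-0x7A is 'a'-'z' in ASCII.
is_ascii_letter_byte(b::UInt8) = 0x41 <= b <= 0x5A || 0x61 <= b <= 0x7A

# codeunits(s) exposes the UTF-8 bytes of a String without decoding characters.
is_ascii_letters_bytes(s::String) = all(is_ascii_letter_byte, codeunits(s))

println(is_ascii_letters_bytes("abcXYZ"))   # only English letters
println(is_ascii_letters_bytes("abc45"))    # contains digits
println(is_ascii_letters_bytes("Grüße"))    # contains multi-byte characters
```

For ordinary text validation, the Char comparison is probably still the easier version to read.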
