Sorting names. Surprise

What am I missing here?
This is wrong.

julia> sort(["Vila Flor", "Vila da Flor"])
2-element Vector{String}:
 "Vila Flor"
 "Vila da Flor"

The correct output should be:

 "Vila da Flor"
 "Vila Flor"

What makes you say this is wrong?


Capitalization does matter here, since:

julia> 'F' < 'd'
true

Note that you can use the by keyword to apply your own preprocessing to each element before it is compared.
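To make the point above concrete, here is a minimal sketch of why the default order comes out that way and how by changes it. The variable names (places, default_order, folded_order) are just illustrative, and lowercase is used here only as one possible case-folding function:

```julia
places = ["Vila Flor", "Vila da Flor"]

# Strings are compared character by character, using Unicode code points.
# Every uppercase ASCII letter has a smaller code point than any lowercase one.
codepoints = (Int('F'), Int('d'))   # (70, 100), so 'F' sorts before 'd'

# Default sort: case-sensitive, "Vila Flor" comes first.
default_order = sort(places)

# Fold case before comparing: now 'd' vs 'f' decides, so "Vila da Flor" comes first.
folded_order = sort(places, by=lowercase)
```

The original strings are returned unchanged; by only affects the keys used for comparison.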


Thanks, that was it. I had tried Windows name sorting, which gave me what I was expecting, possibly because name comparison on Windows is case-insensitive.

julia> sort(["Vila Flor", "Vila da Flor"], by=uppercase)
2-element Vector{String}:
 "Vila da Flor"
 "Vila Flor"

Note that simply converting to uppercase may not sort correctly outside of ASCII (correct uppercasing is itself language-dependent). If you need proper locale-dependent sorting (“collation”), I think you can find it in the StrICU package.
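If pulling in a collation package is more than you need, a rough approximation is possible with the Unicode standard library alone. The sketch below is not real locale-aware collation: it assumes that stripping diacritics and folding case gives an acceptable order for your data, and sortkey is a hypothetical helper name, not part of any library:

```julia
using Unicode

# Hypothetical helper (an assumption for illustration, not a library function):
# compare on the accent-stripped, case-folded form first, and fall back to the
# original string to break ties deterministically.
sortkey(s) = (Unicode.normalize(s, stripmark=true, casefold=true), s)

letters = ["z", "ç", "a", "ã", "c", "e"]

# Plain code-point order: accented letters end up after "z".
codepoint_order = sort(letters)

# Accent-insensitive key: accented letters sort next to their base letters.
approx_order = sort(letters, by=sortkey)
```

This treats "ã" as "a" and "ç" as "c" for ordering purposes, which is only an approximation; languages with their own alphabet rules still need a real collation library.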

Right, but these are Portuguese names, so all inside ASCII. And I checked that the sorting now does what is expected when comparing the names to those of another file that were previously sorted.

It might be worth keeping in mind that:

julia> sort(["a", "ã", "â", "c", "ç", "e", "ê", "o", "õ", "z"], by=uppercase)
10-element Array{String,1}: