Sorting strings containing numbers, so that "A2" < "A10"?

“k(x)” in the statement “A = k(x)” means apply function k to the value of parameter x. You supply this value when you call the function natural(x,y).

“k(x)” in “k(x) = [occursin …” means “define the function k as k(x)=[occursin …”

When you call natural(“A2”, “A10”), the call A=k(x) results in A=[“A”, 2] and the call B=k(y) results in B=[“A”, 10]. Subsequently the string “A” is compared to the string “A” and the integer 2 is compared to the integer 10.
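
The split described above can be sketched roughly as follows. This is not the original definition of k (which is elided in this thread); the names to_chunk and split_chunks and the regex are assumptions made for illustration only.

```julia
# Hypothetical stand-in for k: all-digit pieces become Int, everything else stays text.
to_chunk(t) = all(isdigit, t) ? parse(Int, t) : String(t)

# Split a string into alternating runs of digits and non-digits.
split_chunks(s) = [to_chunk(m.match) for m in eachmatch(r"[0-9]+|[^0-9]+", s)]

split_chunks("A2")   # ["A", 2]
split_chunks("A10")  # ["A", 10]
```

Because 2 and 10 end up as integers rather than text, they compare numerically instead of character by character.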

Thanks. And what about the return inside the for loop, and the last line, return length(A) < length(B)? What do these return?

The length of zip(A, B) is the shorter of the lengths of A and B. If all the elements in the shorter array match the corresponding elements in the longer array, the for loop ends without returning anything. At that point the statement “return length(A) < length(B)” is executed. It returns true (that is x is less than y) if A has fewer elements than B.
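
A small sketch of that fall-through, assuming the keys have already been split into arrays. The helper name chunks_lt is hypothetical, and the sketch assumes mismatched elements at the same position are of the same type.

```julia
# Hypothetical helper: compare two already-split keys element by element.
function chunks_lt(A, B)
    for (a, b) in zip(A, B)
        # The first pair that differs decides the result.
        isequal(a, b) || return a < b
    end
    # Every paired element matched, so the shorter key sorts first.
    return length(A) < length(B)
end

A = ["A", 2]
B = ["A", 2, "b"]

length(zip(A, B))                # 2, the shorter of the two lengths
chunks_lt(A, B)                  # true: A is a strict prefix of B
chunks_lt(B, A)                  # false
chunks_lt(["A", 2], ["A", 10])   # true: decided inside the loop by 2 < 10
```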

Note that natural will ignore spaces. I suppose that may be a feature, though.

Note that NaturalSort.jl was moved to JuliaStrings and was updated to work with Julia 1.x.

So, I would tend to recommend the NaturalSort package for people who want this kind of ordering.

The code is great, but you might want to change sort to sort!, in order to get l sorted.
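
For reference, the difference between the two, shown with the default ordering and a made-up vector l:

```julia
l = ["A10", "A2", "A1"]

sorted_copy = sort(l)   # returns a new sorted vector; l itself is unchanged
l                       # still ["A10", "A2", "A1"]

sort!(l)                # sorts l in place
l                       # now ["A1", "A10", "A2"] (plain lexicographic order)
```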

@pfitzeb, both matchall and isnumber seem to have been deprecated. Would the following update be correct for Julia 1.5.2?

function sort_pfi(x::Vector{String})
    f = text -> all(isnumeric, text) ? Char(parse(Int, text)) : text
    sorter = key -> join(f(c) for c in collect(m.match for m in eachmatch(r"[0-9]+|[^0-9]+", key)))
    sort(x, by=sorter)
end

Thank you.

That works, although I’d get rid of the collect:

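function sort_pfi(x::Vector{String})
    # Map an all-numeric chunk to the Char with that code point, so that it
    # compares by value; this relies on the number being small enough for that.
    f = text -> all(isnumeric, text) ? Char(parse(Int, text)) : text
    # Rebuild each key from its digit and non-digit runs.
    sorter = key -> join(f(m.match) for m in eachmatch(r"[0-9]+|[^0-9]+", key))
    sort(x, by=sorter)
end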

Don’t use the code below; NaturalSort.jl is much better.

This is a more robust implementation (and works with bigger numbers):
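function natural_sort!(x::Vector{<:AbstractString})
    # Turn an all-digit chunk into an Int (or a BigInt on overflow); keep other text as is.
    function string_or_number(text)
        if all(isdigit, text)
            val = tryparse(Int, text)
            val === nothing || return val
            return something(tryparse(BigInt, text), text)
        end
        return text
    end

    # Compare two chunk vectors element by element; a strict prefix sorts first.
    function lt(x, y)
        for (a, b) in zip(x, y)
            if !isequal(a, b)
                return lt(a, b)
            end
        end
        return length(x) < length(y)
    end
    # Separate methods so that Int vs. BigInt does not fall back to the generic method.
    lt(x::Number, y::Number) = x < y
    lt(x::AbstractString, y::AbstractString) = x < y
    lt(x::Number, y::AbstractString) = true
    lt(x::AbstractString, y::Number) = false

    # Collect the chunks into a vector so that length is defined for each key.
    sorter = key -> [string_or_number(m.match) for m in eachmatch(r"[0-9]+|[^0-9]+", key)]
    sort!(x, by = sorter, lt = lt)
end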