Consider a DataFrame:
using DataFrames
using Random: randstring
M = 100_000_000
str_base = [randstring(8) for i in 1:1_000_000]
df = DataFrame(int = rand(Int32, M), float=rand(M), str = rand(str_base, M))
@time sort!(df, :int); 
# 80s on my machine
@time sort!(df, :str); 
# 170s on my machine
using CSV
CSV.write("tmp.csv", df)
The same sort on the integer column takes about 3s using R’s data.table:
library(data.table)
df = fread("tmp.csv")
setkey(df, "int") 
# 3s 
setkey(df, "str") 
# 25s 
So, based on this, data.table is still much faster.
Now, the sort! algorithm is really simple, and I can replicate it here:
]add https://github.com/xiaodaigh/SortingLab.jl
using SortingLab
using Base.Threads: @spawn
function another_sort!(df, col)
    @time ordering = fsortperm(df[!, col])
    channel_lock = Channel{Bool}(length(names(df)))
    for c in names(df)
        @spawn begin
            v = df[!, c]
            # Write back in place; `v = v[ordering]` would only rebind the local name
            v .= v[ordering]
            put!(channel_lock, true)
        end
    end
    for _ in names(df)
        take!(channel_lock)
    end
    df
end
@time another_sort!(df, :int); # fsortperm takes 10s; total is 12s~18s
@time another_sort!(df, :str); # fsortperm takes 10s; total is 12s~18s
You can see that (f)sortperm takes 10s. So using a more optimised sortperm like SortingLab.fsortperm already gives much better results.
The solution seems to be finding a more efficient sortperm. Adapting SortingLab.fsortperm would perhaps be a good start.
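To make the "compute a permutation, then reorder every column" idea concrete, here is a minimal sketch that uses only Base Julia. The helper name `sort_columns_by!` and its `permfn` keyword are hypothetical, made up for illustration; they are not part of DataFrames.jl or SortingLab.jl. It assumes all columns have the same length as the key.

```julia
using Base.Threads: @spawn

# Hypothetical helper (illustration only): reorder every vector in `cols`
# in place by the permutation that sorts `key`.
# `permfn` is whatever function computes the permutation (Base.sortperm here).
function sort_columns_by!(cols, key; permfn = sortperm)
    ordering = permfn(key)        # computed once, before any column is modified
    @sync for v in cols
        # v[ordering] allocates a permuted copy; `.=` writes it back into v
        @spawn v .= v[ordering]
    end
    return cols
end

ints = Int32[3, 1, 2]
strs = ["c", "a", "b"]
sort_columns_by!((ints, strs), ints)
```

With a DataFrame, something like `sort_columns_by!(eachcol(df), df[!, :int]; permfn = fsortperm)` should work in the same way, assuming `eachcol` yields the underlying column vectors and `fsortperm` accepts the key column's type. The point of the `permfn` keyword is that the permutation step, which dominates the runtime above, can be swapped out without touching the rest.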