Sort matrix based on the elements of a specific column

Hi
I’m new to Julia but I have some experience with Matlab.
I have an array A (a mix of numeric and text columns) and I want to sort its rows based on the values in a specific column of A.
In Matlab, the code is:
B = sortrows(A,column)
This sorts A based on the columns specified in the vector column. For example, sortrows(A,4) sorts the rows of A in ascending order based on the elements in the fourth column. sortrows(A,[4 6]) first sorts the rows of A based on the elements in the fourth column, then based on the elements in the sixth column to break ties.

Is there a similar command in Julia? I’ve been looking but haven’t found an answer yet.

Thanks in advance for the help!

A = rand(1:100, 3, 4)   # a random matrix
A[sortperm(A[:, 4]), :] # sorted by the 4th column
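
For the mixed numeric/text case in the question, the same indexing pattern should carry over to a `Matrix{Any}`, as long as the values in the sort column can be compared with each other. A small sketch with made-up data (the matrix `M` is purely illustrative), also showing `rev=true` for descending order:

```julia
# Illustrative data: column 1 is text, column 2 is numeric.
M = Any["pear"  3;
        "apple" 1;
        "fig"   2]

# Ascending by the numeric column 2.
asc = M[sortperm(M[:, 2]), :]

# Descending by the same column.
desc = M[sortperm(M[:, 2], rev=true), :]

# Sorting by the text column works too, since strings are comparable.
by_name = M[sortperm(M[:, 1]), :]
```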

Welcome to Julia!


Here is another:

sortrows(A, i, rev=false) = sortslices(A, dims=1, lt=(x,y)->isless(x[i],y[i]), rev=rev)
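
To mimic Matlab's `sortrows(A, [4 6])`, where later columns break ties, one option is to let `by` return the selected entries of each row, since vectors compare element by element. A sketch under that assumption (the name `sortrows_multi` and the sample matrix are made up for illustration):

```julia
# Hypothetical helper: sort rows by several columns, earlier columns first.
# r[cols] is a small vector of keys; vectors are compared lexicographically.
sortrows_multi(A, cols; rev=false) =
    sortslices(A, dims=1, by=r -> r[cols], rev=rev)

A = [2 9 1;
     1 5 3;
     2 4 7;
     1 8 0]

# Sort by column 1, breaking ties with column 2.
B = sortrows_multi(A, [1, 2])

# Same keys, descending order.
Brev = sortrows_multi(A, [1, 2], rev=true)
```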

If your array has a mix of numeric and text columns, you probably want to be using a DataFrame instead. The DataFrames package comes with methods for sorting by column name or index:

julia> using DataFrames
julia> df = DataFrame(a = ["b", "a", "a"], x = [4.2, 0, -5.1], 
           y = [3//2, -1//2, 15//16]);
julia> sort(df, :a)
3×3 DataFrame
│ Row │ a      │ x       │ y         │
│     │ String │ Float64 │ Rational… │
├─────┼────────┼─────────┼───────────┤
│ 1   │ a      │ 0.0     │ -1//2     │
│ 2   │ a      │ -5.1    │ 15//16    │
│ 3   │ b      │ 4.2     │ 3//2      │

julia> sort(df, 1)
3×3 DataFrame
│ Row │ a      │ x       │ y         │
│     │ String │ Float64 │ Rational… │
├─────┼────────┼─────────┼───────────┤
│ 1   │ a      │ 0.0     │ -1//2     │
│ 2   │ a      │ -5.1    │ 15//16    │
│ 3   │ b      │ 4.2     │ 3//2      │

or by multiple columns:

julia> sort(df, [:a, :x])
3×3 DataFrame
│ Row │ a      │ x       │ y         │
│     │ String │ Float64 │ Rational… │
├─────┼────────┼─────────┼───────────┤
│ 1   │ a      │ -5.1    │ 15//16    │
│ 2   │ a      │ 0.0     │ -1//2     │
│ 3   │ b      │ 4.2     │ 3//2      │
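
When the columns need different directions, `sort` on a DataFrame also accepts a vector for `rev`, one entry per sort column. A brief sketch reusing the same table, with `sort!` as the in-place variant:

```julia
using DataFrames

# Same table as in the example above.
df = DataFrame(a = ["b", "a", "a"], x = [4.2, 0, -5.1],
               y = [3//2, -1//2, 15//16])

# Ascending by :a, then descending by :x among rows with equal :a.
sorted = sort(df, [:a, :x], rev = [false, true])

# In-place variant: sort! reorders df itself instead of returning a copy.
sort!(df, :x)
```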

I had the same issue, but I needed to break ties and didn’t want to use DataFrames. Here is a suggestion (the default behavior is like Matlab’s, otherwise one can specify a matrix of elements to sort on):

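function sortrows(M, by=zeros(0))
    # With no `by` given, use all columns of M as sort keys.
    if by == zeros(0)
        order = copy(M)
    else
        order = copy(by)
    end
    if size(order,2) > 1
        # Shift each key column so its minimum is 0 ...
        order = Float64.(order.-minimum(order, dims = 1))
        # ... scale it to [0, 1], then collapse the columns into a single
        # score in which earlier columns get larger weights.
        order = (order./maximum(order,dims=1))*(10).^(size(order,2):-1:1)
    end
    # Permutation that sorts the score (or the single key column).
    order = sortperm(order[:,1])
    return M[order,:], order
end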

A = [4 1 1; 3 2 5;2 4 3]'
sorted_A, order = sortrows(A,A[:,[1,3]])

julia> sorted_A
3×3 Array{Int64,2}:
 1  5  3
 1  2  4
 4  3  2
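
Note that collapsing the key columns into one weighted score only approximates a lexicographic sort: a small difference in the first key can be outweighed by a large difference in the second, and a constant key column produces `NaN` through a 0/0 division. A sketch of an alternative that compares the key rows directly, keeping the same two return values (the name `sortrows_lex` is hypothetical):

```julia
# Hypothetical variant: sort the rows of M lexicographically on the rows of `by`.
# `by` defaults to M itself, so all columns are used, left to right.
function sortrows_lex(M, by=M)
    # Each row of `by` is compared as a vector, element by element.
    order = sortperm(collect(eachrow(by)))
    return M[order, :], order
end

A = [4 1 1; 3 2 5; 2 4 3]'
sorted_A, order = sortrows_lex(A, A[:, [1, 3]])
```

On this example it gives the same ordering as the output shown above, and because no arithmetic is done on the keys, it should also work for text columns.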

In Julia 1.4.2:

# generate a random matrix
A = rand(4,3)

# sort rows (dims=1) by column 2
sortslices(A,dims=1,by=x->x[2],rev=false)

# sort rows (dims=1) by column 2, in reverse order
sortslices(A,dims=1,by=x->x[2],rev=true)

Out of curiosity, it’s been a few years – do you all suppose this is still the most straightforward approach?

Perhaps we should season the solution with views:

@views A[sortperm(A[:, 4]), :]
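
One thing to keep in mind: `@views` applies to every indexing expression on the line, so the result is itself a view into `A` rather than a new matrix. If an independent copy is wanted while still avoiding the temporary column, `@view` can be applied to the column alone. A short sketch with made-up data:

```julia
A = [3 30; 1 10; 2 20]

# Everything is a view: B aliases A's memory.
B = @views A[sortperm(A[:, 1]), :]

# Only the key column is a view: C is an independent, sorted copy.
C = A[sortperm(@view A[:, 1]), :]

A[2, 2] = -1   # mutate the original

B[1, 2]        # -1, because B still points into A
C[1, 2]        # 10, because C was copied before the mutation
```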

An arguably cleaner version that is also easier to generalize (note that `stack` requires Julia 1.9 or later):

B = stack(sort(eachrow(A), by=r -> r[4]); dims=1)

Optics are also nice here:

using AccessorsExtra
B = @modify(eachrow(A)) do rows
   sort(rows, by=r -> r[4])
end