Sorting matrix according to the first column

In the matrix below, the first, second, and third columns represent the height, area, and volume of rectangular objects, respectively. I need to sort the items in non-ascending order of height (from the highest to the lowest). After sorting, the associated area and volume for each item should remain correctly aligned with its height.

4.27 122.62 102.83
2.18 178.34 214.79
29.58 273.83 840.17
18.99 89.68 683.06
10.77 269.75 1928.60

For example, considering this matrix, I need the following matrix sorted:

29.58 273.83 840.17
18.99 89.68 683.06
10.77 269.75 1928.60
4.27 122.62 102.83
2.18 178.34 214.79

How should I proceed?

Try:

julia> sortslices(A, dims=1, rev=true)
5Γ—3 Matrix{Float64}:
 29.58  273.83   840.17
 18.99   89.68   683.06
 10.77  269.75  1928.6
  4.27  122.62   102.83
  2.18  178.34   214.79
4 Likes

Another option:

julia> @view A[sortperm(A[:,1]; rev=true),:]
5Γ—3 view(::Matrix{Float64}, [1, 2, 3, 4, 5], :) with eltype Float64:
 29.58  273.83   840.17
 18.99   89.68   683.06
 10.77  269.75  1928.6
  4.27  122.62   102.83
  2.18  178.34   214.79

This doesn’t materialize the sorted matrix which may be more efficient for the correct use case.

1 Like

for this small matrix the first expression seems more efficient (but perhaps for larger matrices the others work better)

julia> using BenchmarkTools

julia> @btime stack(sort(eachrow(A),by=first, rev=true),dims=1)
  375.879 ns (3 allocations: 464 bytes)
5Γ—3 Matrix{Float64}:
 29.58  273.83   840.17
 18.99   89.68   683.06
 10.77  269.75  1928.6
  4.27  122.62   102.83
  2.18  178.34   214.79

julia> @btime sortslices(A, dims=1, rev=true)
  926.667 ns (12 allocations: 1008 bytes)
5Γ—3 Matrix{Float64}:
 29.58  273.83   840.17
 18.99   89.68   683.06
 10.77  269.75  1928.6
  4.27  122.62   102.83
  2.18  178.34   214.79

julia> @btime @view A[sortperm(A[:,1]; rev=true),:]
  576.023 ns (5 allocations: 272 bytes)
5Γ—3 view(::Matrix{Float64}, [3, 4, 5, 1, 2], :) with eltype Float64:
 29.58  273.83   840.17
 18.99   89.68   683.06
 10.77  269.75  1928.6
  4.27  122.62   102.83
  2.18  178.34   214.79

Oooh, I forgot another @view:

@view A[sortperm(@view A[:,1]; rev=true),:]

(maybe this is better)

Cleaner to use @views A[sortperm(A[:,1]; rev=true),:]

@Dan when you state that this does not materialize, could you explain it a bit more?

Does this mean that instead of sorting an array/matrix/whatever, one can get a view of the element in a sorted order?

And if I update elements of this view, then it should propagate correctly to the initial element?

Kind regards

Seems you got it right:

Yes

Yes

β€œdoes not materialize” means does not have to allocate memory and set it to these values, but the required values are calculated on the CPU from the inputs as needed. Basically, this is the advantage of views, which just tell the CPU where to find the original array and how to calculate a modified index into the original array.

BTW, on my machine, adding the missing @view, using stevengj’s style, benchmarks more than 5x the benchmark without the second @view:

julia> @btime @view $A[sortperm($A[:,1]; rev=true),:]
  230.680 ns (2 allocations: 192 bytes)
5Γ—3 view(::Matrix{Float64}, [1, 2, 3, 4, 5], :) with eltype Float64:
 29.58  273.83   840.17
:
julia> @btime @views $A[sortperm($A[:,1]; rev=true),:]
  33.598 ns (1 allocation: 96 bytes)
5Γ—3 view(::Matrix{Float64}, [1, 2, 3, 4, 5], :) with eltype Float64:
 29.58  273.83   840.17
:

33 ns vs. 230 ns

2 Likes

Just note that this is not free. Each array access now requires an additional load. If you plan to access the sorted array a great number of times, then this additional cost might add up and you could come out slower.

julia> using BenchmarkTools; v = rand(1000); inds = rand(1:1000, 10000);
julia> @btime let s = 0, v = view($v, sortperm($v))
        for i in $inds
            s+= v[i]
        end
        s
    end

  72.009 ΞΌs (2 allocations: 15.88 KiB)
4971.880280956974

julia> @btime let s = 0, v = sort($v)
        for i in $inds
            s+= v[i]
        end
        s
    end

  58.389 ΞΌs (3 allocations: 18.06 KiB)
4971.880280956974

Interestingly enough if I directly interpolate the array/view then there is no runtime difference:

julia> @btime let s = 0
       for i in $inds
           s+= $(sort(v))[i]
       end
       s
       end
  39.392 ΞΌs (0 allocations: 0 bytes)
4971.880280956974

julia> @btime let s = 0
       for i in $inds
           s+= $(view(v,sortperm(v)))[i]
       end
       s
       end

  39.462 ΞΌs (0 allocations: 0 bytes)
4971.880280956974
1 Like

Linking similar thread.

in fact it seemed strange that it could be an β€œoriginal” topic :grin:

I tried various combinations of the parameters of the various expressions.
This seems to be the most efficient

@btime @views $A[sortperm($A[:,1]; lt=(x,y)->y<x),:]

Dear colleagues, thank you very much for your support. Bruno

If what is really needed is the operation on matrices, others have already provided their solutions. But I can’t resist to point out that if your matrix represents data with this column interpretation, you may want to consider using some higher-level data type for your data. Namely, DataFrame from DataFrames.jl package. Have a look how convenient and systematic this sorting of your table is:

julia> using DataFrames

julia> data = [4.27 122.62 102.83;
               2.18 178.34 214.79;
               29.58 273.83 840.17;
               18.99 89.68 683.06;
               10.77 269.75 1928.60]
5Γ—3 Matrix{Float64}:
  4.27  122.62   102.83
  2.18  178.34   214.79
 29.58  273.83   840.17
 18.99   89.68   683.06
 10.77  269.75  1928.6

julia> df = DataFrame(height = data[:, 1], area = data[:, 2], volume = data[:, 3])
5Γ—3 DataFrame
 Row β”‚ height   area     volume  
     β”‚ Float64  Float64  Float64 
─────┼───────────────────────────
   1 β”‚    4.27   122.62   102.83
   2 β”‚    2.18   178.34   214.79
   3 β”‚   29.58   273.83   840.17
   4 β”‚   18.99    89.68   683.06
   5 β”‚   10.77   269.75  1928.6

julia> println(df)
5Γ—3 DataFrame
 Row β”‚ height   area     volume  
     β”‚ Float64  Float64  Float64 
─────┼───────────────────────────
   1 β”‚    4.27   122.62   102.83
   2 β”‚    2.18   178.34   214.79
   3 β”‚   29.58   273.83   840.17
   4 β”‚   18.99    89.68   683.06
   5 β”‚   10.77   269.75  1928.6

julia> sort!(df, :height, rev=true)
5Γ—3 DataFrame
 Row β”‚ height   area     volume  
     β”‚ Float64  Float64  Float64 
─────┼───────────────────────────
   1 β”‚   29.58   273.83   840.17
   2 β”‚   18.99    89.68   683.06
   3 β”‚   10.77   269.75  1928.6
   4 β”‚    4.27   122.62   102.83
   5 β”‚    2.18   178.34   214.79

julia> println(df)
5Γ—3 DataFrame
 Row β”‚ height   area     volume  
     β”‚ Float64  Float64  Float64 
─────┼───────────────────────────
   1 β”‚   29.58   273.83   840.17
   2 β”‚   18.99    89.68   683.06
   3 β”‚   10.77   269.75  1928.6
   4 β”‚    4.27   122.62   102.83
   5 β”‚    2.18   178.34   214.79
5 Likes