How to sort two or more lists at once?

Let’s say you have:

cur_x = shuffle(1:10)
cur_y = rand(10)

And you want to sort cur_x in the usual fashion, but drag cur_y along for the ride.

How do you – in one line – zip(cur_x, cur_y) and sort! by cur_x?

// so in the end it looks like:

cur_x = [ 1 2 3 4 5 6 7 8 9 10 ]

p = sortperm(cur_x); cur_x .= cur_x[p]; cur_y .= cur_y[p] ?

2 Likes

Seems like a pretty common operation.

Is there anything more succinct than that?

Does

sort(collect(zip(cur_x, cur_y)); by=first)

get you closer to where you want to be?

3 Likes

I definitely like that, but it’s already ~40 characters? // and not done in place

How would you succinctly unzip it for assignment then?


edit: I guess something like this works:

cur_x, cur_y = map(collect,zip(sort(collect(zip(cur_x, cur_y)); by=first)...))

but that’s ~80 characters and I’d probably just go with @stillyslalom’s answer then?

If you want in-place, definitely go with @stillyslalom’s answer.

Also, I am not sure that being more “succinct” than this is a good goal (even if it was feasible), but I generally prize clarity more than brevity. Just now I was upgrading some package of mine for v0.7, and found that one of the unit tests was broken, but frankly I spent an hour figuring out what it does… and I wrote it 5 months ago.

3 Likes

A variation of the sortperm approach is

cur_x, cur_y = getindex.((cur_x, cur_y), (sortperm(cur_x),))

This could potentially be more succinct with dotted indexing (see e.g. https://github.com/JuliaLang/julia/issues/22858) but it has its complications.

This solution is not in place but on the other hand cur_x .= cur_x[p] computes a temporary on the right hand side, which is then discarded, so unless it’s important not to replace the original arrays there’s no big difference.

“in-place” is also more disaster-prone.

1 Like

This. Looking back at Perl, it was meant to read like English (sorry, I know one should not be English-centric). However look what happens to Perl when people put in shall we say ‘ingenious’ constructs to serve brevity.
One day an Obfuscated Julia contest may be fun, but I honestly think Obfuscated Perl contests did the language no favours.

Lets also consider Python. One of the joys I find with Python is that you can understand the program flow even if you are not familiar with the language.
So a small plea from me. Do not try to compress the character count down to the point of losing the meaning.
Bits are cheap these days. Remember “Premature optimization is the root of all evil” and I think this counts for the amount of space program code and also data structures consume.

2 Likes

Thanks for the suggestions! Made a pair of functions from these called: sort_lists! and shuffle_lists!:

function sort_lists!(main_list, other_lists...; cur_func::Function=identity)
    cur_indices = sortperm(main_list, by=cur_func)

    cur_list = main_list[cur_indices]
    _arrange_lists!(main_list, other_lists, cur_indices)
    cur_list
end
function shuffle_lists!(main_list, other_lists...)
    cur_indices = shuffle(1:length(main_list))

    cur_list = main_list[cur_indices]
    _arrange_lists!(main_list, other_lists, cur_indices)
    cur_list
end
function _arrange_lists!(main_list, other_lists, cur_indices)
    cur_lists = [ collect(main_list) ]
    isempty(other_lists) || append!(cur_lists, other_lists)
    
    for cur_list in cur_lists
        cur_list .= cur_list[cur_indices]
    end   
end

Quickly testing them out:

using Plots
a = linspace(-1,1,11)

b = (x -> x).(a)
c = (x -> x^2).(a)
d = (x -> x^3).(a)

p1 = plot(a,[b,c,d], legend=false)

a = shuffle_lists!(a,b,c,d)
p2 = plot(a,[b,c,d], legend=false)

a = sort_lists!(a,b,c,d)
p3 = plot(a,[b,c,d], legend=false)

plot(p1,p2,p3,layout=@layout([p1 p2 p3]))

list_sorter

For this case, you probably want to put things in a table:

using IndexedTables
s = table(rand(10), rand(10), rand(10))
sort(s, 1)
sort!(s, 2)

Extra benefit: you can sort by more than one column at ones (useful if there are repeated values).

1 Like

At this point you could as well put the two vectors in a data frame and sort its rows.

This is a bit old, but let me just point out that Julia beautifully allows one to solve this problem by defining a custom type that holds a number of arrays, and teaching it how to sort itself. Then sort! can sort using the values in the first array, but updating the order for the second array too. This allows a solution that is faster, and with less allocations, than the sortperm approach. But of course it requires quite a bit more code

struct CoSorterElement{T1,T2}
    x::T1
    y::T2
end
struct CoSorter{T1,T2,S<:AbstractArray{T1},C<:AbstractArray{T2}} <: AbstractVector{CoSorterElement{T1,T2}}
    sortarray::S
    coarray::C
end

Base.size(c::CoSorter) = size(c.sortarray)
Base.getindex(c::CoSorter, i...) = 
    CoSorterElement(getindex(c.sortarray, i...), getindex(c.coarray, i...))
Base.setindex!(c::CoSorter, t::CoSorterElement, i...) = 
    (setindex!(c.sortarray, t.x, i...); setindex!(c.coarray, t.y, i...); c) 
Base.isless(a::CoSorterElement, b::CoSorterElement) = isless(a.x, b.x)
Base.Sort.defalg(v::C) where {T<:Union{Number, Missing}, C<:CoSorter{T}} = 
    Base.DEFAULT_UNSTABLE

With this you can do

julia> cur_x = rand(1:10, 10); cur_y = rand(10); using BenchmarkTools

julia> c = CoSorter(cur_x, cur_y); @btime sort!($c)
  30.111 ns (0 allocations: 0 bytes)
10-element CoSorter{Int64,Float64,Array{Int64,1},Array{Float64,1}}:
 CoSorterElement{Int64,Float64}(1, 0.40265954276014937)  
 CoSorterElement{Int64,Float64}(3, 0.07364824869474873)  
 CoSorterElement{Int64,Float64}(4, 0.7341851985737331)   
 CoSorterElement{Int64,Float64}(5, 0.0010439543093452297)
 CoSorterElement{Int64,Float64}(6, 0.8187812476982932)   
 CoSorterElement{Int64,Float64}(6, 0.8235149239112314)   
 CoSorterElement{Int64,Float64}(7, 0.891985683971432)    
 CoSorterElement{Int64,Float64}(7, 0.5556465187657322)   
 CoSorterElement{Int64,Float64}(8, 0.9217791949844834)   
 CoSorterElement{Int64,Float64}(10, 0.32895616976956177) 

The original arrays have now been co-sorted in-place.

For comparison, I think the next fastest is @anon94023334’s solution

julia> @btime sort(collect(zip($cur_x, $cur_y)); by=first);
  111.781 ns (5 allocations: 608 bytes)

EDIT: some corrections

6 Likes