Modify `r`th largest entry of vector in place

I can use sortperm() to get a permutation index that puts a in order:

3-element Array{Float64,1}:
 0.9546235783735173
 0.15267335513843228
 0.3175124947454555

julia> order = sortperm(a)
3-element Array{Int64,1}:
 2
 3
 1

julia> a[order]
3-element Array{Float64,1}:
 0.15267335513843228
 0.3175124947454555
 0.9546235783735173

I would like to do something like this to have the same effect as a[1] = 1, i.e. modify the largest entry of a in place.

julia> a[order][3] = 1
1

julia> a
3-element Array{Float64,1}:
 0.9546235783735173
 0.15267335513843228
 0.3175124947454555

Other than by making a copy of the sorted a and using invperm(), how can I accomplish this?

You would probably do findfirst for this. Something like

order = sortperm(a)
ind = findfirst(==(3), order)
a[ind] = 199

This is equivalent to using invperm()–actually, slower, because with invperm() we can “find” all the inds at the outset, which saves computation over many iterations.

I think the solution is a[order[3]] = 1, but I might just be getting lucky with permutation of periodicity 2.

What do you mean? This avoids making a copy of the sorted a, which is what you wanted, correct?

I am trying to cut down on both memory usage and computation cost.

Suppose a has n entries and we want to make k modifications. sortperm() costs n operations and each findfirst() (using linear search) costs another n, for a total cost of n + nk.

On the other hand, invperm() costs n operations and we only need to do it once, for a total cost of 2n + k.

On the other hand, if the code in the OP works, it should cost only n + k operations, if I am not mistaken.

To be clear, what I mean by the invperm() approach is

a_sorted = a[order]
a_sorted[3] = 1
a = a_sorted[invperm(order)]

Then why not just sort a and work with that?

While we’re at it, why not sort all vectors? The original order of a is meaningful.

For example, suppose a is a vector of scores on an exam, and the professor decides to bump up the bottom 10 scores to a 50%. The modified scores need to be associated with the names of the original students.

(This is a trivial example that could be coded easily with a list comprehension, but it illustrates the point.)

I’m confused. sort(a) returns a fresh vector, it doesn’t modify a at all. I’m proposing

a_sorted = sort(a)
third_highest = a_sorted[3]

A apologize for the confusion. I should have clarified that I was proposing making a copy. I’m not proposing sort!, which would modify in place.

Thanks for slogging through it. After further experimentation, my previous idea of a[order[3]] = 1 turns out to be correct.

The thing is, I do want to modify a, in place, using a function whose behavior depends on which “rank” of a on which it operates.

julia> a = rand(10)
10-element Array{Float64,1}:
 0.9679461796327264
 0.06561999889746284
 0.11738337626095086
 0.7960588746091037
 0.8587013475046998
 0.10976984317050542
 0.24876025355689024
 0.8062624044798408
 0.2840384536852898
 0.8210625203870776

Suppose I want to replace the smallest entry with the number 1, the second smallest with the number 64, and the eighth smallest with its logarithm.

Notice that I can do this, without copying a or doing any searching other than the original sort, as follows:

julia> order = sortperm(a)
10-element Array{Int64,1}:
  2
  6
  3
  7
  9
  4
  8
 10
  5
  1

julia> a[order[1]] = 1
1

julia> a[order[2]] = 64
64

julia> a[order[8]] = log(a[order[8]])
-0.19715602092229395

julia> a
10-element Array{Float64,1}:
  0.9679461796327264
  1.0
  0.11738337626095086
  0.7960588746091037
  0.8587013475046998
 64.0
  0.24876025355689024
  0.8062624044798408
  0.2840384536852898
 -0.19715602092229395

And the unmodified entries of a still appear in the unmodified order.

2 Likes

Yes. Remember that initially you wrote a[order][3], which is very different than a[order[3]]

They don’t look very different from here!

julia> a = rand(100); order = sortperm(a);

julia> a[order[26]], a[order][26]
(0.2968121750980379, 0.2968121750980379)

This is why I have the [scope] tag on this post—I’m not sure why the first one enables me to target an entry of a for modification in a way that the second doesn’t.

No, in terms of performance. a[order][3] first allocates the array a[order]. A whole vector the same size of a. a[order[3]] just does the getindex.

2 Likes

The reason is the order of operations. The first one is equivalent to

i = order[26]
a[i] = 0 # a is modified

The second is equivalent to

b = a[order]
b[26] = 0 # a is not modified

Interestingly, you could make b from above be a view of a (unfortunately adding some overhead to each setindex operation, I should note)

order = sortperm(a)
b = @view a[order]
b[26] = 0 # a is also modified!
2 Likes