How to use the `by` argument to `searchsortedlast()`

I am trying to find the index of the last element of `a` such that `f(a) <= 10`. You can do this with `findlast`:

``````julia> a = [1, 2, 3, 4, 5]
5-element Vector{Int64}:
1
2
3
4
5

julia> findlast(x -> x^2 ≤ 10, a)
3
``````

However, I know that my input is sorted and that `f(a)` is a monotonic transformation, so I would like to use binary search instead. I wrote my own binary search routine inline, but then discovered that Julia has a `searchsortedlast()` function that should serve my purposes. However, I can’t get seem to get the `by` argument to work:

``````julia> searchsortedlast(a, 10, by= x->x^2)
5
``````

This is identical to the output of `searchsortedlast(a, 10)`.

Of course, you can transform `a` locally as follows:

``````julia> searchsortedlast(a .^ 2, 10)
3
``````

But this defeats the purpose of binary search because it requires you to evaluate `a[i]^2` at every index.

Hi, that’s an interesting find. Probably warants a PR to the docs if not a discussion about the expected behavior of this. note to self: search for existing issues before investigating

This is what happens:
You end up calling this function:

``````function searchsortedfirst(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::keytype(v) where T<:Integer
u = T(1)
lo = lo - u
hi = hi + u
@inbounds while lo < hi - u
m = midpoint(lo, hi)
if lt(o, v[m], x)
lo = m
else
hi = m
end
end
return hi
end
``````

Where `lo` and `hi` are your first and last indices respectively and `o` is an ordering that stores your transformation. In this case it’s the `By` ordering here.
The interesting part happens in the line

``````if lt(o, v[m], x)
``````

which calls these two:

``````lt(o::By, a, b) = lt(o.order,o.by(a),o.by(b))
...
lt(o::ForwardOrdering, a, b) = isless(a,b)
``````

Which in a condensed form means it checks `isless(by(v[m]) , by(x))` at each step where `by` is the transformation you gave it.
So it really just applies the transformation to the given `x` value, too, which is probably not what you expected. If that’s intended or not needs to be answered by someone else though

Edit:
There are at least two related issues already:
https://github.com/JuliaLang/julia/issues/35381
https://github.com/JuliaLang/julia/issues/9429

1 Like

it really just applies the transformation to the given `x` value, too

Well that’s … interesting. I have been wracking my brain for a use case for this and the best I can come up with is a “database query” type of task along the lines of, “In this list of employees which is sorted by income, who is the highest-paid employee who makes less money than Sam, whose salary is 20?”

``````salaries = [
("Bob", 10)
("Bill", 15)
("Alice", 50)
("Junior", 800)
]
searchsortedlast(salaries, ("Sam", 20), by= x->x[2])
# 2
``````

So basically, you have to know ahead of time a “left inverse” of your `by` function at the target value. In the search task in the OP, this is equivalent to doing `searchsortedlast(a, sqrt(10), by = x->x^2)`.

In my case, `f()` is basically a black box: `a` is an index of target objective values for an optimization problem, `f(a)` is a function that checks if a feasible solution exists attaining that objective, and the problem is to find the largest `a` that admits a feasible solution (link to Github code).

It would be nice to have an interface that works like `findfirst()` here.

Here is an awful workaround:

``````a = [1, 2, 3, 4, 5]
my_square(x::NamedTuple) = x.k
my_square(x::Number) = x^2
searchsortedlast(a, (; k=10), by = my_square)
# 3
``````

Edit: Oh, I see your edit now. Thanks!

Edit 2: This is quite a rabbit hole:

It seems like there have been many PRs to add this functionality, but most of them made suboptimal implementation choices or misleading keyword argument names. I hope a more consistent search/sort interface can be a priority for 2.0.

Agreed

I guess the takeaway is to ignore the `by` keyword altogether, which wouldn’t be that bad imho. It just needs a tiny nudge in the right direction in the docstrings of these functions.

Gotta check if there’s an issue set for 2.0… Edit: found nothing, may write something when I have time