However, I know that my input is sorted and that f(a) is a monotonic transformation, so I would like to use binary search instead. I wrote my own binary search routine inline, but then discovered that Julia has a searchsortedlast() function that should serve my purposes. However, I can’t get seem to get the by argument to work:
julia> searchsortedlast(a, 10, by= x->x^2)
5
This is identical to the output of searchsortedlast(a, 10).
Of course, you can transform a locally as follows:
julia> searchsortedlast(a .^ 2, 10)
3
But this defeats the purpose of binary search because it requires you to evaluate a[i]^2 at every index.
Hi, that’s an interesting find. Probably warants a PR to the docs if not a discussion about the expected behavior of this. note to self: search for existing issues before investigating
This is what happens:
You end up calling this function:
function searchsortedfirst(v::AbstractVector, x, lo::T, hi::T, o::Ordering)::keytype(v) where T<:Integer
u = T(1)
lo = lo - u
hi = hi + u
@inbounds while lo < hi - u
m = midpoint(lo, hi)
if lt(o, v[m], x)
lo = m
else
hi = m
end
end
return hi
end
Where lo and hi are your first and last indices respectively and o is an ordering that stores your transformation. In this case it’s the By ordering here.
The interesting part happens in the line
lt(o::By, a, b) = lt(o.order,o.by(a),o.by(b))
...
lt(o::ForwardOrdering, a, b) = isless(a,b)
Which in a condensed form means it checks isless(by(v[m]) , by(x)) at each step where by is the transformation you gave it.
So it really just applies the transformation to the given x value, too, which is probably not what you expected. If that’s intended or not needs to be answered by someone else though
it really just applies the transformation to the given x value, too
Well that’s … interesting. I have been wracking my brain for a use case for this and the best I can come up with is a “database query” type of task along the lines of, “In this list of employees which is sorted by income, who is the highest-paid employee who makes less money than Sam, whose salary is 20?”
So basically, you have to know ahead of time a “left inverse” of your by function at the target value. In the search task in the OP, this is equivalent to doing searchsortedlast(a, sqrt(10), by = x->x^2).
In my case, f() is basically a black box: a is an index of target objective values for an optimization problem, f(a) is a function that checks if a feasible solution exists attaining that objective, and the problem is to find the largest a that admits a feasible solution (link to Github code).
It would be nice to have an interface that works like findfirst() here.
It seems like there have been many PRs to add this functionality, but most of them made suboptimal implementation choices or misleading keyword argument names. I hope a more consistent search/sort interface can be a priority for 2.0.
I guess the takeaway is to ignore the by keyword altogether, which wouldn’t be that bad imho. It just needs a tiny nudge in the right direction in the docstrings of these functions.
Gotta check if there’s an issue set for 2.0… Edit: found nothing, may write something when I have time