Searchsorted by attribute

Suppose I have an array inventory::Vector{InventoryObject} that is sorted by the name attribute of each element (sort with by=(obj->obj.name))

The InventoryObject is defined as

@kwdef struct InventoryObject
    name::String
    domain::String = "jl"
    role::String
    priority::Int64 = 1
    uri::String
    dispname::String = "-"
end

Is there some way to use searchsorted to find the range of indices for elements with a given x=name? I’m not sure I really understand the by and lt arguments to searchsorted, but they don’t seem to be completely helpful here. I’m missing a way to apply a transformation to each element of inventory before comparing to the search value x.

The only idea I’ve been able to come up with so far is to define a view like

struct InventoryNameView <: AbstractArray{Int, 1}
    inventory::Vector{InventoryObject}
end

Base.size(view:: InventoryNameView) = size(view.inventory)
Base.getindex(view:: InventoryNameView, i::Int) = view.inventory[i].name

and call searchsorted on the InventoryNameView. Did I overlook something?

P.S.: I’d prefer to do this without adding new dependencies, with standard-library functions only.

Isn’t it just searchsorted(xs, x, by=x->x.name)?

No, the by would get applied to x as well, as far as I can tell:

julia> inventory = [
       InventoryObject(name="a", role="any", uri="-"),
       InventoryObject(name="b", role="any", uri="-"),
       InventoryObject(name="b", role="func", uri="-"),
       InventoryObject(name="b", role="macro", uri="-"),
       InventoryObject(name="c", role="any", uri="-")
       ]

julia> searchsorted(inventory, "b"; by=(x -> x.name))
ERROR: type String has no field name

To be fair, my initial intuition also was that this should work. The documentation of searchsorted isn’t exactly clear on this, so this error might even be a bug.

Oh, actually, this might work:

julia> searchsorted(inventory, InventoryObject(name="b", role="", uri=""); by=(x -> x.name))
2:4

So does this:

julia> struct DummyValue
       name::String
       end

julia> searchsorted(inventory, DummyValue("b"); by=(x -> x.name))
2:4

Although I’m not sure it’s exactly elegant.

It also means that strictly speaking, the docstring of searchsorted is lying when it says “Return the range of indices of a which compare as equal to x”, since the values in the returned range clearly do not compare as equal to x!

cc @Lilith

1 Like

Jeremie Knuesel (@knuesel on gihub) fixed the docstring to accurately document this behavior in https://github.com/JuliaLang/julia/pull/48387. In 1.10, the docstring reads

help?> searchsorted
search: searchsorted searchsortedlast searchsortedfirst

  searchsorted(v, x; by=identity, lt=isless, rev=false)

  Return the range of indices in v where values are equivalent to x, or an
  empty range located at the insertion point if v does not contain values
  equivalent to x. The vector v must be sorted according to the order defined
  by the keywords. Refer to sort! for the meaning of the keywords and the
  definition of equivalence. Note that the by function is applied to the
  searched value x as well as the values in v.

  The range is generally found using binary search, but there are optimized
  implementations for some inputs.

  See also: searchsortedfirst, sort!, insorted, findall.

  Examples
  ≡≡≡≡≡≡≡≡

  julia> searchsorted([1, 2, 4, 5, 5, 7], 4) # single match
  3:3
  
  julia> searchsorted([1, 2, 4, 5, 5, 7], 5) # multiple matches
  4:5
  
  julia> searchsorted([1, 2, 4, 5, 5, 7], 3) # no match, insert in the middle
  3:2
  
  julia> searchsorted([1, 2, 4, 5, 5, 7], 9) # no match, insert at end
  7:6
  
  julia> searchsorted([1, 2, 4, 5, 5, 7], 0) # no match, insert at start
  1:0
  
  julia> searchsorted([1=>"one", 2=>"two", 2=>"two", 4=>"four"], 2=>"two", by=first) # compare the keys of the pairs
  2:3

The existing comparison archetecture does not support the notion of an opject that the by function has already been applied to. However, this could become possible after Use shared pre-computation for `by` and `perm` orderings. by LilithHafner · Pull Request #52033 · JuliaLang/julia · GitHub.

2 Likes