I’m pleased to introduce the newest BioJulia package: KWayMerges.jl - currently undergoing registration.
This tiny package provides the KWayMerger
type, which lazily merges multiple sorted iterators into a single, sorted stream. It’s generally faster than appending all the iterators and calling sort!
.
For example:
julia> using KWayMerges
julia> vs = [[4,6], [7,9], [1,5]];
julia> [last(i) for i in KWayMerger(vs)]
6-element Vector{Int64}:
1
4
5
6
7
9
Currently, it’s implemented using a heap. The Wikipedia article claims a tournament tree is faster, so soon-ish I’ll try re-implementing the KWayMerger
using one of those and see if it’s faster. But for now, I just wanted something that I knew worked.
10 Likes
This maybe a basic Julia question, but:
help?> KWayMerger
...
This iterator yields (index::Int, x::T) elements
Is there a reason (performance or otherwise) that it can’t yield @NamedTuple{fromiter::Int, value::T}
elements instead? Remembering what the two tuple elements mean is a small thing in isolation, but can get pretty annoying and negatively affect readability when part of a large complex codebase. NamedTuples would still allow numeric-index-based and first
/last
access if preferred, but also allow nicer, more ergonomic access and display. I don’t know what the other side of the tradeoff is though, if any.
That’s a good idea. I should probably also add a sort-like API supporting by
, lt
, rev
and order
, if those are zero-cost.
1 Like