PartitionBy, retaining key

Unfortunately, stateful transducers like Unique cannot be used after a parallelizable transducer like ReducePartitionBy. It's kind of a cost of parallelizability. There could be a better design that allows this, but it's a bit tricky to do at the moment.

(Though the unwrap method error is actually a bug. Thanks for sharing the code!)

Meanwhile, I think the easiest approach might be to just cook up your partitionby using FGenerators:

julia> using FGenerators

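julia> @fgenerator function partitionby(f, xs)
           buffer = eltype(xs)[]
           key = f(first(xs))  # assumes xs is non-empty
           for x in xs
               y = f(x)
               if !isequal(y, key)
                   # Key changed: emit the finished group, then reuse the buffer.
                   @yield key => buffer
                   empty!(buffer)
                   key = y
               end
               push!(buffer, x)
           end
           # Emit the last group, which the loop never yields on its own.
           @yield key => buffer
       end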
partitionby (generic function with 1 method)

julia> partitionby(x->isnothing(tryparse(Int, x)), charstrings) |>
           Map(((k, v),) -> (k, prod(v))) |>
           Filter(==(0) ∘ first) |>
           Map(x->parse(Int, x[2])) |>
           Unique() |>
           collect
3-element Vector{Int64}:
 123
  34
   8
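
If you don't need laziness, the same grouping can also be written eagerly in plain Julia. This is only a rough sketch: `partitionby_eager` and the `sample` input are names I made up for illustration, not part of FGenerators or Transducers. Unlike the generator version, each group gets its own vector, so it is safe to `collect` and keep the groups around.

```julia
# Eager, dependency-free sketch: returns a Vector of `key => group` pairs.
# `partitionby_eager` is a hypothetical name, not part of any package.
function partitionby_eager(f, xs)
    groups = Pair{Any,Vector{eltype(xs)}}[]
    for x in xs
        k = f(x)
        if isempty(groups) || !isequal(groups[end].first, k)
            # Key changed (or first element): start a new group.
            push!(groups, k => eltype(xs)[x])
        else
            # Same key as the previous element: extend the current group.
            push!(groups[end].second, x)
        end
    end
    return groups
end

# Made-up sample input, standing in for the original `charstrings`.
sample = ["1", "2", "3", "a", "b", "3", "4", "c", "8"]
groups = partitionby_eager(x -> isnothing(tryparse(Int, x)), sample)

# Keep only the numeric groups (key == false), join the digits, and deduplicate.
numbers = unique(parse(Int, join(v)) for (k, v) in groups if !k)
```

The trade-off is that this materializes every group up front, so it fits small inputs better than long streams.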

Note: xs -> partitionby(f, xs) is not a transducer, so any pre-processing of xs cannot be done with transducers.
