Mapreduce for binary operations on SparseVector

SparseArrays has support for mapping a binary function over two SparseVectors.
For example here.

But, I don’t see similar support for mapreduce. The function linked above can be modified to reduce the output vector as it is generated rather than storing it.

Furthermore, for efficiency it might make sense to pass one or two additional, single-argument, functions to be applied when one of the arguments is zero. In fact, this could be implemented for map as well as mapreduce.

Did I just not find it?

Even if the code for doing essentially mapreduce is hidden in sparsevector.jl, it is not used in Base.mapreduce. Passing SparseVectors to the latter falls back to a generic implementation.

I opened an issue.

EDIT: It looks like implementing a general mapreduce for sparse vectors is difficult. But, writing tools for some special cases should not be too difficult.

2 Likes

Interesting, why would it be too difficult? It seems to be analogous to a merge in case the indices of the sparse vectors are ordered?

Suppose I want mapreduce(f, op, a, b) where a and b are SparseVector. If op == +, then I can count the number of times f(0, 0) occurs. But, if op(x, y) = x < 100 ? x : x + y, things are more complicated. I need to know the value of the accumulator when the f(0, 0) occur. You can write useful methods if you make some assumptions on f and op.