Sparse high dimensional array support einsum?

Hello there,

I’m wondering if there is any effort in high dimensional sparse tensors with einsum support?

For example, I’m working on something that can be abstracted into high dimensional tensors and i need to perform contraction with it a lot.

One tensor I’m working with has d=100 dimensions.
T \in \mathbb{R}^{10 \times 10 \times 10 \cdots \times 10}, and only a few million of the elements have non-zero value. If i were to work with the dense Tensor, i would need to operate on 10^{100} elements. Which is expensive and non-ideal.

What i have are values with indexes. E.g. the first non zeros element was represented using a big index vector and a value vector (j_1, \cdots, j_{d-1}), (v_1, \cdots, v_{10}) such that j_i \in \{1 \cdots 10\}

There are some packages that provide support for Einstein summation notation:

thanks, I was looking at those, but none of them have sparse support.

1 Like

I don’t think there is, and I thought that generalising what works for sparse matrices was pretty tricky. Did you find

But with 100 indices, I guess you will need some other notation than writing them all out… what operations are you actually performing on these things? Maybe they are better thought of as dataframes or something?

1 Like

yeah, there are in dataframe right now. somehow einsum has much compact expression of the things i’m trying to achieve.
the operation i’m doing is simply
i \cdots j \to ij, where i ranges from 1 to d-1 .

Do you mean that you sum over all indices except the first and last? i there looks like it runs from 1 to 10, not 100. (But something else could label which indices don’t get summed, the nth & mth.) You could write this, but there’s little to gain over writing the loops yourself:

julia> inds = rand(1:10, 333, 100); vals = randn(333);
julia> out = zeros(10,10);
julia> using Tullio

julia> @tullio out[inds[r,1], inds[r,100]] += vals[r]
1 Like

There is a SparseArray type in the master branch of TensorOperations.jl that is compatible with the @tensor macro for tensor contractions. It uses a dictionary to store the non-zero values (i.e. a Dictionary Of Keys sparse array). However, the implementation is:

  • untested
  • incomplete
  • undocumented
  • not yet exported

All of these will hopefully be remedied in the near future, as this is needed for some of my own work. However, I am contemplating about such things as: the most appropriate name for this type, the best place to put it (inside TensorOperations.jl or as a separate package), …

However, be warned: I am using a tuple of indices as the keys in the dictionary, which I manipulate using the functionality of TupleTools.jl. In your use case this amounts to working with tuples of length 100. I am not sure this will be efficient. And even if the actual runtime is efficient, it might lead to huge compilation times.