Multiply many matrices by many vectors

Currently, no. The new version (which will be v1.0) is radically different from the previous versions (v0.x) under the hood, though not in its API: TensorOperations.jl will mostly provide the API, i.e. the @tensor macro, together with a high-level implementation thereof for strided arrays.
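
For context, here is a minimal sketch of what the @tensor API expresses, namely a plain Einstein-convention contraction. The batched product from the question is not directly expressible this way, since a shared batch index would have to appear more than twice:

```julia
using TensorOperations

A = randn(3, 4, 5)
B = randn(4, 5, 6)

# Contract over the shared indices j and k; every index appears
# exactly twice, as the Einstein convention of @tensor requires.
@tensor C[i, l] := A[i, j, k] * B[j, k, l]
```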

Before, there were also very low-level implementations for adding two tensors in a permuted way, i.e. generalizing (and speeding up) Julia's permutedims! function. All of this has now been moved out of TensorOperations.jl and lives in Strided.jl, a separate package that can do much more general things.
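
As a rough illustration (a sketch using the @strided macro from Strided.jl; treat the exact form as indicative rather than definitive), a permuted addition that avoids materializing the permuted copy:

```julia
using Strided

A = randn(10, 10, 10)
B = randn(10, 10, 10)

# Adds A into B "in a permuted way" without allocating the permuted
# array: @strided wraps the arrays in lazy StridedViews and fuses
# the permutation into the broadcast kernel.
@strided B .+= permutedims(A, (3, 1, 2))
```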

So TensorOperations.jl just provides an implementation that uses BLAS mul! whenever possible, and the kernels from Strided.jl for everything else.
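
For instance, a contraction over a single shared index is just a matrix product, so it can be handed off to BLAS; the explicit mul! call below is only there to illustrate the equivalence, not how the dispatch happens internally:

```julia
using TensorOperations, LinearAlgebra

A = randn(4, 5)
B = randn(5, 6)

# A contraction over one shared index is an ordinary matrix product...
@tensor C[i, k] := A[i, j] * B[j, k]

# ...so it agrees with a direct BLAS-backed mul! call.
C ≈ mul!(zeros(4, 6), A, B)  # true
```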

My future goal is to make Strided.jl also have dedicated kernels for GPU arrays. This, however, requires that I first have access to a GPU (I have ordered some), that I learn to work with a GPU, and that I can then write a low-level kernel for the GPU. But once we’re there, TensorOperations.jl should also work with GPUs out of the box.

A more rapid approach might be to just implement the necessary methods (add!, trace! and contract!) for GPUArrays using the primitives that are already available, although, last time I checked, the permutedims! method had a very simplistic implementation for GPUArrays.
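
To make that idea concrete, here is a purely hypothetical sketch (the function name and signature are illustrative, not the actual TensorOperations internals) of such a fallback built only on primitives GPUArrays already provides:

```julia
# Hypothetical fallback (illustrative name and signature): a permuted
# addition C .= β*C .+ α*permute(A), written only in terms of broadcast
# and permutedims, both of which have GPU methods, even if the current
# GPU permutedims! is rather naive.
function fallback_add!(α, A::AbstractArray, β, C::AbstractArray, perm)
    C .= β .* C .+ α .* permutedims(A, perm)
    return C
end
```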
