[ANN] Tensorial.jl: Statically sized tensors and related operations for Julia

Tensorial.jl

I’d like to announce Tensorial.jl, a package I’ve developed and actively use in my research. Tensorial.jl provides a statically sized Tensor type, similar to SArray from StaticArrays.jl, but with several additional features designed to make tensor operations more convenient. These include:

  • Contraction, tensor product (⊗), and a flexible @einsum macro for the Einstein summation convention
  • A @Symmetry macro to define tensor symmetries, which eliminates unnecessary calculations
  • Automatic differentiation via gradient and hessian functions, leveraging ForwardDiff.jl

Compatibility with AbstractArray

Based on advice from @ExpandingMan, I’ve recently made the package compatible with AbstractArray. So, I believe people can easily switch from SArray to Tensor by simply replacing SArray, SMatrix, and SVector with Tensor, Mat, and Vec, respectively. @ExpandingMan also kindly contributed to checking this compatibility—thank you!
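
For example, since Tensor subtypes AbstractArray, generic array code should work unchanged. A quick sketch (using the same constructor forms as in the quick start below):

julia> using Tensorial

julia> v = Vec(1.0, 2.0, 3.0); # analogous to SVector(1.0, 2.0, 3.0)

julia> v isa AbstractVector{Float64} # ordinary AbstractArray dispatch applies
true

julia> sum(v) # generic array code works as usual
6.0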

Comparison with other tensor packages

While there are several packages available for tensor operations, Tensorial.jl is specifically designed for working with small-sized tensors. The most similar package in this regard is Tensors.jl, which keeps its internal code simple by restricting tensor size. In contrast, I sought to overcome this limitation, and now Tensorial.jl can handle tensors of arbitrary sizes, offering greater flexibility for a wider range of applications.

Quick start

julia> using Tensorial

julia> x = Vec{3}(rand(3)); # constructor similar to SArray

julia> A = @Mat rand(3,3); # @Vec, @Mat and @Tensor, analogous to @SVector, @SMatrix and @SArray

julia> A ⋅ x ≈ A * x # single contraction (⋅)
true

julia> A ⊡ A ≈ tr(A'A) # double contraction (⊡)
true

julia> x ⊗ x ≈ x * x' # tensor product (⊗)
true

julia> (@einsum x[i] * A[j,i] * x[j]) ≈ x ⋅ A' ⋅ x # Einstein summation (@einsum)
true

julia> S = rand(Tensor{Tuple{@Symmetry{3,3}}}); # specify symmetry S₍ᵢⱼ₎

julia> SS = rand(Tensor{Tuple{@Symmetry{3,3}, @Symmetry{3,3}}}); # SS₍ᵢⱼ₎₍ₖₗ₎

julia> inv(SS) ⊡ S ≈ @einsum inv(SS)[i,j,k,l] * S[k,l] # it just works
true

julia> δ = one(Mat{3,3}) # identity tensor
3×3 Tensor{Tuple{3, 3}, Float64, 2, 9}:
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

julia> gradient(identity, S) ≈ one(SS) # ∂Sᵢⱼ/∂Sₖₗ = (δᵢₖδⱼₗ + δᵢₗδⱼₖ) / 2
true
21 Likes
  1. Why does Tensorial.jl depend on StaticArrays.jl?
  2. Presumably it’d be better to depend on ChainRulesCore.jl, rather than on ForwardDiff.jl? If necessary to depend on ForwardDiff, I guess a package extension (weak dependency) would be a better choice.

Thank you for the feedback.

Tensorial.jl uses SArray when generating static indices that map tensor indices to actual data indices, taking symmetry into account. The code looks like this:

function tensor2data_map(::Type{TT}) where {TT <: Tensor}
    ...
end

This function returns a unique map depending on TT, so I use a @generated function to generate the map only once for performance purposes. This necessitates the use of immutable arrays.
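
For illustration, here is a minimal self-contained sketch of that pattern — not Tensorial’s actual code; SymMat3 and tensor2data_map_sketch are made-up names — showing an index map computed once per type and embedded in a @generated method as an immutable SVector:

using StaticArrays

# Hypothetical 3×3 symmetric type that stores only the 6 unique entries.
struct SymMat3{T}
    data::NTuple{6,T}
end

@generated function tensor2data_map_sketch(::Type{TT}) where {TT <: SymMat3}
    # Build the map from the 9 Cartesian positions to the 6 stored slots
    # with plain loops (generated bodies should avoid closures/comprehensions).
    inds = Matrix{Int}(undef, 3, 3)
    k = 0
    for j in 1:3, i in j:3   # lower triangle, column by column
        k += 1
        inds[i, j] = k
        inds[j, i] = k       # the symmetric entry maps to the same slot
    end
    return quote
        $(SVector{9,Int}(Tuple(inds)))  # constant baked into the compiled method
    end
end

tensor2data_map_sketch(SymMat3{Float64})  # 9-element SVector{9, Int}

The map is computed only when the method is first compiled for a concrete type, so run-time lookups just read a constant.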

Additionally, to achieve high performance for contraction, Tensorial.jl reduces the contraction to simple matrix multiplication. Since this matrix multiplication is already optimized in StaticArrays.jl, I simply call the method.
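
As a rough illustration of what “reducing to matrix multiplication” means (again only a sketch, not the package’s internals), a double contraction C_ij = A_ijkl B_kl can be flattened into one static matrix–vector product:

using StaticArrays

A = rand(SArray{Tuple{3,3,3,3},Float64})  # stand-in for a 4th-order tensor
B = rand(SMatrix{3,3,Float64})

Amat = SMatrix{9,9}(Tuple(A))             # group (i,j) as rows, (k,l) as columns (column-major order)
Bvec = SVector{9}(Tuple(B))
C    = SMatrix{3,3}(Tuple(Amat * Bvec))   # StaticArrays' optimized * does the actual work

# agrees with the naive nested sum
Cref = [sum(A[i,j,k,l] * B[k,l] for k in 1:3, l in 1:3) for i in 1:3, j in 1:3]
C ≈ Cref  # true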

I’m not familiar with ChainRulesCore.jl or weak dependencies, but I’ll take a look—thank you for your comments.

1 Like

Welcome, and nice to see new packages!

I agree these things are array-like. But this creates some surprises, in that the package also re-uses existing functions with quite different meanings. It seems that generic code can almost never work on both this package’s AbstractVec and Base’s AbstractVector.

One way out would be to use new symbols, instead? Or symbols that already have the meanings you want.

You give ⊗ the obvious meaning, and Base/LinearAlgebra do not use it at present. An attempt to add it (with this meaning) failed and became the tiny package TensorCore.jl… which it was hoped other packages would extend. That also defines ⊡ (=== boxdot) with the “single contraction” meaning you give to ⋅ (=== dot). (This idea is useful with Arrays too.) It has no symbol for “double contraction” at present.

julia> x ⊗ x ≈ x * x'  # from the readme
ERROR: ArgumentError: adjoint for `AbstractVec` is not allowed

julia> x ⊗ x  # ok!
3×3 Tensor{Tuple{3, 3}, Float64, 2, 9}:
 0.0896747  0.256252  0.0724118
 0.256252   0.732258  0.206922
 0.0724118  0.206922  0.0584721

julia> TensorCore.tensor(x, x)  # agrees
3×3 Matrix{Float64}:
 0.0896747  0.256252  0.0724118
 0.256252   0.732258  0.206922
 0.0724118  0.206922  0.0584721

julia> collect(x) .* collect(x)'  # agrees, as it should
3×3 Matrix{Float64}:
 0.0896747  0.256252  0.0724118
 0.256252   0.732258  0.206922
 0.0724118  0.206922  0.0584721

julia> x == collect(x) && x isa AbstractVector{<:Real}
true

julia> x .* collect(x)'  # this is very surprising, not what broadcasting normally does
1×3 Matrix{Vec{3, Float64}}:
 [0.0896747, 0.256252, 0.0724118]  …  [0.0724118, 0.206922, 0.0584721]

julia> x ⋅ x  # ok
0.8804047336475448

julia> A == collect(A) && A isa AbstractMatrix{<:Real}
true

julia> collect(A) ⋅ A  # ok
2.377269789220256

julia> A ⋅ A  # very surprising, not at all what LinearAlgebra.dot does
3×3 Tensor{Tuple{3, 3}, Float64, 2, 9}:
 0.737822  0.0983617  0.507827
 1.02141   0.464815   0.910331
 0.535773  0.0776601  0.593985

julia> @which ⋅
LinearAlgebra

julia> A * x
ERROR: use `⋅` (`\cdot`) for single contraction and `⊡` (`\boxdot`) for double contraction instead of `*`

julia> A * collect(x)  # ok
3-element Vector{Float64}:
 0.35912153555872145 
 0.8535782548470549
 0.3073468130683834

julia> x ⋅ A    # this is a little weird, but at least it would normally be an error
3-element Vec{3, Float64}:
 0.7239929297935345
 0.5800305650422259
 0.8235245749587006

julia> TensorCore.boxdot(x, A)  # same "single contraction" meaning
3-element Vector{Float64}:
 0.7239929297935345
 0.5800305650422259
 0.8235245749587007

julia> dot(A⊗A, x) |> summary
"3×3×3 Tensor{Tuple{3, 3, 3}, Float64, 3, 27}"

julia> dot(A⊗A, x) ≈ TensorCore.boxdot(A⊗A, x)  # same meaning
true

julia> @einsum (i,j) -> A[i,k] * A[k,j]  # I wish this were spelled C[i,j] := , but just a widespread package convention, not Base/LinearAlgebra
3×3 Tensor{Tuple{3, 3}, Float64, 2, 9}:
 0.737822  0.0983617  0.507827
 1.02141   0.464815   0.910331
 0.535773  0.0776601  0.593985
1 Like

Hi, thank you for the feedback.

Errors

I think you may have tried an older version of Tensorial.jl. Please check that you are using version v0.17 (this version is compatible with AbstractArray), and you should not encounter the errors you showed.

Broadcast

Yes, this is a different implementation compared to normal arrays. In Tensorial.jl, Tensor behaves more like a scalar than an Array when broadcasting with Arrays (see the documentation). This idea comes from Gridap.jl and is quite useful when broadcasting between a Tensor and an Array of Tensors. For example:

julia> positions = [rand(Vec{2}) for i in 1:10];

julia> velocities = [rand(Vec{2}) for i in 1:10];

julia> dt = 1.0;

julia> velocities .= Vec(1,0); # initialize using broadcast

julia> @. positions += velocities * dt # update positions using broadcast

I understand that this behavior is very different from Array, but since Tensor in Tensorial.jl can only store Real values, I believe it won’t be too confusing once you get used to it.

Contraction

Thank you for the information. I wasn’t aware of the history behind Julia and TensorCore.jl. I’ll take a look.

This is because Tensorial.jl uses ⋅ as single contraction, so (A \cdot A)_{ij} = A_{ik}A_{kj}.

This represents (x \cdot A)_j = x_i A_{ij}. Another example:

julia> B = rand(Tensor{Tuple{3,3,3}});

julia> B ⋅ x
3×3 Tensor{Tuple{3, 3}, Float64, 2, 9}:
 1.13386   1.14646   0.607125
 0.758395  1.17262   0.458403
 0.439887  0.586323  0.828848

because (B \cdot x)_{ij} = B_{ijk} x_k.

Oh indeed, sorry I got an old version. With Tensorial v0.17.6 the errors above are resolved.

This is because Tensorial.jl uses ⋅ as single contraction, so (A ⋅ A)ᵢⱼ = AᵢₖAₖⱼ.

Yes. I agree that this is a desirable operation, and that ⋅ sometimes has this meaning on paper, and in other languages (like Mathematica).

But my point is that LinearAlgebra.dot(::AbstractMatrix, ::AbstractMatrix) already has a meaning. Such meanings are the core idea of Julia’s generic functions & abstract types.

Opting in to these functions / types usually offers the benefit that code can implement the same mathematical logic on distinct computer representations. And it can be read by users familiar with the general rules, not knowing the specific types. But this is explicitly broken here, by using the same functions / types to mean completely different operations. That’s what I think is so surprising.

And one easy way to avoid this surprise is by picking a distinct symbol.

Re broadcasting & array rules, I similarly urge you to take existing conventions seriously. I realise that breaking them may be convenient for some operations you have in mind now. Your users are later going to have other operations in mind, and will be confused. velocities .= Ref(Vec(1,0)) is standard, works now, and allows e.g. velocities .= Ref([1,0]) where the Vector is auto-converted.
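
Concretely, a sketch of the conventional spelling of the example above (relying on the auto-conversion just mentioned):

julia> velocities = [rand(Vec{2}) for i in 1:10];

julia> velocities .= Ref(Vec(1, 0)); # Ref marks the Vec as a broadcast "scalar"

julia> velocities .= Ref([1, 0]); # also works: the Vector is converted to Vec{2} on assignment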

2 Likes

I understand your point and agree with your opinion. Initially, I explored alternative symbols, but couldn’t find a suitable option. Perhaps I should follow the conventions used in TensorCore.jl.

For broadcasting, thank you for your advice. I’ll consider it.

2 Likes

Since you are using small arrays you might want to try FastDifferentiation.jl for computing your derivatives. It generates very efficient code when expression size is not too large. I would expect it to execute faster than ForwardDiff.jl in this case.

3 Likes

For the same reason, I don’t think ChainRulesCore.jl would be very relevant here (or maybe just as an extension to make code differentiable that would otherwise error). In terms of performance, for small inputs, you’re definitely better off with forward (or symbolic) autodiff.

1 Like

I’ve given your comments more thought. While I agree that strictly following Julia’s definitions may be better in some respects, I don’t think that’s always the case. For example, as you suggested, I could use a different symbol like ⊡ for single contraction, but doing so can make the code look quite awkward. Additionally, some people might find it strange that (rand(3,3,3) ⋅ rand(3,3,3)) results in a Real for tensor operations. The truth is, there’s no solution that 100% of people will agree on. I believe it’s fine for a Julia package to resemble Mathematica in some ways, rather than strictly adhering to pure Julia conventions. Maybe I should add a warning in the documentation stating that Tensorial.jl breaks the original definition of the dot function in LinearAlgebra.jl.

Anyway, your comments are quite valuable, so thank you very much. Regarding broadcasting, I’m considering following your suggestions.

Thank you for the information, @brianguenter! I will definitely check out the package.

Thank you always for your kind advice, @gdalle :grin:

1 Like

If you really want to use the ⋅ symbol with new semantics, you can do that, but do it by introducing your own function, instead of misusing LinearAlgebra.dot. The latter is a hostile way to create confusion and ecosystem bugs.

1 Like

Yes, that is exactly what I plan to do. By the way, will the \bullet symbol be supported as an infix operator in Julia in the future? I initially tried to use it for single contraction, but it isn’t currently supported.

@mcabbott, @nsajko, I’d like to hear your advice. If I use ⊡ for single contraction, following TensorCore.jl, and ⊡₂ for double contraction, what do you think? Is it too fancy?

I agree with this. Functions defined in the base language / standard libraries have precise semantics, which users rely on even when they call these functions with types from other packages.

If you really think that users of Tensorial.jl expect LinearAlgebra.dot (which LinearAlgebra.⋅ is an alias for) to mean something else than a dot product, perhaps a middle ground would be to throw an informative error when users try to apply it to one of your tensors?
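
Sketched with a hypothetical MyTensor type (not Tensorial’s actual type names), that middle ground could look like:

using LinearAlgebra

# Hypothetical stand-in for a Tensorial tensor type.
struct MyTensor{T} <: AbstractMatrix{T}
    data::Matrix{T}
end
Base.size(t::MyTensor) = size(t.data)
Base.getindex(t::MyTensor, i::Int, j::Int) = t.data[i, j]

# Instead of silently redefining dot, fail loudly and point users at the intended operator.
function LinearAlgebra.dot(::MyTensor, ::MyTensor)
    error("`⋅`/`dot` is intentionally not defined for tensors here; use `⊡` for single contraction")
end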

1 Like

Okay, I understand that everyone in the community strictly adheres to the semantics in base and stdlib, and I have decided to follow this convention as well. The upcoming new version is now based on TensorCore.jl and does not break dot, as follows:

julia> using Tensorial

julia> x = rand(Vec{3});

julia> A = rand(Mat{3,3});

julia> @which boxdot
TensorCore

julia> @which tensor
TensorCore

julia> A ⊡ x ≈ A * x # single contraction (⊡)
true

julia> A ⊡₂ A ≈ A ⋅ A # double contraction (⊡₂)
true

julia> x ⊗ x ≈ x * x' # tensor product (⊗)
true

julia> boxdot(x, A) ≈ boxdot(Array(x), Array(A))
true

julia> tensor(x, A) ≈ tensor(Array(x), Array(A))
true
5 Likes

Have you considered defining

abstract type AbstractTensor{S <: Tuple, T, N} <: StaticArray{S, T, N} end

?

I often work with StaticArrays and tensors from Tensors.jl and it’s a bit annoying that I cannot define a function f(vec::StaticVector{N,T}) that would accept both SVector and Vec types. I can of course resort to f(vec::AbstractVector{T}) but then I have to in principle make a separate check for the correct length of the vector.
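
For instance, a small sketch of that dispatch gap (using Tensorial’s Vec here for illustration):

julia> using StaticArrays, Tensorial

julia> norm2(v::StaticVector{3}) = sum(abs2, v);

julia> norm2(SVector(1.0, 2.0, 3.0)) # works
14.0

julia> norm2(Vec(1.0, 2.0, 3.0)) # Vec is not a StaticVector, so this dispatch fails
ERROR: MethodError: no method matching norm2(::Vec{3, Float64})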

I guess this goes back to the lack of support for traits in Julia, forcing one to make difficult decisions when defining types.

1 Like

I’ll be happy to help you figure out how to use the FastDifferentiation.jl package. Let me know if you run into problems.

I’ve also worked on differentiation algorithms specifically for tensor contraction. Never published the work but if you are interested let’s talk more.

2 Likes

Tensorial is a very nice package, but it does highlight many of the problems with stack-allocated arrays in the Julia ecosystem. One of the very nice things about Tensorial is that it avoids MArray, a horrendous hack within StaticArrays that is the source of many performance pitfalls. However, it’s not all good news; MArray exists for a reason, and in not having it one sacrifices a great amount of flexibility. I am hopeful that developments in the core language with Memory will make it possible to eliminate MArray at some point in the not-so-distant future, but I’m not privy to the details of how that would work.

Another issue that this package highlights is how easy it is for other packages to inadvertently cause costly heap allocations. For example, ideally one could use DifferentiationInterface for this package without any need for dedicated AD code in Tensorial at all, but this will cause performance-crippling allocations, since DI has no way of knowing how to deal with Tensorial arrays other than to create an Array, particularly because, lacking MArray, there is no viable similar.

2 Likes

I think the key point here is that you’ve decided to subtype AbstractArray. That’s where the implicit commitment to not violate the existing meaning of LinearAlgebra.dot(::AbstractArray, ::AbstractArray) comes from. Someone else in some other package may have written a function that is supposed to work on any AbstractArray and uses LinearAlgebra.dot internally, assuming its established semantics.

If you decide to not subtype AbstractArray after all, you might be able to get away with implementing a more idiosyncratic method of LinearAlgebra.dot. But you may find that it’s better to create your own function Tensorial.dot anyway.

Edit: reading the LinearAlgebra.dot docstring:

dot also works on arbitrary iterable objects, including arrays of any dimension, as long as dot is defined on the elements.

dot is semantically equivalent to sum(dot(vx,vy) for (vx,vy) in zip(x, y)), with the added restriction that the arguments must have equal lengths.

I may have to retract the above. The semantics of LinearAlgebra.dot are explicitly specified for a much broader set of types than just AbstractArray.
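
For what it’s worth, that broader specification is easy to check on plain Arrays:

julia> using LinearAlgebra

julia> A = rand(3, 3, 3);

julia> dot(A, A) ≈ sum(A .* A) # full contraction down to a scalar, for arrays of any dimension
true

julia> dot(A, A) ≈ sum(dot(x, y) for (x, y) in zip(A, A)) # matches the docstring formulation
true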