[ANN] TensorAlgebra.jl: Taking covariance seriously

I suppose in the specific case of TensorAlgebra.jl I could just implement forms with const OneForm = Covector. Since there is a kind of tensor duck typing happening everything else follows.

To formalize it a bit more to differentiate vectors/forms from basis vectors/forms and have “type checking” like (n m) tensors you could build something on top of TensorAlgebra.jl

You have a point. That is quite important for the index raising and lowering that plays such a big role in general relativity. In my defence, I do quantum mechanics—raising and lowering are nonlinear operations in Hilbert space, so we use them less than they do in Euclid and Minkowski space.

Since everyone uses Einstein notation for this, I’m thinking that names might be more useful than numbers. E.g.

julia> using Tensars

julia> S = NamedTensar(rand(3,4,5,6), (:i, :j), (:k, :l))
i[3]×j[4] ← k[5]×l[6] Tensar{Float64}

julia> S*NamedTensar(rand(6,5), (), (:l, :k))
i[3]×j[4] ← scalar Tensar{Float64}

julia> size(S)
(Multiset([3, 4]), Multiset([5, 6]))

or maybe

julia> size(S)
((i=3, j=4)), (k=5, l=6))

or the right way to do it in principle

julia> size(S)
(Dict(:j=>4, :i=>3), Dict(:k=>5, :l=>6))

Numbers are a special case of names, so this would allow

julia> using TensorAlgebra

julia> g = NamedTensar(rand(5,5), (2,), (1,))

julia> g_down_up = TensorAlgebra.Tensor(g)

The question is what vector space to use as the type parameter. Can TensorAlgebra represent the vector space Array{T,N}?

It also provides a neat way to do partial contractions

julia> S*NamedTensar(rand(5), (), (:k))
i[3]×j[4] ← l[6] Tensar{Float64}

Discourse says that hundreds of people are reading this discussion. That makes me feel motivated to spend the weekend getting Tensars to a point where I’m willing to register it.

While it doesn’t distinguish up & down, I have a package which allows contractions by name:

julia> using NamedPlus, TensorOperations

julia> S = named(rand(3,4,5,6), :i, :j, :k, :l)
3×4×5×6 NamedDimsArray(::Array{Float64,4}, (:i, :j, :k, :l)):
[:, :, 1, 1]:
      → :j
↓ :i  0.62051   0.398822  0.825464  0.583103
      0.942424  0.350063  0.214688  0.0988411
      0.387568  0.593296  0.608857  0.0822716
...

julia> contract(S, named(rand(6,4), :l, :j)) # contracts all shared names
3×5 NamedDimsArray(::Array{Float64,2}, (:i, :k)):
      → :k
↓ :i  3.70986  3.79836  5.23238  5.52598  4.97483
      4.18305  4.06745  4.87871  3.94772  4.47412
      4.91262  4.89536  5.23945  4.35251  4.57636

It doesn’t overload * for this, as the rule is that all NamedDimsArrays behave exactly like arrays under operations not mentioning names (except propagating names, and giving an error if they clash).

2 Likes

That’s similar to how contractions work in Grassmann, except that instead of using named Symbols for indices, in Grassmann the highest performance possible is gained by only using binary bit encoding and bit shift fiddling for the tensor indices. The single 64bit encoding used in Grassmann for up to 64 indices of tensors allows much greater performance than using a list of Symbol for indices. So if anyone is interested in doing this in a performance oriented way, building on my foundations I am seeking higher performance. So, I would use a DirectSum.SubManifold to handle the conctraction of the indices of multi-dimensional arrays, and then use generated code to allocate the array. The DirectSum and Grassmann index contraction algebra was specifically designed for handling index contractions performantly ahead of compile time. Currently it’s only being applied to graded algebras and dyadic tensors, but I will consider adding support for what’s being discussed here as well.

1 Like

Note that the design of NamedDims puts these symbols in the type, and does almost all of the work with them at compile-time. Thus @code_warntype contract(S, S2) already knows the indices of the output.

3 Likes

That’s good to know! Note that in my case, I also need the performance of the compilation time to be good, which is why the pre-compilation computation was done with bit encodings instead of Symbol. Since I may be switching between many specialized algebras in a single session, compilation times need to be as quick as possible. Later on, an optional feature was added to customize the bit index with a Symbol but the feature has never been used or needed yet, so it is hidden away for now.

1 Like

@thisrod, how do you mean that raising and lowering are nonlinear operations in Hilbert space? Could you elaborate?

In TensorKit.jl, vector spaces could be regular or dual spaces, irrespective of whether they appear in the domain or the codomain of the linear map associated with a particular interpretation of the tensor. Of course, a normal vector space in the codomain becomes a dual space when its moved (permuted) to the domain, but you could also start with a dual space in the codomain.

@chakravala, in Grassmann.jl and related packages, I understand that all tensors are living in some tensor bundle associated with a manifold. Does this mean that the different vector spaces involved can only be of two types, the tangent space and cotangent space of the manifold, and in particular, they all have the same dimension? While I very much like differential geometry, that is certainly too restricted for many use cases where tensors are used in quantum physics.

No, that is incorrect. In Grassmann algebra, we specifically work over the topology on the entire combinatorial range of submanifolds from a generating manifold. At the root is a generating tensor bundle with 2^n submanifolds.

@juthohaegeman is talking about whether he can represent objects living in bundles other than the tensor (co)tangent bundle. For instance, in QCD one might desire an object that is a rank 3 tensor F^a_{\alpha \beta} where two of the indices (\alpha,\beta) are from the tangent bundle (spacetime indices taking on values 0:3) and one (a taking values 1:8) is an SU(3) colour index https://en.wikipedia.org/wiki/Gluon_field_strength_tensor.

In this case, you don’t want any product, rotation or combination of space-time tensors to be able to construct a colour-valued object. They live in completely separate spaces.

This can sort of be done in Grassmann.jl (maybe not for the QCD example since there’s heterogenous dimensionalities involved?), but I think some would find it’s construction a little unnatural. The end result I think would be more like a Kaluza Klein theory than the sort of bundle structures we desire.

3 Likes

The article you linked includes a Grassmann algebra formulation, @Mason.

The gluon color field can be described using the language of differential forms, specifically as an adjoint bundle-valued curvature 2-form (note that fibers of the adjoint bundle are the su (3) Lie algebra); where the gluon field is a vector potential 1-form corresponding to G and ∧ is the (antisymmetric) wedge product of this algebra, producing the structure constants f abc . The Cartan-derivative of the field form (i.e. essentially the divergence of the field) would be zero in the absence of the “gluon terms”, i.e. those {\displaystyle {\boldsymbol {\mathcal {A}}}}{\boldsymbol {\mathcal {A}}}|0x0 which represent the non-abelian character of the SU(3).

As it says on there, you can use Grassmann algebra for working on that. Higher rank tensors are of course possible with it also… so I don’t get your point.

Wow. Thank you for the comment @juthohaegeman :raised_hands:

Sorry for my slow response. You inadvertently (and welcomely) sent me off trying to learn about units and counits and I ended up looking at the tensor Hom adjunction :scream:

[Note: Plus, I got a little distracted :sweat_smile: ]

Looking into tensor Hom adjunction brought me to Understanding the tensor-hom adjunction intuitively. One of the answers had this comment:

arctic tern’s answer is very nice and complete. But less formally, the adjunction really says something quite simple: By the universal property of the tensor product, a linear map out of U⊗V is the same thing as a bilinear map from U×V. (I.e. the tensor product is built precisely to encode bilinearity.) And a bilinear map from U×V is the same thing as a linear map from U to the linear maps out of V, by currying. (I.e., process the bilinear map from U×V one slice \{u\}×V≅V at a time, for all u∈U, instead of all at once.) Naturality just means that this identification behaves well under linear maps.

That is all good and intuitive :+1:

But I am a little confused about your reason for defining a tensor as a map:

\tau: K\to U\otimes V\otimes W.

In TensorAlgebra.jl, I define tensors as maps

\tau: U\otimes V\otimes W\to K.

Then, from the comment on tensor Hom adjunction above, this is saying a tensor is a multilinear map

\tau: U\times V\times W\to K.

That is how I understand tensors.

It feels like your arrow is going backwards :thinking:

You said:

They are not the “same” but they are naturally isomorphic (according to the tensor Hom adjunction unless I misunderstood), which is about as close to “same” as you can get and for practical purposes, especially in a Julia package, I think it is safe (and perfectly clear) to let these naturally isomorphic maps be the “same” rather than have different types for each one. This is the main issue I have with TensorKit.jl :thinking:

TensorAlgebra.jl was largely inspired by TensorKit.jl. It looks great (:raised_hands:) , but this one issue helped inspire me to write a separate package.

I think I see your point, but I don’t think it means we need to resort to keeping all those naturally isomorphic maps distinct. For example, you can have Lie algebra-valued forms, but you’re no longer talking about vector spaces, but modules and groupoids.

Nevertheless, I’d still define tensors as maps

\tau: U\otimes V\otimes W\to G,

where now U, V and W are G-valued and G could be a Lie algebra or some group with appropriate action.

Thank you for giving me a lot to think about :raised_hands: I’ll try to think about the examples you gave above and see how TensorAlgebra.jl might be able to accommodate them.

Cheers

1 Like

Hi @thisrod :wave:

You asked:

Resorting to the textbook notation, we have

v = v^\mu e_\mu

and

g = g_{\mu,\nu} dx^\mu\otimes d^\nu

so that

v^* = g(-,v) = g_{\mu,\nu} v^\lambda dx^\mu dx^\nu(e_\lambda) = g_{\mu,\nu} v^\nu dx^\mu = v_\mu dx^\mu

and

g^{-1} = g^{\mu,\nu} e_\mu \otimes e_\nu

so that

g^{-1}(-,v^*) = g^{\mu,\nu} v_\lambda e_\mu e_\nu(dx^\lambda) = g^{\mu,nu} v_\nu e_\mu = v^\mu e_\mu = v.

So I would say that, yes, absolutely g and g^{-1} are different tensors because they have different domains and the same codomain.

If you try

g(-,vstar)

in TensorAlgebra.jl, it will throw a domain error as it should.

No. And this is important. In TensorAlgebra.jl, tensors are never defined as maps i×j×k ← p×q. They are always defined as maps V\to K where K is the underlying field and V is a tensor space, so all those combinations are manifestions of the one single tensor, but we get all of the others for free via partial evaluation.

This feels like an unnecessary limitation and doesn’t allow you to model things like U\otimes V^*\otimes W, which in textbook notation an element would look like

u\otimes v^*\otimes w = u^\kappa v_\lambda w^\mu e_\kappa\otimes dx^\lambda\otimes e_\mu.

I see your later comment, where I think you acknowledge this :+1:

I’m not sure I understand. g already has both indices lowered with components given by g_{\mu,\nu} and g^{-1} has raised indices with components g^{\mu,\nu}.

The way to lower or raise any index is using g or g^{-1} respectively, but raising an index of g using g^{-1} doesn’t give a very interesting map V^*\times V\to K. It is the identity matrix :slight_smile:

Yes. Absolutely. This is already provided as an example :slight_smile:

This map is done via partial evaluation:

julia> g(-,v) in V^*
true

That :point_up_2: is actual working code.

I also consider myself a physicist (although I’ve worked in finance for more than 15 years now :slight_smile:)

BUT… I actually did mean dimensions in the other sense of the word but you do bring up a good point. I hadn’t thought about physical units, but that is another reason to distinguish position and velocity as elements of different vector spaces because they have different physical units even though they have the same vector space dimension :slight_smile:

Thanks for your great response @EricForgy

As you say in TensorAlgebras.jl, a vector is V^\ast \to K, or I would prefer to write this as K \to V. In categorical terms, the objects are vector spaces, but there is no such thing as elements of objects, objects are abstract things, not necessarily sets with elements. There are however morphisms between vector spaces, and so a vector v \in V is rather thought of as a morphism from the ground field K (one-dimensional vector space) to the space V, namely, this morphism would be the map that sends a scalar \lambda to \lambda v. As soon as you have several vector spaces involved, they could appear at different places in the domain or codomain, but I have chosen the convention to call a tensor that thing that you get from taking tensor products of vectors, i.e. the tensor product of morphisms K \to V, which is a morphism K \to V_1 \otimes V_2 \otimes \cdots \otimes V_N.

That’s indeed opposite to the picture where you think of a tensor as a multilinear map which receives some vector inputs and produces a scalar. That’s not how I need to think of tensors in my daily life, though it’s an equally valid perspective.

They are only naturally isomorphic for your typical vector spaces. Mostly, the problem is with reordering spaces rather than with applying the unit or counit, but you need this, for example, to go from K \to U \otimes V \otimes W to V^\ast \to U \otimes W: you cannot just use the (co)unit for this, as this only works on the first or last tensor factor.

Now, to permute two of the spaces, i.e. tensor indices, in the case of anyonic systems (or thus, general fusion categories), there might not be a swap operation that is symmetric (i.e. that becomes the identity if you do it twice), but rather you have to use some general braid. This actually changes the tensor. And doing it a second time does not bring you back to the same tensor.

But more down to earth, in my example of SU2 symmetry, I am not talking about being Lie-algebra valued or anything, just normal tensors which have some kind of constraint. Say I have a tensor in \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2, where on every \mathbb{C}^2 there is the fundamental representation of \mathsf{SU}(2) acting. I now want to describe a tensor in this space that is invariant under this joint action of the \mathsf{SU}(2) symmetry. This comes up typically in quantum mechanics, i.e. its a singlet of four spin-1/2 states.

The way such invariant tensors are described in TensorKit is using fusion trees, that is, we write the coefficients of this tensor with respect to a basis consisting of all possibilities of the following:

fuse the first two spin-1/2 representations, into a j1=0 or j1=1 representation
fuse the result of this with the third spin-1/2, which can be j2 = 1/2 if j1 = 0, or j2 = 1/2 or 3/2 if j1 = 1
and so forth …

If you now want to reorder the tensor factors, i.e. the spins, this changes the basis and has non-trivial effect on the corresponding coefficients. So even though in that case the actual tensors are maybe trivially related (naturally isomorphic), it is useful to be explicit about this, as it is not a zero-cost operation.

Hi @juthohaegeman,

Likewise, thank you for having patience with me and for your helpful comments :raised_hands:

Cool. I suspected you were doing something like this, but I couldn’t quite get my head around it :slight_smile:

From my days as court jester on the n-Category Cafe and nLab, I know how this works with Set with objects being sets and morphisms being functions. We know a set has elements so we handle this by identifying elements s\in S as maps from a one-element set, i.e.

s: \{\bullet \}\to S

but I struggled a bit to translate this concept to Vect. What is a one-element vector space? The only thing I could think of was 0\in K, but then there is only one morphism

0: 0\to V

since it needs to be linear, i.e. 0 maps to 0 in vector space so that can’t be right :thinking:

Sometimes I’m a “little” slow :sweat_smile:

I see what you mean now, but it doesn’t feel 100% satisfying :thinking:

Sure. K is a vector space, but a morphism

m: K\to V

doesn’t give you a vector v \in V. It gives you a line passing through 0\in V and v\in V with

1 \mapsto v.

On the other hand,

v: V^*\to K

is truly a single vector represented as a morphism.

Another issue with K\to U\otimes V\otimes W is that you cannot support partial evaluation, which then forces you into a situation where you do need a different TensorMap for each incantation of a tensor as you’ve implemented.

So given the choice of defining a tensor as map to a “line”

m: K\to V

in V versus defining it as map to a single element

v: V^*\to K,

in K, the latter feels more natural to me especially when you consider the benefits of partial evaluation.

BUT… nevertheless, regardless of how you define your type, either way can support the other operations that you mention.

This sounds cool :blush:

I’ll read through https://jutho.github.io/TensorKit.jl/latest/man/sectors/, but is there any other publication / article that describes the way you use TensorKit.jl? That might motivate me to try to implement similar functionality in TensorAlgebra.jl so we can compare notes further :slight_smile:

Braiding is something I want to look into as well so we can implement things like skew-symmetric differential forms, wedge product, etc.

Thank you again for sharing your thoughts and for your patience :blush:

Thinking out loud about the “one-element” thing again, would it make sense to deifne a tensor as a morphism

\tau: V\to 0

? Linearity works out because

\tau(a v_1 + b v_2) = a \tau(v_1) + b \tau(v_2) = 0.

A feel a commuting diagram somewhere here :smile:

I don’t quite see the difference. I thought we agreed that, with the duality structure (a.k.a. exact pairing, a.k.a. unit and counit), the vector space of morphisms V^\ast \to K is isomorphic to the vector space of morphisms K \to V. Either you only think about morphisms and never about “elements” of the corresponding spaces/objects, which is indeed a bit abstract, or if you want to consider elements of the corresponding spaces, then K has a distinguished element, namely the element 1, and the vector v \in V associated with the morphism K \to V is the image of 1.

I don’t think there is such a thing as a one-element vector space, i.e. singleton {0} is not typically not considered a vector space. There is a zero-dimensional vector space but that is empty, though I guess singleton {0} could be identified with that. However, that interpretation surely does not fit with how I think of vectors/tensors and how they are in TensorKit.jl. A the set/space of morphisms from V or V^\ast to the empty space is itself empty. If you would just write it as vectors and matrices using Julia’s Array:
an element from the empty vector space: v = Vector{Float64}(undef, 0) => length(v) == 0
a corresponding map from some 10-dimensional vector space to the empty vector space m = Matrix{Float64}(undef, 0, 10) => length(m) == 0, i.e. that space has dimension zero and thus no independent components. In contrast a vector v = Vector{Float64}(undef, 10) can trivially be identified with a m = Matrix{Float64}(undef, (10,1)), that is, a map from the one-dimensional vector space that can be identified with K to the 10-dimensional vector space V which contains v. So from that perspective, I think the identification of vectors with maps K\to V is very natural, it’s the one that e.g. MATLAB does automatically.

Hi Jutho :wave:

I don’t disagree. I am just still in catchup mode :sweat_smile:

Reading about tensor Hom adjunction gave me comfort in saying the same tensor can be represented as maps

  • U\otimes V\to K
  • U\times V\to K
  • U\to V^*
  • V\to U^*

I haven’t made the mental leap yet to see that

K\to U\otimes V

is also the “same”.

So thinking about this some more (for my own benefit),

\gamma_v: K\to V

with

k\mapsto kv

for v\in V represents a parameterized line through 0\in V and v\in V with

v = \gamma_v(1).

Another way to look at it is this parameterized line has a tangent vector

v = \left.\frac{d}{dk} \gamma_v\right|_{k=0}.

That inspires yet another way to look at it, i.e. we can consider k\in K as a vector and then

v = (\gamma_v)_* k,

i.e. v\in V is the push-forward of k\in K via \gamma_v.

Ok! So maybe that was obvious, but working through it helped me :sweat_smile:

I can now see how

v: K\to V

is effectively the same as

v: V^* \to K.

I don’t see the relation to unit and counit. I’d love to learn, but that is not so important I guess.

The question in my mind is, “Which is more natural and which leads to a more natural Julia implementation?”

My gut tells me that V^*\to K would be more natural, but it would be good if we could make some comparisons. TensorKit.jl provides one side of the comparison. I’d be interested in implementing the other side so is there a particular article / paper somewhere that illustrates how you use TensorKit.jl in practice? Maybe I’ll try reproducing some results some time :slight_smile:

I agree it is a little abstract to consider tensors as special morphisms, but we are both guilty of that :sweat_smile:

Thinking about “elements” is fun (to me) in the context of categories. In a category, you have objects and morphisms only. Objects often have elements, but that is not built into the definition of a category so elements are kind of derived concepts.

As I mentioned, a nice way to deal with this in Set is to identify an element s\in S with a morphism from a one-element set \{\bullet\} to S, i.e.

s: \{\bullet\}\to S

I was on the wrong path when I tried to look for a one-element vector space, which I thought was the analogue. Instead, we should let the tau of category theory guide us :yin_yang:

If we consider a functor

F: Set\to Vect_K,

where F(S) is the vector space spanned by the elements of S, then

F(\{\bullet\}) = K

and the correct analogy to

s:\{\bullet\}\to S

for elements of S is

v: K\to V

which are elements of V as you’ve been saying all along :smiley:

As I mentioned, I am a “little” slow sometimes :sweat_smile: but these kinds of pendantic exercises help me. So now that I am 100% fine with v: K\to V, I feel like questioning how fine I am with v: V^*\to K :thinking:

Both are fine and both will lead to different Julia implementations.

[Edit: I’ll add some references I come across here (mostly for myself :nerd_face:):

]

They’re conjugate linear: there is a g^{ij} such that aⁱ = g^{ij} a_j*, but not such that aⁱ = g^{ij} a_j. So you can’t use the index raising and lowering notation that they use in relativity; you would also have to keep track of which factors were conjugated, at least.

The Hamiltonian operator can be (and often is) treated as a mapping from ket×bra → scalar, or from bra → bra or ket → ket, but the corresponding mapping from ket×ket → scalar is almost never mentioned. My impression is that relativity calculations use that identity all the time, but I haven’t done enough relativity to know for sure.

The metric in the complex plane is the complex conjugate. That is, to use Mathematician’s notation,

|z|^2 \equiv g(z, z) = z^*z

Hence, complex manifolds need to track a complex conjugation as well as more familiar manifestations of the metric when they’re formulated. When you track this, the metric does act linearly in Hilbert space, but it is awkward to express with index notation as g_{ij} a^{j} rather than g(\_, a)


Edit: replied to wrong post

Ok, I understand. I don’t typically associate nonlinear with conjugate linear (often called antilinear), hence my confusion

I think a proper description in the most general (i.e. complex) case starts from four type of indices. upper, barred upper, lower, and barred lower. A complex metric is indeed of the form g_{\bar{\imath},j}, and can be used to lower a vector with a barred upper index w^{\bar{\imath}} to a dual vector/covector \tilde{w}_j = w^{\bar{\imath}} g_{\bar{\imath},j}. That operation itself is purely linear. However, to get a vector with a barred upper index in the first place, you indeed need to e.g. conjugate a regular vector with an unbarred upper index: \mathrm{conj}: v \to w such that w^{\bar{\imath}} = \mathrm{conj}(v^{\imath}).

In a Euclidean complex space, where the metric is trivial, i.e. g_{\bar{\imath},j} = \delta_{\bar{\imath},j}, you can equate barred upper indices with lower indices, i.e. you can equate the dual vector space with the conjugate vector space. But indeed, this still implies that you conjugate a vector to get a dual vector, so it is an antilinear operation.

TensorKit.jl includes a GeneralSpace vector space that supports the four different flavors of indices, though most of the rest of the package is built around the case of Euclidean spaces, where just two flavors of indices (dual or conjugate, they are the same) is sufficient.

Note by the way, that in differential geometric terms, a Riemannian metric is defined as a symmetric bilinear for real vector spaces. Starting from a complex vector space (e.g. the tangent space of a complex manifold) with a hermitian metric, the real part of it forms a Riemannian metric, whereas the imaginary part defines a symplectic form. The compatibility between both takes you in the domain of Kahler manifolds etc.

3 Likes