Julia seems to have many different packages for tensor operations, like TensorFlow, TensorOperations, Tensors, TensorToolbox, and TensorWorkbench. Is there a general consensus as to which one(s) are the most useful? Could anyone provide a summary of each package’s features, or ideally a side-by-side comparison?
Tensors.jl is for computations with real tensors (Tensor - Wikipedia), not for the ordinary high-dimensional arrays that the ML community has started referring to as tensors (why??).
To be clear, TensorFlow has basically nothing to do with tensors.
It is an ML framework (with surprising amounts of linear algebra and general programming; see Mike’s Tenth Law: “Any sufficiently comprehensive machine learning framework includes an incomplete and informally specified implementation of Julia”. In fact, the part of numerical linear algebra it supports least is tensor operations like einsum.)
Further, the Tensor type in TensorFlow.jl doesn’t actually represent a numerical value, or even a collection, at all.
It represents a node in a computation graph that can later be executed.
Executing said node might return a 3D array, a 2D array, a 1D array, an integer, or even a string.
Probably because “High Dimensional ArrayFlow” didn’t quite have the right snappiness to it. It’s not the only term with a diluted meaning in machine learning. My favorite is deconvolution, which is supposed to be the inverse operation of a convolution, but in ML is a normal convolution applied to a zero-embedding of the input.
(To be completely fair, the dilution of tensors has a previous history within Computer Vision.)
TensorOperations is, as the name suggests, for performing tensor operations (mainly tensor contraction) with Julia’s built-in multidimensional arrays. It is not completely generic, but assumes a strided memory layout of the data. The goal is to have it work with any AbstractArray with a strided memory layout, but since such an interface is not completely well established and is up for changes, it currently only works with the ones in Base. Also, TensorOperations is aimed at large tensors (containing many entries).
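For a flavor of the syntax, here is a minimal sketch (assuming TensorOperations is installed; indices that appear exactly twice are summed over, and `:=` allocates a new output array):

```julia
using TensorOperations

# Matrix multiplication written as an index contraction:
# C[i,j] = Σ_k A[i,k] * B[k,j]
A = randn(10, 10)
B = randn(10, 10)
@tensor C[i, j] := A[i, k] * B[k, j]

# A genuinely higher-order contraction: sum over the shared indices b and c
D = randn(4, 5, 6)
F = randn(6, 5, 7)
@tensor G[a, d] := D[a, b, c] * F[c, b, d]   # G is a 4×7 array
```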
> Tensors.jl is for computations with real tensors
Is this real as in “real valued”, not complex? I find this name rather generic for the rather strict definition of tensors allowed by Tensors.jl. In general, a tensor is an element of the tensor product of vector spaces, possibly also involving dual spaces etc. (which are the same in a real Euclidean, i.e. Cartesian, vector space). As far as I can see (and forgive any mistakes), Tensors.jl can only represent elements of the tensor product of at most 4 real vector spaces, where furthermore the different vector spaces in the tensor product have to be the same and the dimension of the vector space is restricted to 1, 2 or 3. I certainly agree that this is a very common and useful type of tensor (i.e. objects that transform covariantly under the rotation group) in various branches of mechanics and classical physics (though there is no support for tensors from relativity, i.e. objects that transform under the Lorentz group?), and that it is good to have an optimized, stack-allocated implementation thereof. But I think a slightly more specific name like CartesianTensors would have been more appropriate (Cartesian tensor - Wikipedia).
My TensorToolbox.jl takes “a tensor is an element from the tensor product of vector spaces” as its starting point, but is currently abandoned. However, I have a fully functional and even more general/powerful version up and running in a private repository, and hope to release it in the next few months (after making it fully, and probably only, compatible with Julia 0.7). I plan to release it under a different name though, hopefully TensorKit.jl if that does not get taken by then, since there is also a different (and I believe registered) package called `TensorToolbox.jl` (GitHub - lanaperisa/TensorToolbox.jl: Julia package for tensors as multidimensional arrays, with functionality within Tucker format, Kruskal (CP) format, Hierarchical Tucker format and Tensor Train format) that mimics a corresponding Matlab package with the same name.
I think the ML community has just adopted the mathematicians’ definition of tensor, which is just “high-dimensional array”, though usually with some notion of underlying vector space for each dimension. In contrast, a physicist’s tensor is a multi-index object that transforms in a certain way under certain operations (e.g. rotations in 3D space), with a possible distinction of contravariant and covariant indices (especially in general relativity AFAIK). In practice, the mathematicians’ tensors are quite large, while the physicists’ tensors are quite small (please correct me if I’m wrong).
I’m also working on a rather specialized tensor package (it is too WIP to link to) and it’s also for “just” high-dimensional arrays. I think the majority of Julia tensor packages use the term in this sense?
There is also a JuliaTensors Github organization but it looks pretty bare and might be abandoned. If there is general interest in tensors, maybe we could revive or expand on this? Although again, one would first need to clarify what kind of tensors…
I think one area of physics that does seem to refer to tensors as just high-dimensional operators is the area of tensor network states.
That is not the math definition of a tensor…
@kristoffer.carlsson Could you clarify exactly how Tensors.jl works with “real tensors”? First of all, I assume you mean “actual tensors” as opposed to tensors over the field of real numbers? Second, what exactly can your package do with them? The formal definition of a tensor is rather abstract, and it’s not obvious how any numerical implementation would avoid treating them as high-dimensional arrays.
FWIW my comment about the ML usage of tensors was mostly in jest
The docs show most of the functionality.
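For readers who don’t want to click through, here is a rough sketch of the style of usage as I understand it from the docs (check there for the authoritative API; the names below are my reading of it):

```julia
using Tensors, LinearAlgebra

# Stack-allocated tensors parametrized by order and dimension
A = rand(Tensor{2, 3})           # a random second-order tensor in 3D
S = rand(SymmetricTensor{2, 3})  # a symmetric second-order tensor

tr(A)     # trace
A ⋅ A     # single contraction (matrix-like product)
A ⊡ A     # double contraction

# Automatic differentiation of scalar functions of tensors,
# e.g. ∂‖S‖/∂S, itself a second-order tensor
gradient(norm, S)
```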
A tensor is an object that transforms under a direct product of group operations. It’s just a nice way of reducing complicated group representations to really simple ones. For example, the electric field strength tensor F_{\mu\nu} lives in a 6-dimensional representation of the Lorentz group, but I know that since it has two indices I can transform it thus:
F_{\alpha\beta} \to {\Lambda_{\alpha}}^{\mu}\,{\Lambda_{\beta}}^{\nu}\,F_{\mu\nu}
This is especially useful with complicated objects such as the Riemann tensor R_{\mu\nu\alpha\beta} or the Weyl tensor C_{\mu\nu\alpha\beta}, which are in actuality in some complicated representation of the diffeomorphism group, but which you can simply think of as some vectors mashed together with some simple symmetry properties.
Tensor networks are a little bit of a special case. They have some special properties because they represent quantum states, and these are required to have certain transformation properties. You can do all sorts of nice things with them, some of which are reminiscent of general arrays, but technically they are still tensors (or rather their components are).
It’s kind of unfortunate that the ML community has so thoroughly hijacked this nice term, especially when they could have just said “array”.
Okay, so it seems like the summary is:

- The C++ iTensor library may get ported to Julia soon.
- The JuliaTensors GitHub organization appears to be abandoned.
- TensorFlow doesn’t have anything to do with tensors in the standard CS sense of the word, but instead is an ML framework.
- TensorKit (if that name is still available) will hopefully replace Jutho’s abandoned package TensorToolbox sometime in the next few months. It will provide a new Tensor type that is completely independent of Julia’s AbstractArray hierarchy.
- TensorOperations does tensor operations (mainly contractions) on large Base arrays of arbitrary dimensions.
- Tensors can only handle tensors of rank 1, 2, or 4, in which every index ranges over the same dimension of 1, 2, or 3. But it can efficiently handle symmetric and antisymmetric tensors and automatic differentiation.
- Jutho’s TensorToolbox is currently abandoned.
- lanaperisa’s TensorToolbox reproduces the functionality of Matlab’s Tensor Toolbox, but does not have a built-in implementation of tensor contraction.
- TensorWorkbench and TensorBase have not been updated in several years and appear to be abandoned.
- Xtensor implements Julia bindings for the C++ xtensor library.
TensorKit will still be very similar to what is described in the Readme of TensorToolbox.jl. In summary, it provides a new Tensor type that is completely independent of Julia’s AbstractArray hierarchy. The properties of this Tensor type I summarize below, but let me start by saying that they can still be operated upon with the Einstein index notation using the @tensor macro environment from TensorOperations.jl.
> A tensor is an object that transforms under a direct product of group operations.
@ExpandingMan, without trying to be pedantic, I almost agree with this definition, except that it is a tensor product/direct product of representations of a given group element. A direct product of group operations implies the direct product of the groups, i.e. as if you would act with one group operation on one tensor index and with another group operation (possibly from another group) on another tensor index. That is not what is going on. In the use case of Tensors.jl, the vector space is ℝ^2 or ℝ^3, which carries the fundamental representation of the rotation group O(2) or O(3), and a tensor is an object that transforms under the tensor product of these representations, i.e. a rank-2 tensor under a rotation O transforms as T[i,j] -> O[i,i']*O[j,j']*T[i',j'], often denoted as T -> O T O'. But clearly, the rotation matrices on the two indices are the same; they are not two independent rotation matrices.
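As a plain-array illustration of this transformation rule (nothing package-specific, just hypothetical names):

```julia
using LinearAlgebra

O = Matrix(qr(randn(3, 3)).Q)  # a random orthogonal matrix
T = randn(3, 3)                # components of a rank-2 Cartesian tensor

# Both indices transform with the same O: T[i,j] -> O[i,i'] O[j,j'] T[i',j']
T_rot = O * T * O'

# Invariants such as the trace are untouched by the transformation
@assert tr(T_rot) ≈ tr(T)
```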
However, this is too strict. Both rotation matrices should represent the same group element, i.e. the same physical rotation, but they do not necessarily need to be in the same representation (e.g. the fundamental or defining representation of O(3) in the example above). More generally, a tensor really is just an element of the tensor product of vector spaces, and that is also the starting point of TensorKit. So TensorKit starts by defining a type hierarchy for representing (finite-dimensional) vector spaces. Yes, the main property is their dimension (i.e. the size if you want to compare with arrays), and this can be different for the different tensor indices. But vector spaces do have additional structure / properties. A typical vector space has an associated dual space, and so a tensor can have covariant (lower) and contravariant (upper) indices. In fact, in general complex spaces, you might have four different types of indices (typically dotted and undotted lower and upper indices), corresponding to a vector space, its conjugate, its dual, or the dual of its conjugate. This corresponds to four different representations associated to the general linear group on a vector space. However, in the case of a real vector space, dotted (conjugate) and undotted are the same, and in the case of a Euclidean inner product (unitary instead of general linear group), dotted upper (conjugate) equals lower (dual) and vice versa. So in a real Euclidean (i.e. Cartesian) space (orthogonal group), there is only one type of index. That is the case where plain arrays are sufficient to represent the mathematical structure.
TensorKit aims to represent the most general case, but, because of my background (tensor networks and quantum physics), most of the additional functionality is specialized to (complex or real) Euclidean spaces. For example, factorizations like qr or singular value decomposition only make sense, even in the case of matrices, in the context of a Euclidean inner product. Surely, they can be generalized once a custom inner product is defined, and that’s something that maybe one day will be in TensorKit.jl, but currently this is not the case.
Another example is symmetries. Physical systems often have symmetries which act according to a certain (not necessarily irreducible) unitary representation on these Euclidean spaces, and one is often interested in tensors which are invariant under the total action of this symmetry on the tensor. For example, in the case of a rank-2 Cartesian tensor, for T -> O T O' to be invariant, T needs to be proportional to the identity, and it really has only one independent component. TensorKit.jl can represent tensors with an arbitrary number of indices, where on every index a certain symmetry group (abelian, non-abelian, direct product of different groups, …) acts according to a given representation (not necessarily an irreducible one), and can automatically determine the components of the tensor which are invariant, and only store and operate on those, provided some crucial details regarding the representation theory of the group are specified. For example, are you interested in representing a rank-4 (instead of rank-2 in the example above) tensor on ℝ^3 ⊗ ℝ^3 ⊗ ℝ^3 ⊗ ℝ^3 that is invariant under rotations? TensorKit.jl will automatically tell you that it only has 3 independent components.
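One can check that rank-4 count numerically with plain arrays (this is an illustration of the mathematics, not of the TensorKit API): the three independent invariants are the products of Kronecker deltas.

```julia
using LinearAlgebra

δ(i, j) = i == j ? 1.0 : 0.0
T1 = [δ(i,j)*δ(k,l) for i in 1:3, j in 1:3, k in 1:3, l in 1:3]
T2 = [δ(i,k)*δ(j,l) for i in 1:3, j in 1:3, k in 1:3, l in 1:3]
T3 = [δ(i,l)*δ(j,k) for i in 1:3, j in 1:3, k in 1:3, l in 1:3]

# Rotate all four indices with the same orthogonal matrix O:
# T[i,j,k,l] -> O[i,a] O[j,b] O[k,c] O[l,d] T[a,b,c,d]
O = Matrix(qr(randn(3, 3)).Q)
rotate(T) = [sum(O[i,a]*O[j,b]*O[k,c]*O[l,d]*T[a,b,c,d]
                 for a in 1:3, b in 1:3, c in 1:3, d in 1:3)
             for i in 1:3, j in 1:3, k in 1:3, l in 1:3]

@assert rotate(T1) ≈ T1 && rotate(T2) ≈ T2 && rotate(T3) ≈ T3
```

Every rotation-invariant rank-4 tensor on ℝ^3 is a linear combination of these three, hence the 3 independent components.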
Extending beyond this, tensors can also be fermionic, meaning that they are defined on tensor products of graded vector spaces and behave non-trivially under index permutations. (One could further extend this to tensors where one should not permute but braid indices, picking up non-trivial complex phases etc., but surely this is not implemented yet.) The full details will follow, and are most naturally described using some ingredients from category theory.
Hats off for this remarkable explanation/summary. Also, I didn’t know about TensorKit; this sounds like a super useful package for doing a lot of things in physics! I will definitely check it out.
Mathematicians would describe a tensor as a covariant or contravariant multilinear map into a field from the Cartesian product of copies of a vector space or of its dual. These tensors can have additional special properties, such as being antisymmetric or symmetric. For example, the det method (determinant) is a functor that takes antisymmetric covariant tensors and maps them to contravariant tensors.
Therefore, as far as mathematical definitions go, you can already work with tensors using only Base: if you are working with the determinant from base Julia, you are already working with tensors without extra packages.
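For instance, a quick check of this view, using nothing beyond Base and the standard LinearAlgebra library:

```julia
using LinearAlgebra

u, v, w = randn(3), randn(3), randn(3)
a, b = randn(), randn()

# det of the columns is linear in each argument ...
@assert det(hcat(a*u + b*v, w, u)) ≈ a*det(hcat(u, w, u)) + b*det(hcat(v, w, u))
# ... and antisymmetric: swapping two arguments flips the sign
@assert det(hcat(u, v, w)) ≈ -det(hcat(v, u, w))
```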
Except when they work in bundles / manifolds, where e.g. curvature R_{ijk}^l is a tensor, while the connection \Gamma_{ij}^k is not a tensor, but notation treats it like a tensor; and a scalar is not just a number, it is an invariant under $Group. This is a giant can of worms, and since mathematicians and physicists can’t get their own notation straight it is kinda unfair to fault computer scientists for appropriating words.
(but a package that makes for easy computation of geometric quantities, via AD, would be quite awesome)
I do have a package planned for that, called Multivectors.jl, but it is empty as of now. I’ve got some books and resources available to help me get started, and this discussion reminded me to return to it.
Just a rumor: the author of iTensor is writing a Julia version! You will soon (maybe just this year) have something really tensor XD
There is also Xtensor.jl, which uses the xtensor C++ library.
Just to be clear about this: while a tensor certainly can be defined as an object that transforms under certain operations etc., an equivalent definition is as a multi-argument function that is multilinear (linear in each of its arguments). [A good reference connecting these different definitions is the book “An Introduction to Tensors…” by N. Jeevanjee.] Then, if one works in a certain finite basis, the tensor is one-to-one equivalent to a multi-dimensional array.
So barring certain circumstances, such as infinite-dimensional spaces, it’s usually not wrong at all to equate a tensor with a multi-dimensional array as long as it’s clear what basis one is working in. In physics, and even in many applied math applications, it is often totally clear what the basis is. So then it’s not really wrong or misleading to use tensor as a shorthand for multi-dimensional array, since these are equivalent in such a setting and context.
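A short sketch of that one-to-one correspondence for a bilinear map on ℝ^3 (the names here are purely illustrative):

```julia
using LinearAlgebra

A = randn(3, 3)          # components in the standard basis
T(u, v) = dot(u, A * v)  # the corresponding bilinear (rank-2) map

e(i) = [j == i ? 1.0 : 0.0 for j in 1:3]   # standard basis vectors

# Evaluating T on basis vectors recovers exactly the array entries:
# A[i,j] = T(e_i, e_j)
@assert all(T(e(i), e(j)) ≈ A[i, j] for i in 1:3, j in 1:3)
```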