ANN: Tensars.jl: Tensors as linear mappings of multidimensional arrays

The Tensar package is the first software that I have publicly released, so please be gentle. It is my attempt to solve the following problems:

  • I have an n×n matrix that discretises d/dx. I am trying to compute the curl of a 3D field, stored in an n×n×n array. I have read @doc kron a dozen times in the last hour, my whiteboard is covered in block matrix diagrams, and I still can’t get it right.

  • Tensors (in the mathematical sense) are a great idea. Einstein notation is a genius way to manipulate them on paper, but it always seems clumsy when implemented in a computer language.

  • The up and down tuples from Scmlib were a really promising idea, and no one else has ported them to Julia yet.

  • In scientific computing, linear is a very important special case. Julia should have a standard way to declare which array-valued functions are linear, instead of DifferentialEquations having its own way, and every other project having its own different way.
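To make the first problem concrete, here is a hedged sketch using only LinearAlgebra (no Tensar API): applying a 1-D difference matrix D along the first axis of an n×n×n array means padding it with Kronecker identities, which is exactly the bookkeeping that goes wrong on the whiteboard. The matrix D here is a toy centred difference, purely for illustration.

```julia
using LinearAlgebra

# Toy centred-difference d/dx on n points (illustration only).
n = 4
D = diagm(1 => fill(0.5, n - 1), -1 => fill(-0.5, n - 1))

# Julia arrays are column-major (first index fastest), so d/dx acting on
# the first axis of an n×n×n array is I ⊗ I ⊗ D as a flat matrix.
Dx = kron(Matrix(1.0I, n, n), Matrix(1.0I, n, n), D)

u = randn(n, n, n)
dudx = reshape(Dx * vec(u), n, n, n)
# dudx[:, j, k] equals D * u[:, j, k] for every j, k
```

Getting the kron argument order right (and keeping it right for the other two axes of a curl) is the sort of thing the package aims to make unnecessary.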

This is research software for now, and the implementation is a proof of concept. There is no guarantee that it works for anything except the examples in the README.

I welcome any feedback, and I’d especially like to hear the following:

  • I’m re-inventing the wheel, and here’s a link to a tyre shop.

  • Scmlib had a clever and elegant way of doing X, which I have failed to understand and have replaced with an ugly kludge.

  • This would be useful for package X, but only if it could do Y.

  • There is a way to generalise this to tuples of different-sized arrays, without making it an order of magnitude more confusing.

13 Likes

This is an interesting, simple idea. I have certainly run into things like your first example: what shape should the Jacobian of an R^N -> R^M function be?

I wonder if these should interoperate with ordinary arrays more. Perhaps they should be <:AbstractArray{T,M+N}. And perhaps operations like A::Array * B::Tensar should try to interpret A… one rule would be that A::Matrix is always (1,1), but another would be to eat up as many dimensions as B needs.

julia> o2345 = reshape(ones(2,3,4,5), (2,3), (4,5))
4×5 → 2×3 Tensar{Float64}

julia> o456 = reshape(ones(4,5,6), (4,5), (6,))
6-vector → 4×5 Tensar{Float64}

julia> o2345 * o456 
6-vector → 2×3 Tensar{Float64}

julia> size(ans)
((2, 3), (6,))

julia> o2345[2,3,4,5] # no getindex
ERROR: MethodError: no method matching getindex(::Tensar{Float64,2,2}

julia> o2345 * reshape(ones(5,6,7), (5,),(6,7)) # no partial contractions
ERROR: DimensionMismatch("column size 4×5 does not match row size 5-vector")

Finally, I think it would be less confusing to print the size as 2×3 ← 4×5!

2 Likes

This looks nice. The idea that a tensor is a linear map between (if I can interpret freely) tensor products of vector spaces is an idea I am also using in my package TensorKit.jl, which I still have to officially announce. I hope to do so soon (the most important remaining task is re-reading the documentation).

While my package is mostly aimed at quantum physics, it is intended to be useful more generally, so I’d be happy to learn more about your use case. My package is completely disconnected from AbstractArray and defines a new type hierarchy and functionality on top of it.

My package Grassmann.jl is also based on multi-linear algebra and can compute the curl using the Grassmann exterior product and Hodge star. It’s based on geometric algebra instead of matrices or multi-dimensional arrays.

It uses the AbstractTensors.jl type hierarchy, if you’re interested in checking that out. Tensor products of vector spaces are also part of it, built on DirectSum.jl.

I heartily agree with this!

This is the same notation I’ve used in the past when working with multiple curved coordinate systems in geospatial. Without clear naming conventions, it’s all too easy to incorrectly chain transformations. I’ve found that when transform objects are named AfromB and BfromC, etc, it becomes manifestly clear when the code is correct:

composite = AfromB ∘ BfromC ∘ CfromD
y = AfromB(BfromC(CfromD(x)))

Compare to the following which just gives me a headache every time:

composite = BtoA ∘ CtoB ∘ DtoC
y = BtoA(CtoB(DtoC(x)))
2 Likes

I thought about the relation between Array and Tensar a lot. The README explains why I don’t think Tensar should be a subtype of AbstractArray. Roughly, the same Array corresponds to several Tensars that multiply differently. I think those casts should be explicit.

Finally, I think it would be less confusing to print the size as `2×3 ← 4×5` !

I considered that, but thought it looked too strange and only I would like it. Obviously I was wrong.

You can do partial contractions explicitly, with tensor product and trace. I haven’t thought about conventions for making them implicit broadcasts, but that sounds like a good idea.
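For what it’s worth, the explicit route can be sketched with plain arrays (no Tensar API assumed): a partial contraction of a 2×3×4×5 array with a 5×6×7 array over the shared length-5 index is just a reshaped matrix multiply, since the shared index is the last (slowest) dimension of the first array and the first (fastest) of the second.

```julia
# Hedged sketch: contract A[a,b,c,k] with B[k,d,e] over the shared index k.
A = randn(2, 3, 4, 5)
B = randn(5, 6, 7)
C = reshape(reshape(A, 2*3*4, 5) * reshape(B, 5, 6*7), 2, 3, 4, 6, 7)
# C[a,b,c,d,e] is the sum over k of A[a,b,c,k] * B[k,d,e]
```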

1 Like

I guess there are various use-cases, and certainly demanding that everything be of your type will be tidier.

The Jacobians I had in mind need to exist in a world where most things aren’t going to have any special type, but nevertheless it could be useful to flag such objects and treat them specially. Right now they work like this:

julia> jac = ForwardDiff.jacobian(x -> sum(x).*ones(2,3), ones(4,5));

julia> size(jac) # just a matrix...
(6, 20)

julia> jac * vec(ones(4,5)) |> size # ... so that this works
(6,)

julia> vec(ones(2,3))' * jac |> size # ... and this.
(1, 20)

Perhaps the ideal thing for this would in fact be a pair of wrappers: a TensorExpanded, which has size (2,3,4,5), and a TensorCollapsed, which has size (2*3,4*5); both remember that they are 2×3←4×5 and can be trivially converted. Changing jacobian() to return a TensorCollapsed would then not break existing uses.

But this may be far from the use cases you had in mind!
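A minimal sketch of what such a wrapper might look like; the name TensorCollapsed comes from the proposal above, while the fields and methods are my assumptions.

```julia
# Hedged sketch: a matrix that remembers its tensor shape.
struct TensorCollapsed{T} <: AbstractMatrix{T}
    data::Matrix{T}
    outsize::Tuple{Vararg{Int}}
    insize::Tuple{Vararg{Int}}
end
Base.size(t::TensorCollapsed) = size(t.data)
Base.getindex(t::TensorCollapsed, i::Int, j::Int) = t.data[i, j]

# Recover the expanded 2×3×4×5 form without losing the shape information.
expand(t::TensorCollapsed) = reshape(t.data, t.outsize..., t.insize...)

jac = TensorCollapsed(ones(6, 20), (2, 3), (4, 5))
size(jac)          # (6, 20): still acts as a plain matrix
size(expand(jac))  # (2, 3, 4, 5)
```

Because it subtypes AbstractMatrix, existing code like jac * vec(ones(4,5)) keeps working through generic matrix multiplication.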

1 Like

I understand the issue now. That sounds like a good way to sneak this into library functions like jacobian, and I’ll implement it. This is exactly what I had in mind, but I wasn’t thinking so far ahead.

1 Like

Sadly, A*B*C is (effectively) parsed as (A*B)*C.

I like the idea of things that look like matrices, but remember their tensor shape when used in a tensor context.

I’ll think about how to handle the jac*ones(4,5) issue. That needs to throw an error, since jac is pretending to be a (6,20) matrix and should do so consistently. In the tensar formalism, both Tensar(jac)*ones(4,5) and jac*Tensar(ones(4,5)) could work. How does your code handle jac*vec(ones(20))?

Do you have example code for that?

Thanks for reminding me to actually do curl as an example of Tensars. I just realised that I’ll need to specify and implement hcat and vcat to do it. Also, what kind of tensor library lacks an exterior product?

I’ve never taken the step past exterior calculus. Could you recommend a short introduction that assumes I already know about Hodge duals and exterior derivatives, and gets straight to the point of how geometric algebra will make my life better?

Does your library have a function which takes a vector of partial derivative matrices, and hands back the exterior derivative operator? That would be cool.

Yes, the code is implemented using two algebraic operations: the Hodge complement \star and the Grassmann exterior product \wedge, combined with the vector field \nabla. The formula used is \nabla \times \omega \equiv \star(\nabla\wedge\omega) = \star d\omega, where d\omega is the generalized exterior derivative. This also accepts higher-order tensors \omega as input, for example versors.
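As a concrete instance of the formula (a standard exterior-calculus computation, not specific to Grassmann.jl): for a 1-form \omega = F_1\,dx + F_2\,dy + F_3\,dz in \mathbb{R}^3,

```latex
d\omega = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right) dy\wedge dz
        + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right) dz\wedge dx
        + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right) dx\wedge dy,
```

and applying \star (which maps dy\wedge dz \mapsto dx, dz\wedge dx \mapsto dy, dx\wedge dy \mapsto dz) recovers exactly the components of \nabla \times F.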

I maintain an introduction and some references at https://grassmann.crucialflow.com/dev/algebra

1 Like

Great idea. DiffEqOperators can definitely be improved, and these are some nice ideas.

However,

The thing that your package is missing, though, is that the operators should all have call overloads A(u,p,t) and A(du,u,p,t) in order to be used freely as an alternative to A*u in a DiffEq context. They need update_coefficients! so that A*u can be a time- and state-dependent operator. Etc. All of the tensor libraries out there really represent “constant tensors”, which doesn’t necessarily fit most of the things we need this for (for example, state-dependent mass matrices).

That said, a “compatible operator” is just a function that implements the right functions, so if those calls are added to an operator of a separate library, like yours, then it’ll be compatible with DiffEq.
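The call-overload pattern described above can be sketched with a toy operator. The name update_coefficients! is the SciML-style hook from the post; the struct, its fields, and the time dependence here are my assumptions.

```julia
# Hedged sketch: a time-dependent operator A(t) = (1 + t) * A0.
mutable struct ScaledOperator
    A0::Matrix{Float64}
    α::Float64          # cached coefficient, refreshed by update_coefficients!
end

# "Put in u, p, t and get the up-to-date one": refresh the cache in place.
function update_coefficients!(L::ScaledOperator, u, p, t)
    L.α = 1 + t
    return L
end

# The call overload DiffEq expects: A(u, p, t) acts like the updated A * u.
(L::ScaledOperator)(u, p, t) = (update_coefficients!(L, u, p, t); L.α * (L.A0 * u))
```

With both pieces in place, the same object can be used as a constant operator after an explicit update, or as the state- and time-dependent A(u,p,t) that the solvers call.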

Thanks for looking at this, Chris! I was curious to see what you think.

mul! is on the to-do list, along with a cast to DiffEqOperator. I am not sure that I correctly understand the purpose of update_coefficients!. Is that described somewhere I’ve missed?

Has anyone coded up that example, or at least written down the ODEs?

Yes, these are our tests for time-dependent u' = A(t)u and state-dependent u' = A(u,p,t)u ODE solvers:

And here are tests for time-dependent M(t)u' = f(u,p,t) and state-dependent M(u,p,t)u' = f(u,p,t) mass matrices:

It’s because on the caller side we need a way to control a dependent matrix and change it to the time and state that we want. Most of these tensor interfaces exist only on the callee side: you know how to give me the tensor, but the interface gives me no nice way to ask for a tensor updated to the state I want. update_coefficients! is just a standard “I put in u, p, t and get the up-to-date one”.

2 Likes

I see. The primary requirement for DifferentialEquations is that the matrix for this step reuses the same storage as the previous million steps. Doing clever things with linear operators and matrix exponentials is a secondary goal. I thought it was the other way round.

No, it does both. The operator also supplies expv! overloads for https://github.com/SciML/ExponentialUtilities.jl and linear solver overloads as necessary. That’s just the easy stuff I’d expect out of an operator though, so I assume any operator would do those two parts. But the dependence part is a part that I don’t see any other interface have, and it’s why we have to have our own.

1 Like