Tensor regression models in Julia


#1

I’m trying to create a tensor regression model in Julia. I first tried PyTorch, but it was becoming a hassle, so I thought I’d give Julia a try, and it’s looking like Julia can’t do it at all (at least not in any efficient way).

A tensor regression is just a series of tensor contractions (“generalized matrix multiplications”), where the tensors (multi-dimensional arrays) act as trainable parameters. I want to train these parameter tensors with gradient descent, as I would a normal neural network (which is a series of matrix-matrix multiplications with interspersed non-linear functions). The problem is that TensorOperations.jl, the major library that supports tensor contractions, does not work with either of Julia’s main ML libraries, Knet.jl and Flux.jl. Both of those libraries silently wrap your Arrays in special tracked types so they can record gradients, and TensorOperations.jl can’t handle those types.

I tried Einsum.jl, and it technically works with Flux.jl, but only with outrageously bad performance. It took literally 2 minutes to do a very small tensor contraction with Flux’s tracked arrays (vs. 1.3 seconds with plain arrays).

I’m picking up Julia again after trying it back at v0.3 so I may just be missing something here. Any help appreciated.


#2

How complicated are the contractions you need? If you can rewrite them with reshape and permutedims, then Flux (and probably other options) should work well. Here’s A_{i,j,k} B_k = C_{i,j}:

using Flux
A = param(rand(2,2,7))
B = param(rand(7))
# fuse (i,j) into one axis, multiply, then split it back out
reshape(A, 4,7) * B  |>  C -> reshape(C, 2,2)

#3

After reading the readme: TensorOperations seems to reduce everything to three functions, add!, trace!, and contract!.

It seems like it would not be very hard to write fallback versions of these functions in terms of reshape etc., as above. Or, in fact, to just work out a gradient for each of these and provide it to Flux.back.
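For what it’s worth, here is roughly what a hand-derived gradient for one such contraction could look like; a minimal sketch in plain Julia (the names contract3, ∇A, ∇B are made up for illustration), which one could then register with Flux’s tracker:

```julia
# Sketch: hand-derived gradients for the contraction C[i,j] = Σ_k A[i,j,k] * B[k],
# i.e. the reshape-based version from the post above.
contract3(A, B) = reshape(reshape(A, :, length(B)) * B, size(A, 1), size(A, 2))

# Given Δ = ∂L/∂C, the pullbacks are:
#   ∂L/∂A[i,j,k] = Δ[i,j] * B[k]          (outer product of vec(Δ) with B)
#   ∂L/∂B[k]     = Σ_{i,j} Δ[i,j] * A[i,j,k]  (matricized A applied to vec(Δ))
∇A(Δ, A, B) = reshape(vec(Δ) * B', size(A))
∇B(Δ, A, B) = reshape(A, :, length(B))' * vec(Δ)
```

Both pullbacks stay as ordinary BLAS calls, so there shouldn’t be any of the tracked-array slowdown you saw with Einsum.jl.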


#4

I’m trying to do this contraction:

C_{s,k} = X_{s,h}A_{h,i,j}B_{i,j,k}

where X is the input data tensor and A and B are the trainable parameter tensors. In particular I’m trying to do a tensor regression on MNIST, so X is [batch size, 784] and the result is [batch size, 10], one score per digit class.
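For what it’s worth, that contraction factors into two ordinary matrix multiplications, since A and B share both contracted indices i and j: first W_{h,k} = Σ_{i,j} A_{h,i,j} B_{i,j,k}, then C = X W. A sketch with illustrative sizes (h = 784 pixels and k = 10 classes are from the post; the inner sizes i = j = 16 are made up):

```julia
# Sketch of C[s,k] = X[s,h] A[h,i,j] B[i,j,k] as two plain matmuls.
s, h, i, j, k = 32, 784, 16, 16, 10

X = rand(s, h)      # input batch
A = rand(h, i, j)   # trainable
B = rand(i, j, k)   # trainable

# Julia is column-major, so reshape fuses the (i, j) axes consistently
# in both A and B, and no permutedims is needed here.
W = reshape(A, h, i*j) * reshape(B, i*j, k)  # W[h,k] = Σ_{i,j} A[h,i,j] B[i,j,k]
C = X * W                                    # C[s,k]
```

Since the contracted indices are already adjacent and in matching order, the reshapes are free and the whole thing is two BLAS gemm calls, which Flux’s tracked arrays handle natively.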

I’ll try what you suggested.


#5

I agree with @improbable22 that these gradients should definitely be added as methods to Flux.back.


#6

Slightly off topic, but it reminds me: I wonder whether the work described in this article could be efficiently developed in Julia…