We have four tensors A, B, C, D of the same dimensions (m × n × p, e.g. 1800 × 25 × 25). We need to find coefficients x1, x2, x3 such that:
x1 * A - x2 * B - x3 * C ≈ D
given that x1 + x2 + x3 = 1 and
0 ≤ x1, x2, x3 ≤ 1.
Is there an efficient way to find the coefficients x1, x2, x3 such that the combination is approximately equal to D? By approximately equal we mean that each element of the resultant tensor on the LHS of the equation is almost equal to the corresponding element of the tensor D on the right. Also, is it possible to generalize to more than 3 weights, e.g. if we had close to 50 tensors on the LHS and wanted a weighted combination of them?
This seems like a convex optimization problem. You could take the tutorial example from Convex.jl and modify it slightly to run on your data by stacking flattened versions of your arrays as columns of a new array.
Exactly how you define “approximately equal” will matter when choosing the cost function; sumsquares below would then be replaced by whatever you want.
using Convex, SCS

# Stack the vectorized tensors as columns and fit weights in [0, 1] that sum to 1.
ABC = [A[:] B[:] C[:]]   # each column is one flattened tensor
b = D[:]                 # flattened target tensor
x = Variable(3)          # the three weights
problem = minimize(sumsquares(ABC * x - b), [x >= 0, x <= 1, sum(x) == 1])
solve!(problem, SCS.Optimizer)
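Once it has solved, the weights and the fit quality can be inspected like this (a small sketch assuming the code above has been run):

println(problem.status)   # solver termination status
w = evaluate(x)           # the fitted weights x1, x2, x3
println(w, ", sum = ", sum(w))
resid = reshape(ABC * w - b, size(D))   # elementwise error against D
println("max abs error = ", maximum(abs.(resid)))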
You have to make more precise what you mean by “approximately equal”. Typically, you would minimize \Vert x_1 A - x_2 B - x_3 C - D \Vert in some norm \Vert \cdots \Vert, but which norm do you want?
@albheim’s solution suggested the L_2 (Euclidean) “vectorized” (Frobenius) norm, which makes this a constrained least-squares problem. (If you let x_3 = 1 - x_1 - x_2 to eliminate the equality constraint, then it is bound-constrained least-squares; see also this thread: Suggestions needed for bound-constrained least squares solver). If you use the L_1 or L_\infty vectorized norms, then it is equivalent to an LP (linear programming) problem. In any of these cases it is convex, and so it can be solved by lots of algorithms (e.g. via Convex.jl, JuMP.jl, StructuredOptimization.jl, …). With only 3 unknowns (or 2, if you eliminate x_3) and 10^6 equations, it’s not a particularly big problem, so lots of algorithms should be fine.
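For example, the L_\infty version can be written with almost the same Convex.jl code as above (a sketch reusing @albheim’s ABC and b; abs and maximum are Convex.jl atoms, so the problem stays convex and is equivalent to an LP):

using Convex, SCS
x = Variable(3)
# Minimize the worst-case elementwise error instead of the sum of squares.
problem = minimize(maximum(abs(ABC * x - b)), [x >= 0, x <= 1, sum(x) == 1])
solve!(problem, SCS.Optimizer)
# For the L_1 norm, use sum(abs(ABC * x - b)) as the objective instead.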
Hi @albheim, thanks for the response. By approximately equal we mean that each element of the resultant tensor on the LHS is almost equal to the corresponding element of D, so I’m trying to find weights such that the weighted sum on the LHS is elementwise close to the tensor D on the RHS.
Hi @stevengj, thanks for the response. By approximately equal we mean that each element of the resultant tensor on the LHS is almost equal to the corresponding element of D, so I’m trying to find weights such that the weighted sum on the LHS is elementwise close to the tensor D on the RHS. Can you suggest a specific norm for this purpose?
Also, is it possible to generalize to more than 3 weights, e.g. if we had close to 50 tensors on the LHS and wanted a weighted combination of them?
Thanks!
In general there may be no solution in which the LHS and RHS are “almost equal”. There will be some errors, which may or may not be small. The question is: how do you want the errors to be distributed?
Do you want them to be as small as possible “on average”? Then probably minimize the L^2 (sum-of-squares) error. This gives you a least-squares (QP) problem.
Do you want the biggest individual error to be as small as possible? Then probably minimize the L^\infty (maximum-norm) error. This gives you an LP.
If you’re not sure, I would just try the L^2 norm — least-squares minimization is the most common choice and has the easiest algorithms.
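If you want to see how these choices play out on your data, you can compare the two error measures for any candidate weights directly; a small sketch (fit_errors is just a hypothetical helper name, with ABC and b as defined earlier):

# Report both error measures for a weight vector w.
function fit_errors(w, ABC, b)
    r = ABC * w - b                          # elementwise residuals
    (rms = sqrt(sum(abs2, r) / length(r)),   # average error, what L^2 targets
     worst = maximum(abs.(r)))               # largest error, what L^inf targets
end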
Sure. As @albheim suggested, I would start by “vectorizing” your tensors into column vectors, since the tensor shape seems to be irrelevant to your problem. Let d = vec(D), and let W be the matrix whose columns are the “vectorized” versions of all of the tensors on the LHS, and flip signs so that all of the terms on the LHS are + (no -). Suppose you have n such tensors, so that you have n weights x and W has n columns. Then you are solving:
\min_x \Vert d - Wx \Vert \\
\mbox{subject to }\sum_k x_k = 1, \; 0 \le x_j \le 1
which is a convex problem, a constrained least-squares problem in the L2 norm. If you eliminate x_n by setting x_n = 1 - \sum_{k=1}^{n-1} x_k, then it simplifies to a box-constrained problem:
\min_{\tilde{x}} \Vert \tilde{d} - \tilde{W}\tilde{x} \Vert \\
\mbox{subject to } 0 \le \tilde{x}_j \le 1 \mbox{ for } j=1,\ldots,n-1
where \tilde{x} is the first n-1 components of x, \tilde{W} is the first n-1 columns of W with the last column of W subtracted from each, and \tilde{d} is d minus the last column of W. In the L^2 norm, there are lots of approaches to box-constrained least-squares, some of which are linked in the thread I mentioned above.
(Note that to match the original problem, this should be ABC = [A[:] -B[:] -C[:]]. But I agree that it is more natural to redefine the terms on the LHS to make all of the coefficients +.)
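For the generalization to many tensors (e.g. ~50 of them), the same Convex.jl formulation works once W is assembled column by column. A sketch, assuming tensors is a Vector holding the LHS tensors with the signs already flipped as described above:

using Convex, SCS

# tensors is assumed to be a Vector of same-sized arrays (the LHS tensors,
# with signs flipped so every term enters with a + coefficient); D is the target.
W = reduce(hcat, vec.(tensors))   # each column is one vectorized tensor
d = vec(D)
x = Variable(length(tensors))
problem = minimize(sumsquares(W * x - d), [x >= 0, x <= 1, sum(x) == 1])
solve!(problem, SCS.Optimizer)
weights = evaluate(x)             # the fitted weights, one per tensor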