I’m trying to solve a matrix factorization problem with alternating minimization. My data matrix has missing entries, so I cannot simply write my objective function as:

```
# A is an n x m data matrix
n, m = size(A)
k = 2
X = Variable(n, k)
Y = Variable(k, m)
problem = minimize(sumsquares(A - X*Y), (X*Y)>=0)
```

I have come up with two ways to address this issue, but one is very slow and uses too much memory, while the other gives `NaN` results in a lot of experiments. I’ll describe both ways of writing the objective function below:

##### Approach 1

```
objective = 0
for o in observations
    objective += square(A[o...] - (X*Y)[o...])
end
```

where `observations` is an Array of tuples, each containing the indices of an entry of `A` that was actually observed, i.e., non-missing.
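For concreteness, `observations` can be built with `findall` over the missing-entry pattern (a sketch; the example matrix `A` here is made up):

```julia
# Small example matrix with one missing entry (illustration only).
A = [1.0 missing; 2.0 3.0]

# Collect (i, j) index tuples for the observed (non-missing) entries.
# findall returns CartesianIndex values in column-major order.
observations = [(I[1], I[2]) for I in findall(!ismissing, A)]
```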

This approach uses all the RAM on my machine and is really slow, both when calling `problem = minimize(objective, constraints)` and every time `solve!(problem, SCSSolver())` is called.

##### Approach 2

```
B = coalesce.(A, 0.0) # replace missing entries with 0.0 for .* to work
objective = sumsquares(M .* B - M .* (X*Y))
```

where `M` is a `BitArray{2}`, with `1`s representing observed values and `0`s the missing ones.
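For reference, the mask `M` and the zero-filled copy `B` can be constructed like this (a sketch with a made-up example `A`):

```julia
# Example data matrix with one missing entry (illustration only).
A = [1.0 missing; 2.0 3.0]

# Mask of observed entries: true where observed, false where missing.
M = .!ismissing.(A)

# Zero-filled copy of A so that elementwise multiplication is defined.
B = coalesce.(A, 0.0)
```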

This approach is faster, I suppose because the DCP expression tree has significantly fewer branches. The issue is that running my program with this approach, on the same data and with the same `Random.seed!`, often gives me a `Y` matrix full of `NaN`s or an overall bad approximation.

Does anyone know a better way of writing this problem using `Convex.jl`?