Problem
I am trying to implement a majorization-minimization (MM) algorithm to solve the nonnegative matrix factorization (NMF) problem
$$\min_{\mathbf{V} \ge 0,\, \mathbf{W} \ge 0} \|\mathbf{X} - \mathbf{V}\mathbf{W}\|_F^2.$$
The optimal $\mathbf{V}$ and $\mathbf{W}$ are found using the iterative multiplicative updates
$$v_{ik} \leftarrow v_{ik} \frac{\sum_j x_{ij} w_{kj}}{\sum_j b_{ij} w_{kj}}, \qquad w_{kj} \leftarrow w_{kj} \frac{\sum_i x_{ij} v_{ik}}{\sum_i b_{ij} v_{ik}}, \qquad \mathbf{B} = \mathbf{V}\mathbf{W},$$
with termination criterion
$$\frac{|L^{(t)} - L^{(t-1)}|}{|L^{(t-1)}| + 1} \le \text{tol}, \qquad L = \|\mathbf{X} - \mathbf{V}\mathbf{W}\|_F^2.$$
The implementation is fairly straightforward and it does converge, but it turns out to be extremely slow and memory-hungry even for a moderately sized problem.
Note: in the following, `V` and `W` passed into the function are initial guesses, `X` is the matrix we are trying to approximate using the MM algorithm, and `r` is the hyperparameter that controls the granularity (rank) of the approximation.
using LinearAlgebra

function nnmf(X::Matrix{T},
              r::Integer;
              maxiter::Integer=1000,
              tol::Number=1e-4,
              V::Matrix{T}=rand(T, size(X, 1), r),
              W::Matrix{T}=rand(T, r, size(X, 2))) where T <: AbstractFloat
    L = 0
    row, col = size(X)
    for iter in 1:maxiter
        # Step 1
        b = V * W
        # Step 2
        V_new = copy(V)
        for k in 1:r
            for i in 1:row
                weight = (X[i, :]' * W[k, :]) / (b[i, :]' * W[k, :])
                V_new[i, k] *= weight
            end
        end
        # Step 3
        b_new = V_new * W
        # Step 4
        W_new = copy(W)
        for j in 1:col
            for k in 1:r
                weight = (X[:, j]' * V_new[:, k]) / (b_new[:, j]' * V_new[:, k])
                W_new[k, j] *= weight
            end
        end
        # update V, W
        V, W = V_new, W_new
        L_new = norm(X - V * W)^2
        rel_diff = abs(L_new - L) / (abs(L) + 1)
        if rel_diff <= tol
            break
        end
        # update L
        L = L_new
        println("Iteration: $iter, Relative Difference: $rel_diff")
    end
    return V, W
end
I am pretty new to programming in Julia; could someone tell me if anything above could be improved?
Edit
Thank you for all of your advice. I followed it and got a significant performance increase in terms of time (from 150+ s down to about 7 s). The code also became more elegant once as much of the work as possible was expressed as matrix multiplication. However, the memory allocation is still concerning (8.25 GB).

I read the performance tips section in the official docs. It seems that pre-allocating memory could help, but I am not sure how to do it.

Could anyone provide some pointers? Thank you in advance.
function nnmf(X::Matrix{T},
              r::Integer;
              maxiter::Integer=1000,
              tol::Number=1e-4,
              V::Matrix{T}=rand(T, size(X, 1), r),
              W::Matrix{T}=rand(T, r, size(X, 2))) where T <: AbstractFloat
    L = 0
    row, col = size(X)
    V_new = zeros(eltype(V), row, r)
    W_new = zeros(eltype(W), r, col)
    for iter in 1:maxiter
        # Step 0: evaluate last step's result
        res = V * W
        L_new = norm(X - res)^2
        rel_diff = abs(L_new - L) / (abs(L) + 1)
        if rel_diff <= tol
            break
        end
        # update L
        L = L_new
        # Step 1
        copyto!(V_new, V)
        copyto!(V_new, V_new .* ((X * W') ./ (res * W')))
        # Step 2
        copyto!(W_new, W)
        copyto!(W_new, W_new .* ((X' * V_new) ./ (W' * V_new' * V_new))')
        # Step 3
        # update V, W
        copyto!(V, V_new)
        copyto!(W, W_new)
    end
    return V, W
end
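In case it clarifies what I am asking about: here is one way I imagine the pre-allocation idea from the performance tips could look. This is only a sketch (the `nnmf!` name and the buffer names are mine, not from any library): every matrix product is written into a buffer allocated once via `LinearAlgebra.mul!`, and the multiplicative updates are fused broadcasts (`.*=`), so the loop body itself should allocate almost nothing.

```julia
using LinearAlgebra

# Sketch: in-place NMF updates with pre-allocated buffers.
# V and W are mutated in place (hence the `!` convention).
function nnmf!(V::Matrix{T}, W::Matrix{T}, X::Matrix{T};
               maxiter::Integer=1000, tol::Real=1e-4) where T<:AbstractFloat
    row, col = size(X)
    r = size(V, 2)
    B    = Matrix{T}(undef, row, col)  # buffer for V * W
    XWt  = Matrix{T}(undef, row, r)    # buffer for X * W'
    BWt  = Matrix{T}(undef, row, r)    # buffer for B * W'
    VtX  = Matrix{T}(undef, r, col)    # buffer for V' * X
    VtV  = Matrix{T}(undef, r, r)      # buffer for V' * V
    VtVW = Matrix{T}(undef, r, col)    # buffer for (V' * V) * W
    L = zero(T)
    for iter in 1:maxiter
        mul!(B, V, W)                       # B = V * W, written in place
        L_new = zero(T)
        @inbounds for i in eachindex(X)     # allocation-free loss
            L_new += abs2(X[i] - B[i])
        end
        abs(L_new - L) / (abs(L) + 1) <= tol && break
        L = L_new
        # V .= V .* (X * W') ./ (B * W'), with no per-iteration temporaries
        mul!(XWt, X, W')
        mul!(BWt, B, W')
        V .*= XWt ./ BWt
        # W .= W .* (V' * X) ./ (V' * V * W), using the freshly updated V
        mul!(VtX, V', X)
        mul!(VtV, V', V)
        mul!(VtVW, VtV, W)
        W .*= VtX ./ VtVW
    end
    return V, W
end
```

The key point is that `mul!(C, A, B)` overwrites `C` instead of allocating a result, and a dotted assignment like `V .*= XWt ./ BWt` fuses into a single in-place broadcast; whether this is the idiomatic way to structure it, I would appreciate feedback on.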