Problem
I am trying to solve a large scale (about 1 million webpages) PageRank problem with iterative solvers. The problem could be cast into solving the linear system
where \mathbf{A} is the connectivity matrix that is specified as
The correctness of this formulation has been validated in some toy examples.
However, when trying to solve this with eigen solver provided in Arpack.jl
, the process
- Extremely slow when number of Krylov vectors is large
- Could never converge when number of Krylov vectors is small or
maxiter
is small
The major bottleneck should arise from linear mapping function I wrote, especially the fact that I have to use a lot of global variables. However, I do not know how to resolve the issues in my code.
using LinearMaps, LinearAlgebra, SparseArrays, MatrixDepot, Arpack
# loading A
md = mdopen("SNAP/web-Google")
A = md.A
# global variables
p = 0.85
row_idx, col_idx, _ = findnz(A)
N = size(A, 1)
# compute out-degree of each page (row)
R = Dict(1:N .=> 0)
for i in row_idx
R[i] += 1
end
prob = zeros(N)
# prob
# when R[i] == 0, prob[i] = 1/N
# when R[i] != 0, prob[i] = 1/R[i]
for (key, value) in R
if value == 0
prob[key] = 1 / N
else
prob[key] = 1 / value
end
end
# non_zero_idx_list
non_zero_idx_list = []
for i in 1:N
push!(non_zero_idx_list, findall(!iszero, A[:, i]))
end
# iterative eigen-solver
function EigenMapping(x::AbstractVector)
y = zeros(N)
# first part
@inbounds for i in 1:N
non_zero_idx = non_zero_idx_list[i]
y[i] = p * dot(x[non_zero_idx], prob[non_zero_idx])
end
# second part
y += (1 - p) / N * ones(N)
end
# solve for eigenvector
EigenLinearMap = LinearMap{Float64}(EigenMapping, N)
@time _, eig_sol = eigs(EigenLinearMap, nev=1, which=:LM, maxiter=300)
Could anyone help me, thank you in advance.