eigvals() results are slightly different when Distributed is used

Hi friends, I have recently been working on some eigenvalue problems in quantum physics. I need to construct and diagonalize quite a few real symmetric matrices, so I am using Distributed, the standard library.

Here is a minimal example:

using Distributed
@everywhere using LinearAlgebra

# build a random real symmetric matrix (body filled in; the original post omitted it)
function symrand(n)
  a = randn(n, n)
  return Symmetric(a + a')
end

mats = [symrand(10) for _ in 1:1]

# distributed version
function exec1()
  temp = []
  # cycle through the workers, sending one matrix to each
  for (p, m) in Iterators.zip(Iterators.cycle(workers()), mats)
    push!(temp, remotecall(eigvals, p, m))
  end
  for j in temp
    @show fetch(j)
  end
end

# serial version
function exec2()
  for m in mats
    @show eigvals(m)
  end
end

@time exec1() # ...1
@time exec1() # ...2
@time exec1()

@time exec2() # ...3
@time exec2()
@time exec2()

We can then save the code above as eigs.jl and run it with julia -p 2 eigs.jl.
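Incidentally, the remotecall/fetch loop in exec1 can also be written with pmap from Distributed, which handles the worker scheduling for you. A minimal sketch (the symrand helper here is a hypothetical stand-in for the one above; the floating-point behavior discussed below is the same either way):

```julia
using Distributed
addprocs(2)                      # same effect as starting with julia -p 2
@everywhere using LinearAlgebra

# hypothetical helper, defined on all workers so pmap can call eigvals remotely
@everywhere function symrand(n)
    a = randn(n, n)
    Symmetric(a + a')
end

mats = [symrand(10) for _ in 1:4]

# pmap distributes the eigvals calls across workers and collects results in order
results = pmap(eigvals, mats)
```

This gives the same per-matrix results as calling remotecall/fetch by hand, with less bookkeeping.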

I find that the distributed results of eigvals() (also eigvecs and eigen) differ from the serial ones (i.e., outputs 1 and 3 differ slightly), while running the same function several times always gives the same result (i.e., 1 and 2 produce exactly the same output).

It should be mentioned that the difference is very small. I have read a bit about floating-point arithmetic, but I am not sure whether it applies here. Did I write something wrong, is this a possible bug in the library, or can it simply not be avoided?
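For intuition on how this can happen: floating-point addition is not associative, so any change in the order in which a reduction is carried out (which can differ between BLAS kernels, thread counts, or code paths on different processes) can change the last bits of the result. A tiny illustration:

```julia
# Floating-point addition is not associative: the grouping changes the
# rounding, so the two sums differ in the last bit.
a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
b = 0.1 + (0.2 + 0.3)   # 0.6
@show a == b            # false
@show abs(a - b)        # on the order of 1e-16
```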

The error is around 10e-15, so it is certainly a floating-point effect; I just don’t know “where”.

Unless you need high-precision results, this difference is small enough to ignore :grinning:


In general, there is no guarantee that linear algebra operations behave identically across machines and implementations. For instance, even on the same machine and with the same compiler optimizations, changing the BLAS library you use will slightly alter your results. You can look into projects such as ReproBLAS if you need exact reproducibility (warning: that may come at a large performance cost).
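One knob that sometimes matters on a single machine: the BLAS library may choose different kernels or reduction orders depending on its thread count, so pinning the number of BLAS threads can make runs more reproducible locally. Treat this as a mitigation, not a guarantee — it does not give cross-machine reproducibility:

```julia
using LinearAlgebra

# Query and pin the number of BLAS threads. Single-threaded BLAS removes
# one source of run-to-run variation in how reductions are ordered.
@show BLAS.get_num_threads()
BLAS.set_num_threads(1)
@show BLAS.get_num_threads()  # 1
```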
