As the title suggests, I have a question about the behavior of dot notation.
Background
- I am implementing the conjugate gradient method.
- I am trying to speed up the vector operations by using dot notation (fused broadcasting); a toy example of what I mean follows this list.
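For reference, this is the kind of fused, in-place update I mean (x, y, a are made-up names, not from my solver):

x = rand(3); y = rand(3); a = 2.0
@. x += a * y   # equivalent to x .+= a .* y: one fused loop, no temporary arrays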
The problem
- I am getting different results with and without dot notation.
I guess I don’t understand the behavior of dot notation very well…
The code
using LinearAlgebra  # for dot

struct EqVariable
    A::Matrix{Float64}
    b::Vector{Float64}
end
mutable struct CgVariable
    eqvar::EqVariable
    # new values (updated each iteration)
    xk::Vector{Float64}
    rk::Vector{Float64}
    pkp::Vector{Float64}
    α::Float64
    β::Float64
    # current search direction
    pk::Vector{Float64}
end
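As a side note on the struct above: iterate_CG1 below rebinds the field with var.pk = var.pkp, which is only legal because CgVariable is mutable. A minimal check of that distinction (Frozen and Thawed are made-up toy types):

struct Frozen; x::Vector{Float64}; end
mutable struct Thawed; x::Vector{Float64}; end
f = Frozen([1.0]); t = Thawed([1.0])
t.x = [2.0]     # fine: fields of a mutable struct can be rebound
# f.x = [2.0]   # would throw: immutable struct fields cannot be reassigned
f.x[1] = 2.0    # still fine: the array itself is mutable either way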
"""
Conjugate gradient algorithm
"""
function run_conjugate_gradient!(x0::Vector{Float64},
A::Matrix{Float64},
b::Vector{Float64};
max_iter::Int=2000, threshold::Float64=10e-6,
mode::Int=1)
if mode == 1
iterate_CG = iterate_CG1
else
iterate_CG = iterate_CG2
end
# initial iteration
Ab = EqVariable(A, b)
rk = b .- A * x0
pk = rk
var = CgVariable(
Ab,
x0, rk, similar(pk), 0.0, 0.0,
pk
)
@inbounds for i in 1:max_iter
var = iterate_CG(var)
err = sum(abs.(var.rk))
err < threshold && break
end
return var.xk
end
The problem is in the following two functions.
"""
Basic iteration of the conjugate gradient algorithm1(true)
"""
@inline function iterate_CG1(var::CgVariable)
c = var.eqvar.A * var.pk
sq_rk = dot(var.rk, var.rk)
var.α = sq_rk / dot(var.pk, c)
@. var.xk += var.α * var.pk
@. var.rk -= var.α * c
var.β = dot(var.rk, var.rk) / sq_rk
@. var.pkp = var.rk + var.β * var.pk
var.pk = var.pkp # Differences from iterate_CG2
return var
end
"""
Basic iteration of the conjugate gradient algorithm2(false)
"""
@inline function iterate_CG2(var::CgVariable)
c = var.eqvar.A * var.pk
sq_rk = dot(var.rk, var.rk)
var.α = sq_rk / dot(var.pk, c)
@. var.xk += var.α * var.pk
@. var.rk -= var.α * c
var.β = dot(var.rk, var.rk) / sq_rk
@. var.pkp = var.rk + var.β * var.pk
@. var.pk = var.pkp # Differences from iterate_CG1
return var
end
As the comments indicate, the only difference between the two functions is the final assignment to var.pk, yet iterate_CG2, which uses dot notation there, computes the wrong answer.
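To narrow down my confusion, here is a reduced, self-contained sketch of the two assignment styles outside the CG code (p, q, r are toy arrays, not the solver's fields):

p = [1.0, 2.0]
q = p                   # q aliases p, just like pk = rk above
r = [3.0, 4.0]

q = r                   # iterate_CG1 style: rebind the name q
# p is untouched (p == [1.0, 2.0]); q and r now name the same array

p = [1.0, 2.0]; q = p; r = [3.0, 4.0]   # reset
@. q = r                # iterate_CG2 style: copy r's elements in place
# now p == [3.0, 4.0] as well, because p and q still share one array

If I read this correctly, the in-place version also overwrites every other name bound to the same array, which may be where my two modes diverge, but I would like to understand this properly.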
Example
This example is taken from Wikipedia (Conjugate gradient method - Wikipedia).
julia> A, b = Float64[4 1; 1 3], Float64[1, 2];
julia> A\b
2-element Vector{Float64}:
0.09090909090909091
0.6363636363636364
And with my functions:
julia> run_conjugate_gradient!(Float64[2, 1], A, b; mode=1)
2-element Vector{Float64}:
0.09090911622366192
0.6363636009220108
julia> run_conjugate_gradient!(Float64[2, 1], A, b; mode=2)
2-element Vector{Float64}:
0.08868153729500161
0.6388421015088749
The difference between these results becomes even more pronounced in other numerical examples.
Why are the results different for mode=1 (iterate_CG1) and mode=2 (iterate_CG2)?
Version information
julia> versioninfo()
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_ERROR_COLOR = red