Question on the behavior of dot notation

As the title suggests, I have a question about the behavior of dot notation.

Background

  • I am implementing the conjugate gradient method.
  • I am trying to speed up the vector processing by using dot notation.

The problem

  • I am getting different results with and without dot notation.

I guess I don’t understand the behavior of dot notation very well…

The code

struct EqVariable
  A ::Matrix{Float64}
  b ::Vector{Float64}
end
mutable struct CgVariable
  eqvar::EqVariable
  # new
  xk::Vector{Float64}
  rk::Vector{Float64}
  pkp::Vector{Float64}
  α::Float64
  β::Float64
  # now
  pk::Vector{Float64}
end

"""
Conjugate gradient algorithm
"""
function run_conjugate_gradient!(x0::Vector{Float64},
                                 A::Matrix{Float64},
                                 b::Vector{Float64};
                                 max_iter::Int=2000, threshold::Float64=10e-6,
                                 mode::Int=1)
  if mode == 1
    iterate_CG = iterate_CG1
  else
    iterate_CG = iterate_CG2
  end

  # initial iteration
  Ab = EqVariable(A, b)
  rk = b .- A * x0
  pk = rk

  var = CgVariable(
    Ab,
    x0, rk, similar(pk), 0.0, 0.0,
    pk
  )
  @inbounds for i in 1:max_iter
    var = iterate_CG(var)
    err = sum(abs.(var.rk))
    err < threshold && break
  end
  return var.xk
end

The problem is in the following two functions.

"""
Basic iteration of the conjugate gradient algorithm, version 1 (gives the correct result)
"""
@inline function iterate_CG1(var::CgVariable)
  c = var.eqvar.A * var.pk
  sq_rk = dot(var.rk, var.rk)
  var.α = sq_rk / dot(var.pk, c)
  @. var.xk += var.α * var.pk

  @. var.rk -= var.α * c
  var.β = dot(var.rk, var.rk) / sq_rk

  @. var.pkp = var.rk + var.β * var.pk
  var.pk = var.pkp              # Differences from iterate_CG2

  return var
end


"""
Basic iteration of the conjugate gradient algorithm, version 2 (gives the wrong result)
"""
@inline function iterate_CG2(var::CgVariable)
  c = var.eqvar.A * var.pk
  sq_rk = dot(var.rk, var.rk)
  var.α = sq_rk / dot(var.pk, c)
  @. var.xk += var.α * var.pk

  @. var.rk -= var.α * c
  var.β = dot(var.rk, var.rk) / sq_rk

  @. var.pkp = var.rk + var.β * var.pk
  @. var.pk = var.pkp           # Differences from iterate_CG1

  return var
end

As the comments indicate, using dot notation as in iterate_CG2 produces the wrong answer.

Example

This example is taken from Wikipedia (Conjugate gradient method - Wikipedia).

julia> A, b = Float64[4 1; 1 3], Float64[1, 2];

julia> A\b
2-element Vector{Float64}:
 0.09090909090909091
 0.6363636363636364

And my functions…

julia> run_conjugate_gradient!(Float64[2, 1], A, b; mode=1)
2-element Vector{Float64}:
 0.09090911622366192
 0.6363636009220108

julia> run_conjugate_gradient!(Float64[2, 1], A, b; mode=2)
2-element Vector{Float64}:
 0.08868153729500161
 0.6388421015088749

The difference in these results is more pronounced in another numerical example.
Why are the results different for mode=1(iterate_CG1) and mode=2(iterate_CG2)?

Version information

julia> versioninfo()
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_ERROR_COLOR = red

When you do var.pk = var.pkp you copy a reference to the vector, so var.pk and var.pkp become the same array; if you later write, for instance, var.pkp .= [0, 0], then var.pk will also change, because it shares its content with var.pkp.
This behaviour does not happen if you copy only the individual entries of the vector with @. var.pk = var.pkp: later assignments to elements of var.pkp will not affect var.pk. Note that your initialization pk = rk creates the same kind of aliasing, so in iterate_CG2 the in-place copy @. var.pk = var.pkp also overwrites the residual var.rk.
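A minimal sketch of the difference between rebinding and broadcast assignment (the variable names a, b, c are made up for illustration):

```julia
a = [1.0, 2.0]
b = a                # plain assignment rebinds: b and a are the same array
b .= [0.0, 0.0]      # writing through b also changes a
@assert a == [0.0, 0.0]
@assert a === b      # identical objects

a = [1.0, 2.0]
c = similar(a)
@. c = a             # broadcast assignment copies element by element
c .= [5.0, 5.0]      # a is unaffected
@assert a == [1.0, 2.0]
@assert a !== c      # distinct arrays
```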


Unrelated to your question about broadcasting, but you may consider unpacking your fields to get cleaner code, for example changing

@. var.pkp = var.rk + var.β * var.pk

to

@. pkp = rk + β * pk
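In full, an unpacked iteration could look like the sketch below (the struct definitions are copied from the question; iterate_CG_unpacked! is a hypothetical name, and the rebinding behaviour of iterate_CG1 is kept):

```julia
using LinearAlgebra  # for dot

struct EqVariable
  A::Matrix{Float64}
  b::Vector{Float64}
end
mutable struct CgVariable
  eqvar::EqVariable
  xk::Vector{Float64}
  rk::Vector{Float64}
  pkp::Vector{Float64}
  α::Float64
  β::Float64
  pk::Vector{Float64}
end

@inline function iterate_CG_unpacked!(var::CgVariable)
  # unpack once; the locals refer to the same underlying arrays as the fields
  A = var.eqvar.A
  xk, rk, pk, pkp = var.xk, var.rk, var.pk, var.pkp
  c = A * pk
  sq_rk = dot(rk, rk)
  α = sq_rk / dot(pk, c)
  @. xk += α * pk
  @. rk -= α * c
  β = dot(rk, rk) / sq_rk
  @. pkp = rk + β * pk
  var.α, var.β = α, β
  var.pk = pkp            # rebind the field, as in iterate_CG1
  return var
end
```

Using copy(rk) for the initial search direction (instead of the aliased pk = rk) avoids the overwriting problem discussed above.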

I guess you already know about Conjugate Gradients · IterativeSolvers.jl, but it could be useful to have a look at that reference implementation.
