 # Result diverges on 129th iteration?

My goal is to make the implementation of my `qlog` function specialized for `MultiVector` types.

In the Grassmann.jl package, I have the generic `Grassmann.qlog` implementation, which has a lot of unnecessary allocations made due to the generality of the algorithm.

For the `expm1,cosh,sinh` methods I was able to create an improved `MultiVector` version of each algorithm to reduce the number of allocations needed. However, for `qlog` I am running into strange errors on my test examples, where the result diverges on the 129th iteration.

To evaluate the function definition for `qlog_fast` the following imports are needed

``````using Grassmann; basis"3"
``````

In the following experiment, the 83rd iteration causes the slow and fast version to diverge

``````julia> for k ∈ 125:133
println("\$k, slow: ",Grassmann.qlog(1+v12,k))
println("\$k, fast: ",Grassmann.qlog_fast(1+v12,k))
end
125, slow: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
125, fast: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
126, slow: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
126, fast: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
127, slow: -8.844033711796466e16 - 2.884872248974928e16v₁₂
127, fast: -8.844033711796466e16 - 2.884872248974928e16v₁₂
128, slow: -8.844033711796466e16 - 2.884872248974928e16v₁₂
128, fast: -8.844033711796466e16 - 2.884872248974928e16v₁₂
129, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
129, fast: 5.680961621833102e16 - 1.7409867582604496e17v₁₂
130, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
130, fast: 5.680961621833102e16 - 1.7409867582604496e17v₁₂
131, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
131, fast: 3.428056483688667e17 + 1.1189735632449072e17v₁₂
132, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
132, fast: 3.428056483688667e17 + 1.1189735632449072e17v₁₂
133, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
133, fast: -2.2045371266043258e17 + 6.7515671735379e17v₁₂
``````

The slower version gives the correct results at the end, which is defined like this

``````function qlog(w::T,x::Int=10000) where T<:TensorAlgebra{V} where V
w2,f = w^2,frobenius(w)
prod = w*w2
S,term = w,prod/3
norms = MVector(f,frobenius(term),f)
k::Int = 5
@inbounds while (norms<norms || norms>1) && k ≤ x
S += term
ns = frobenius(S)
@inbounds ns ≈ norms && break
prod *= w2
term = prod/k
@inbounds norms .= (norms,frobenius(term),ns)
k += 2
end
return 2S
end # http://www.netlib.org/cephes/qlibdoc.html#qlog
``````

However, in the specialized version for `MultiVector` inputs, the result diverges on 129th iteration

``````@eval function qlog_fast(b::MultiVector{T,V,E},x::Int=10000) where {T,V,E}
\$(insert_expr((:N,:t,:out,:bs,:bn),:mvec,:T,Float64)...)
f = frobenius(b)
w2::MultiVector{T,V,E} = b^2
B = value(w2)
S = zeros(mvec(N,t))
prod = zeros(mvec(N,t))
term = zeros(mvec(N,t))
S .= value(b)
out .= value(b*w2)
term .= out/3
norms = MVector(f,norm(term),f)
k::Int = 5
@inbounds while (norms<norms || norms>1) && k ≤ x
S += term
ns = norm(S)
@inbounds ns ≈ norms && break
prod .= out
out .= 0
# prod *= w2
for g ∈ 1:N+1
Y = indexbasis(N,g-1)
@inbounds for i ∈ 1:bn[g]
@inbounds val = B[bs[g]+i]
val≠0 && for G ∈ 1:N+1
@inbounds R = bs[G]
X = indexbasis(N,G-1)
@inbounds for j ∈ 1:bn[G]
end
end
end
end
term .= out/k
@inbounds norms .= (norms,norm(term),ns)
k += 2
end
S *= 2
return MultiVector{t,V}(S)
end
``````

After checking over the algorithm many times, I am unable to find what causes the issue.

The following test shows that slow version of `qlog` gives the correct result for `log` when compared with regular `Complex` value, note that this generalizes to full `MultiVector` of any dimension and so is slower than a native implementation of `Complex` values, since it handles a bigger number type.

``````julia> log(1+v12)
0.3465735909819527 + 0.7853981672307662v₁₂

julia> log(1+im)
0.34657359027997264 + 0.7853981633974483im
``````

The `qlog_fast` version is supposed to help address the issue of performance for `MultiVector` input.

The two algorithms should be performing the EXACT same sequence of calculation, the `qlog_fast` version is simply a refactored version of the original, with explicit allocations to minimize time.

Is it possible that I am overwriting some array values in a way that causes an issue after 129 iterations?

The fast version of the algorithm should be able to fully work correctly, reproducing the exact same calculation steps as the slow version and converging to the same result, as with `expm1,cosh,sinh`.

Added some `@show` statements to follow the values in the loop. In the slow version, `term` vanishes in the last iteration but in the fast version, it doesn’t. Did you notice that?

`term` is supposed to vanish, since the series is supposed to converge and the terms approaches 0