Result diverges on 129th iteration?

My goal is to make the implementation of my qlog function specialized for MultiVector types.

In the Grassmann.jl package, I have the generic Grassmann.qlog implementation, which has a lot of unnecessary allocations made due to the generality of the algorithm.

For the expm1,cosh,sinh methods I was able to create an improved MultiVector version of each algorithm to reduce the number of allocations needed. However, for qlog I am running into strange errors on my test examples, where the result diverges on the 129th iteration.

To evaluate the function definition for qlog_fast the following imports are needed

using Grassmann; basis"3"
import Grassmann: insert_expr,mvec,geomaddmulti!,indexbasis,binomsum_set,binomial_set

In the following experiment, the 83rd iteration causes the slow and fast version to diverge

julia> for k ∈ 125:133
           println("$k, slow: ",Grassmann.qlog(1+v12,k))
           println("$k, fast: ",Grassmann.qlog_fast(1+v12,k))
       end
125, slow: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
125, fast: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
126, slow: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
126, fast: -1.4653360823126444e16 + 4.493825380508893e16v₁₂
127, slow: -8.844033711796466e16 - 2.884872248974928e16v₁₂
127, fast: -8.844033711796466e16 - 2.884872248974928e16v₁₂
128, slow: -8.844033711796466e16 - 2.884872248974928e16v₁₂
128, fast: -8.844033711796466e16 - 2.884872248974928e16v₁₂
129, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
129, fast: 5.680961621833102e16 - 1.7409867582604496e17v₁₂
130, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
130, fast: 5.680961621833102e16 - 1.7409867582604496e17v₁₂
131, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
131, fast: 3.428056483688667e17 + 1.1189735632449072e17v₁₂
132, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
132, fast: 3.428056483688667e17 + 1.1189735632449072e17v₁₂
133, slow: -2.3369029045426035e17 - 1.7409867582604496e17v₁₂
133, fast: -2.2045371266043258e17 + 6.7515671735379e17v₁₂

The slower version gives the correct results at the end, which is defined like this

function qlog(w::T,x::Int=10000) where T<:TensorAlgebra{V} where V
    w2,f = w^2,frobenius(w)
    prod = w*w2
    S,term = w,prod/3
    norms = MVector(f,frobenius(term),f)
    k::Int = 5
    @inbounds while (norms[2]<norms[1] || norms[2]>1) && k ≀ x
        S += term
        ns = frobenius(S)
        @inbounds ns β‰ˆ norms[3] && break
        prod *= w2
        term = prod/k
        @inbounds norms .= (norms[2],frobenius(term),ns)
        k += 2
    end
    return 2S
end # http://www.netlib.org/cephes/qlibdoc.html#qlog

However, in the specialized version for MultiVector inputs, the result diverges on 129th iteration

@eval function qlog_fast(b::MultiVector{T,V,E},x::Int=10000) where {T,V,E}
    $(insert_expr((:N,:t,:out,:bs,:bn),:mvec,:T,Float64)...)
    f = frobenius(b)
    w2::MultiVector{T,V,E} = b^2
    B = value(w2)
    S = zeros(mvec(N,t))
    prod = zeros(mvec(N,t))
    term = zeros(mvec(N,t))
    S .= value(b)
    out .= value(b*w2)
    term .= out/3
    norms = MVector(f,norm(term),f)
    k::Int = 5
    @inbounds while (norms[2]<norms[1] || norms[2]>1) && k ≀ x
        S += term
        ns = norm(S)
        @inbounds ns β‰ˆ norms[3] && break
        prod .= out
        out .= 0
        # prod *= w2
        for g ∈ 1:N+1
            Y = indexbasis(N,g-1)
            @inbounds for i ∈ 1:bn[g]
                @inbounds val = B[bs[g]+i]
                valβ‰ 0 && for G ∈ 1:N+1
                    @inbounds R = bs[G]
                    X = indexbasis(N,G-1)
                    @inbounds for j ∈ 1:bn[G]
                        @inbounds geomaddmulti!(V,out,X[j],Y[i],*(prod[R+j],val))
                    end
                end
            end
        end
        term .= out/k
        @inbounds norms .= (norms[2],norm(term),ns)
        k += 2
    end
    S *= 2
    return MultiVector{t,V}(S)
end

After checking over the algorithm many times, I am unable to find what causes the issue.

The following test shows that slow version of qlog gives the correct result for log when compared with regular Complex value, note that this generalizes to full MultiVector of any dimension and so is slower than a native implementation of Complex values, since it handles a bigger number type.

julia> log(1+v12)
0.3465735909819527 + 0.7853981672307662v₁₂

julia> log(1+im)
0.34657359027997264 + 0.7853981633974483im

The qlog_fast version is supposed to help address the issue of performance for MultiVector input.

The two algorithms should be performing the EXACT same sequence of calculation, the qlog_fast version is simply a refactored version of the original, with explicit allocations to minimize time.

Is it possible that I am overwriting some array values in a way that causes an issue after 129 iterations?

The fast version of the algorithm should be able to fully work correctly, reproducing the exact same calculation steps as the slow version and converging to the same result, as with expm1,cosh,sinh.

Added some @show statements to follow the values in the loop. In the slow version, term vanishes in the last iteration but in the fast version, it doesn’t. Did you notice that?

(Also added a message to your gitter group for the package.)

Could you explain your observation in more detail?

term is supposed to vanish, since the series is supposed to converge and the terms approaches 0

in the slow version, the convergence should be correct, while in the fast it is not

Yes I can explain in more detail. Want to move it to gitter? That’s easier for pair debugging.

1 Like