Gathering monomial coefficients of orthogonal polynomials using SpecialPolynomials.jl

Sorry, I should have been more precise.
By "more trouble" I mean that for complex coefficients, computing the eigenvalues of the companion matrix directly shows larger differences w.r.t. calling roots() than it does for real coefficients.

By back substitution I mean that if I have a vector X of polynomial coefficients expressed in terms of orthogonal polynomials, and L is the matrix linking orthogonal polynomials to the canonical monomials, then instead of computing a comrade matrix, I compute the companion matrix of L*X.
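In code, the idea is roughly the following (a small Chebyshev sketch; the benchmark code below builds L the same way via basis_pond):

using Polynomials, SpecialPolynomials, LinearAlgebra

X = [1.0, 2.0, 3.0]             # coefficients in the Chebyshev basis
L = zeros(3, 3)                 # column r = monomial coefficients of T_{r-1}
for r in 1:3
    L[1:r, r] = coeffs(convert(Polynomial, basis(Chebyshev, r - 1)))
end
rts = roots(Polynomial(L * X))  # companion matrix of L*X instead of a comrade matrix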

I guess I opened mine at the wrong repo… Support block companion / comrade matrices? · Issue #8 · jverzani/AMRVW.jl · GitHub
I will link both issues so as to not lose info.

How do you know which one is more accurate? Maybe the “comrade” matrix roots are more accurate than those of roots?

In particular, you can simply evaluate the polynomial (in the original orthogonal-polynomial basis, using a Clenshaw recurrence) at the (approximate) roots and see which approach gives a smaller polynomial value.
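For the Chebyshev basis, for example, a minimal Clenshaw evaluation could look like this (a sketch; the helper name is made up):

function clenshaw_cheb(c::AbstractVector, x)
    # evaluate c[1]*T_0(x) + c[2]*T_1(x) + ... by the backward recurrence
    # b_k = 2x*b_{k+1} - b_{k+2} + c_k
    b1 = b2 = zero(promote_type(eltype(c), typeof(x)))
    for k in lastindex(c):-1:2
        b1, b2 = muladd(2x, b1, c[k] - b2), b1
    end
    muladd(x, b1, c[1] - b2)    # p(x) = x*b_1 - b_2 + c_0
end

# residual at an approximate root z: abs(clenshaw_cheb(c, z))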

That's surely one way to see it :slight_smile: . I also checked using the roots provided by AMRVW.jl, but indeed I will test to see which evaluation is closer to zero.

The Aurentz et al. algorithm is implemented in comrade matrices close #23 by jverzani · Pull Request #24 · jverzani/SpecialPolynomials.jl · GitHub, and the roots method will use this for monic families.

I don't currently have access to the other two references.

Given that they should not be under any embargo by now, I will try to make them available on a public archive tomorrow.

@j_verzani @stevengj I've done some benchmarks on real and complex scalar polynomials using

  • roots from Polynomials
  • roots from AMRVW
  • generalized companion matrix computation (i.e., for non-monic polynomials)
  • generalized comrade matrix computation (for polynomials expressed in terms of orthogonal polynomials)

I tested it for polynomials of order 10 and 50 for the moment. Also, as advised by @stevengj, I evaluated the polynomial at the predicted roots to see the residuals.

Depending on the use case, there seem to be large variations in result quality, but I am at the very edge of my understanding of this. Some feedback would be greatly appreciated.
@j_verzani the articles are posted on a public archive, but are under review; I will send you the link ASAP.

The code used for the benchmark is the following:

using Polynomials, SpecialPolynomials, LinearAlgebra, AMRVW, Plots
plotlyjs()

function basis_pond(p::Int64, OP)
    # Change-of-basis matrix: column r holds the monomial coefficients of
    # the degree r-1 basis polynomial of the family OP.
    L = zeros(Float64, p, p)
    for r = 1:p
        P = basis(OP, r-1)
        Q = convert(Polynomial, P)
        L[1:r,r] .= coeffs(Q)
    end
    return UpperTriangular(L)
end

function iblock!(matrix, rowiter, coliter)
    # Place ones at the (rowiter[k], coliter[k]) positions (identity blocks).
    @inbounds for k in eachindex(rowiter)
        matrix[rowiter[k], coliter[k]] = one(eltype(matrix))
    end
    return matrix
end

function generalized_companion(coeffs::Vector{T}) where T <: Number
    # Pencil (A, B) whose generalized eigenvalues are the roots of the
    # (possibly non-monic) polynomial with monomial coefficients `coeffs`.
    p = length(coeffs)
    B = Diagonal([ones(T,p-2); coeffs[end]])
    A = zeros(T,p-1,p-1)
    A[1,end] = -coeffs[1]
    @inbounds for r = 2:p-1
        A[r,r-1] = 1
        A[r,end] = -coeffs[r]
    end
    return UpperHessenberg(A), B
end
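
# Quick sanity check (illustrative, not part of the benchmark): for
# p(x) = 2 - 3x + x^2 = (x-1)(x-2) the pencil eigenvalues should be 1 and 2.
A_chk, B_chk = generalized_companion([2.0, -3.0, 1.0])
sort(real.(eigvals(Matrix(A_chk), Matrix(B_chk))))   # ≈ [1.0, 2.0]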

function generalized_companion(coeffs::Vector{Matrix{T}}) where T <: Number
    n,m = size(coeffs[1])
    @assert n == m "Matrix polynomial coefficients should be square matrices"

    p = length(coeffs)

    B = zeros(T, n*(p-1), n*(p-1))
    B[diagind(B)[1:end-n]] .= 1
    B[end-n+1:end, end-n+1:end] .= coeffs[end]

    A = zero(B)                       # similar(B) would leave A uninitialized
    A[diagind(A,-n)] .= 1
    @inbounds for r = 0:p-2
        A[(r*n+1):(r+1)*n, end-n+1:end] .= -coeffs[r+1]
    end
    return A, B
end

function generalized_companion(coeffs::Matrix{T}) where T <: Number
    # Same pencil, with the matrix coefficients stacked vertically in an
    # (ni*p) × ni matrix.
    pni,ni = size(coeffs)
    p = pni/ni
    @assert isinteger(p) "Number of rows of the coefficient matrix should be a multiple of the number of columns"
    p = Int(p)
    B = zeros(T, ni*(p-1), ni*(p-1))
    B[diagind(B)[1:end-ni]] .= 1
    B[end-ni+1:end, end-ni+1:end] .= coeffs[end-ni+1:end,:]

    A = zero(B)                       # similar(B) would leave A uninitialized
    A[diagind(A,-ni)] .= 1
    A[:, end-ni+1:end] = -coeffs[1:end-ni,:]
    return A, B
end
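
# Illustrative check of the stacked form: with ni = 1 the same scalar
# polynomial (x-1)(x-2) is encoded as a 3×1 coefficient matrix.
A_chk, B_chk = generalized_companion(reshape([2.0, -3.0, 1.0], 3, 1))
sort(real.(eigvals(A_chk, B_chk)))                   # ≈ [1.0, 2.0]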

function generalized_comrade(coeffs::Vector{T}, OP) where T <: Union{Float64,ComplexF64}
    # Pencil (A, B) built from the three-term recurrence coefficients
    # (an, bn, cn) of the family OP, for a polynomial given in that basis;
    # its generalized eigenvalues are the roots.
    p = length(coeffs)
    an_p = SpecialPolynomials.an.(OP, 0:p-2)
    bn_p = SpecialPolynomials.bn.(OP, 0:p-1)
    cn_p = SpecialPolynomials.cn.(OP, 1:p-1)
    kn_p = SpecialPolynomials.leading_term.(OP, [p-1,p])

    B = Diagonal([ones(T,p-2); kn_p[2]*coeffs[end]])
    A = zeros(T,p-1,p-1)

    A[1,1] = bn_p[1]
    if p > 2
        A[2,1] = an_p[1]
    end
    @inbounds for r in 2:p-2
        A[r-1,r] = cn_p[r-1]
        A[r  ,r] = bn_p[r]
        A[r+1,r] = an_p[r]
    end
    @views A[:,end] .= .-kn_p[1] .* coeffs[1:end-1]
    if p > 2
        A[end-1,end] += kn_p[2] * cn_p[end] * coeffs[end]
    end
    A[end,end] += kn_p[2] * bn_p[end] * coeffs[end]
    return UpperHessenberg(A), B
end
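
# Illustrative cross-check of the comrade pencil (assuming the construction
# above is right): its eigenvalues should match the roots obtained after
# converting to the monomial basis.
X_chk = rand(5)
A_chk, B_chk = generalized_comrade(X_chk, Chebyshev)
bykey = z -> (real(z), imag(z))
λ_chk = sort(eigvals(Matrix(A_chk), Matrix(B_chk)), by=bykey)
μ_chk = sort(roots(convert(Polynomial, Chebyshev(X_chk))), by=bykey)
maximum(abs, λ_chk .- μ_chk)   # should be small if both agree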

function generalized_comrade(coeffs::Vector{Matrix{T}}, OP) where T <: Number
    n,m = size(coeffs[1])
    @assert n == m "Matrix polynomial coefficients should be square matrices"
    p = length(coeffs)
    an_p = SpecialPolynomials.an.(OP, 0:p-2)
    bn_p = SpecialPolynomials.bn.(OP, 0:p-1)
    cn_p = SpecialPolynomials.cn.(OP, 1:p-1)
    kn_p = SpecialPolynomials.leading_term.(OP, [p-1,p])

    # Block analogue of the scalar pencil: each recurrence coefficient
    # multiplies an n×n identity block.
    B = zeros(T, n*(p-1), n*(p-1))
    B[diagind(B)[1:end-n]] .= 1
    B[end-n+1:end, end-n+1:end] .= kn_p[2] .* coeffs[end]

    A = zeros(T, n*(p-1), n*(p-1))
    A[diagind(A,-n)] .= repeat(an_p[1:end-1], inner=n)
    A[diagind(A)[1:end-n]] .= repeat(bn_p[1:end-2], inner=n)
    A[diagind(A,n)[1:end-n]] .= repeat(cn_p[1:end-2], inner=n)

    for r = 0:p-2
        A[(r*n+1):(r+1)*n, end-n+1:end] .= .-kn_p[1] .* coeffs[r+1]
    end
    if p > 2
        A[end-2*n+1:end-n, end-n+1:end] .+= kn_p[2] .* cn_p[end] .* coeffs[end]
    end
    A[end-n+1:end, end-n+1:end] .+= kn_p[2] .* bn_p[end] .* coeffs[end]
    return A, B
end

function generalized_comrade(coeffs::Matrix{T}, OP) where T <: Number
    # Same comrade pencil, with the matrix coefficients stacked vertically
    # in an (ni*p) × ni matrix.
    ni = size(coeffs,2)
    p = size(coeffs,1)/ni
    # Ensure the coefficient matrix is of an acceptable shape
    if !isinteger(p)
        error("Coefficient matrix must be of dimensions ni*p × ni")
    else
        p = Int(p)
    end
    an_p = SpecialPolynomials.an.(OP, 0:p-2)
    bn_p = SpecialPolynomials.bn.(OP, 0:p-1)
    cn_p = SpecialPolynomials.cn.(OP, 1:p-1)
    kn_p = SpecialPolynomials.leading_term.(OP, [p-1,p])

    Cn = @view coeffs[end-ni+1:end,:]

    B = zeros(T, ni*(p-1), ni*(p-1))
    iblock!(B, 1:(p-2)*ni, 1:(p-2)*ni)
    B[end-ni+1:end, end-ni+1:end] .= kn_p[2] .* Cn

    A = zero(B)                      # similar(B) would leave A uninitialized
    imat = Matrix{T}(I, ni, ni)
    A[1:ni,1:ni] .= bn_p[1] .* imat
    if p > 2
        A[ni+1:2*ni, 1:ni] .= an_p[1] .* imat
    end
    shift = 1:ni
    for r in 2:p-2
        A[ni*(r-2).+shift, ni*(r-1).+shift] .= cn_p[r-1] .* imat
        A[ni*(r-1).+shift, ni*(r-1).+shift] .= bn_p[r] .* imat
        A[ni*(r  ).+shift, ni*(r-1).+shift] .= an_p[r] .* imat
    end
    @views A[:, end-ni+1:end] .= .-kn_p[1] .* coeffs[1:end-ni,:]
    if p > 2
        A[end-2*ni+1:end-ni, end-ni+1:end] .+= kn_p[2] .* cn_p[end] .* Cn
    end
    A[end-ni+1:end, end-ni+1:end] .+= kn_p[2] .* bn_p[end] .* Cn

    return A, B
end


# Loop over polynomial orders, coefficient types, and orthogonal polynomial families
for n in [10,50], t in [Float64,ComplexF64], OP in [Chebyshev,Legendre]
# Generate vector of random polynomial coefficients
X = rand(t,n)

# Case 1 : polynomial is canonical
P = Polynomial(X)
# Default root computation
roots_P = roots(P)
# Computation with generalized companion matrix
A,B = generalized_companion(P.coeffs)
roots_gencomp = eigvals!(Matrix(A),Matrix(B))
# Computation with AMRVW
roots_amrvw = AMRVW.roots(X)

plot(roots_P,lt=scatter,label="Polynomials",ms=5,title="Roots_order=$(n)_Canonical_coeffstype=$(eltype(X))")
plot!(roots_amrvw,lt=scatter,label="AMRVW",ms=4)
plot!(roots_gencomp,lt=scatter,label="GenCompanion",ms=3)
savefig("Roots_order=$(n)_Canonical_coeffstype=$(eltype(X)).png")

plot(P.(roots_P),lt=scatter,label="Polynomials",ms=5,title="PVals_order=$(n)_Canonical_coeffstype=$(eltype(X))")
plot!(P.(roots_amrvw),lt=scatter,label="AMRVW",ms=4)
plot!(P.(roots_gencomp),lt=scatter,label="GenCompanion",ms=3)
savefig("PVals_order=$(n)_Canonical_coeffstype=$(eltype(X)).png")

# Generate vector of random polynomial coefficients
X = rand(t,n)
# Case 2 : polynomial is orthogonal polynomial based
P = OP(X)
# Default root computation
roots_P = roots(P)
# Computation with companion matrix
bp = basis_pond(n,OP)
co = bp*P.coeffs
# Computation with generalized companion matrix
A,B = generalized_companion(co)
roots_gencomp = eigvals!(Matrix(A),Matrix(B))
# Computation with AMRVW
roots_amrvw = AMRVW.roots(co)
# Computation with generalized comrade matrix
A,B = generalized_comrade(P.coeffs,OP)
roots_gencomr = eigvals!(Matrix(A),Matrix(B))

plot(roots_P,lt=scatter,label="Polynomials",ms=6,title="Roots_order=$(n)_op=$(OP)_coeffstype=$(eltype(X))")
plot!(roots_amrvw,lt=scatter,label="AMRVW",ms=5)
plot!(roots_gencomp,lt=scatter,label="GenCompanion",ms=4)
plot!(roots_gencomr,lt=scatter,label="GenComrade",ms=3)
savefig("Roots_order=$(n)_op=$(OP)_coeffstype=$(eltype(X)).png")

plot(P.(roots_P),lt=scatter,label="Polynomials",ms=6,title="PVals_order=$(n)_op=$(OP)_coeffstype=$(eltype(X))")
plot!(P.(roots_amrvw),lt=scatter,label="AMRVW",ms=5)
plot!(P.(roots_gencomp),lt=scatter,label="GenCompanion",ms=4)
plot!(P.(roots_gencomr),lt=scatter,label="GenComrade",ms=3)
savefig("PVals_order=$(n)_op=$(OP)_coeffstype=$(eltype(X)).png")
end

And the results:

[Scatter plots of the computed roots, and of the polynomial values at those roots, for each order, basis, and coefficient type; images not reproduced here.]

I need to go over your examples more carefully, but this is what I would expect up front. Both Polynomials.roots and AMRVW.roots find the eigenvalues of a companion matrix, though by different means. The advantage of AMRVW only comes in when n gets quite large, as then the O(n^3) scaling of eigvals(companion_matrix(p)) takes over. As well, AMRVW.roots is backward stable, so for really large n it would be preferred. n = 50 is not large though, and at this point both seem to be equally accurate. As an alternative to finding the eigenvalues of the companion matrix, the PolynomialRoots package always showed up faster and more accurate (in terms of residual errors) in my tests, up to n ~ 200 or so.

For the comrade matrix, even at size n = 50 I would think that eigvals(companion_matrix(p)) would be faster than, and as accurate as, the method in the PR to SpecialPolynomials.jl, though at some value of n that would change for the same reasons as above. That algorithm is not proven backward stable in the reference, though, so for large n the solutions would need to be checked.

I did some quick tests, and get the following:

julia> DataFrame(case1.(T, ns))  # T = Float64; ns = [10, 25, 50]
3×6 DataFrame
 Row │ n      η              roots        proots       amrvw        gcomp
     │ Int64  Float64        Float64      Float64      Float64      Float64
─────┼──────────────────────────────────────────────────────────────────────────
   1 │    10    107.518      5.71565e-15  7.75204e-16  4.82674e-15  5.0533e-15
   2 │    25  11169.5        4.17834e-13  3.2743e-14   1.03356e-13  3.64146e-13
   3 │    50      1.05848e8  1.58139e-9   6.67526e-11  1.23539e-10  2.50662e-10

All roughly the same, though PolynomialRoots is a tad better (smaller maximum polynomial value over the identified roots).

julia> DataFrame(case2.(T, ns))
3×4 DataFrame
 Row │ n      η             conv         cmp
     │ Int64  Float64       Float64      Float64
─────┼───────────────────────────────────────────────
   1 │    10    57.6049     1.94289e-14  7.56062e-14
   2 │    25  2548.86       2.95504e-9   2.6897e-6
   3 │    50     4.78515e7  4.40678      0.226592

As anticipated in this thread, conversion from the orthogonal basis to the standard basis loses accuracy compared to the companion matrix approach.

julia> DataFrame(case3.(T, ns))
3×6 DataFrame
 Row │ n      η              avw          avw_✓        conv         cmp
     │ Int64  Float64        Float64      Float64      Float64      Float64
─────┼──────────────────────────────────────────────────────────────────────────
   1 │    10    160.907      2.93324e-11  4.00999e-17  9.55756e-10  1.57792e-10
   2 │    25  77960.7        2.45473e-13  4.4044e-15   2.74531e-12  9.89575e-12
   3 │    50      1.7372e10  3.4879e-14   2.98266e-14  1.06142e-9   2.04649e-8

The AVW algorithm in the SpecialPolynomials PR performs better here.

These runs are typical, but a fair amount of the time the random coefficients produce poorly behaved polynomials at the larger n values.

The above is based on the following slight modifications to your example:

import PolynomialRoots
using DataFrames
# should use a different norm...
function condition_number(p, z)
    n = degree(p)
    sqrt(n) * (1 + norm(z)^2)^((n-2)/2) * norm(coeffs(p), 1)/norm(derivative(p)(z))
end

## Case 1
function roots_generalized_companion(ps)
    A,B = generalized_companion(ps)
    eigvals!(Matrix(A),Matrix(B))
end

function case1(T,n)
    P = Polynomial
    ps = rand(T,n)
    p = P(ps);

    (n=n,
     η = maximum(z -> condition_number(p, z), roots(p)),
     roots=maximum(norm∘p, roots(p)),
     proots=maximum(norm∘p, PolynomialRoots.roots(ps)),
     amrvw = maximum(norm∘p, AMRVW.roots(ps)),
     gcomp=maximum(norm∘p, roots_generalized_companion(ps))
     )
end

function case2(T,n)
    OP = Legendre
    ps = rand(T, n);
    p = OP(ps); # roots(p) will fall back to roots(convert(Polynomial, p))
    bp = basis_pond(n,OP);
    co = bp*p.coeffs;

    (n=n,
     η = maximum(z -> condition_number(p, z), roots(p)),
     conv = maximum(norm∘p, roots(p)),
     cmp= maximum(norm∘p, roots_generalized_companion(co))
     )
end

function case3(T,n)
    OP = MonicLegendre
    ps = rand(T, n);
    p = OP(ps); # roots(p) using PR
    q = convert(Polynomial, p);

    bp = basis_pond(n,OP);
    co = bp*p.coeffs;

    (n=n,
     η = maximum(z -> condition_number(p, z), roots(p)),
     avw=maximum(norm∘p, roots(p)),
     avw_✓=maximum(SpecialPolynomials.a_posteriori_check(roots(p), p)),
     conv=maximum(norm∘p, roots(q)),
     cmp=maximum(norm∘p, roots_generalized_companion(co))
     )
end

Thanks for the tests. I wonder, though, what would be the best way to extend these to the matrix polynomial case… Extending roots seems out of the picture, and using comrade matrices seems quite sensitive as well.

Side note: for the matrix polynomial case, there is the NEP-PACK / NonlinearEigenproblems.jl package, which may serve to do this via its polyeig function.
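
A rough sketch of what that could look like (the PEP constructor and the polyeig return values here are from memory, so check them against the current NEP-PACK API):

using NonlinearEigenproblems
A0, A1, A2 = rand(4,4), rand(4,4), rand(4,4)
pep = PEP([A0, A1, A2])    # matrix polynomial A0 + λ*A1 + λ^2*A2
λ, V = polyeig(pep)        # eigenvalues and eigenvectors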

Another quick question if I may; this is something that I've long forgotten.
What would be a correct way to discard some of the eigenvalues computed from the companion matrix, based on the evaluation of the residual?
I tried to filter based on the norm of the residual normalized by the norm of the computed eigenvector, but that fails at high order.
Basically, what is an acceptable residual w.r.t. the order? There is surely some big O involved, but I can't remember it.

That's a good question. If the numeric root λ̃ were near a root λ of p, with λ̃ − λ = λϵ say, then a simple estimate would be |p(λ̃)| ≈ ϵ |λ| |p′(λ)|, with |p(λ̃)/p′(λ̃)| an estimate for the error. However, that isn't quite what you get. For eigenvalues of the companion matrix computed in an unstructured way, the λ̃ are the roots of a nearby polynomial p̃ with ‖p − p̃‖ ≲ c ‖p‖² (https://arxiv.org/pdf/1611.02435.pdf). But nearby polynomials need not have nearby roots. Table 8 of http://math.wsu.edu/faculty/watkins/pdfiles/AMVW15_SIMAX.pdf gives some examples of forward errors for a range of test polynomials.

So in some way there is no evident strategy to discard some numerical solutions, because we are not really finding the roots of the polynomial but rather the roots of a nearby one?

A check on p(lambda)/derivative(p)(lambda) is a good one. If it is small, that's expected; if large, a Newton step can be attempted to improve the estimate.
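
Something like the following sketch, say (the tolerance is an arbitrary illustration, not a recommendation):

using Polynomials

function newton_refine(p::Polynomial, zs; tol=1e-10, maxiter=5)
    dp = derivative(p)
    map(zs) do z
        for _ in 1:maxiter
            δ = p(z) / dp(z)    # Newton correction ≈ forward-error estimate
            abs(δ) ≤ tol && break
            z -= δ
        end
        z                       # keep or discard based on abs(p(z) / dp(z))
    end
end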