Large execution time jumps

Hello everyone,

I am working on building a computer algebra library in Julia and have encountered some strange behavior in how execution time grows when I multiply polynomials. Essentially, the code runs in a nearly constant amount of time as n increases, up to a point where the time jumps dramatically. Here is a plot of the situation.

In this experiment I kept the number of variables in both polynomials fixed at 5 and increased the number of terms.

This is my recursive representation of polynomials.

mutable struct RecursiveDensePolynomial
    variable::String
    coeffs::Vector{Any}  # allow both numbers and other polynomials
end
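For example, the bivariate polynomial 2 + x*y is stored as a polynomial in x whose degree-1 coefficient is itself a polynomial in y:

```julia
# (struct repeated here so the snippet runs standalone)
mutable struct RecursiveDensePolynomial
    variable::String
    coeffs::Vector{Any}  # allow both numbers and other polynomials
end

# 2 + x*y: coeffs[1] is the constant term, coeffs[2] the coefficient of x^1,
# which is the inner polynomial 0 + 1*y in the variable y.
p = RecursiveDensePolynomial("x",
    Any[2, RecursiveDensePolynomial("y", Any[0, 1])])
```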

And this is what I have implemented for multiplying these polynomials.

function fft!(poly::Vector{<:Any}, omegas::Vector{UInt64}, p::UInt64)
    n = length(poly)

    bit_reverse_permutation!(poly)

    len = 2
    while len <= n
        half_len = div(len, 2)
        omega_step = div(n, len) # How far to step in the precomputed omegas table

        k = 1
        while k <= n

            for j in k:(k+half_len-1)
                s = poly[j]
                # The twiddle factor, w = omega^(j-k)
                w = omegas[(j-k)*omega_step+1]
                t = mulmod(w, poly[j+half_len], p)

                poly[j] = addmod(s, t, p)
                poly[j+half_len] = submod(s, t, p)
            end
            k += len
        end
        len *= 2
    end
end
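For reference, the modular helpers and the bit-reversal permutation it relies on are standard; a minimal sketch (assuming UInt64 values and a prime below 2^63, which keeps addmod overflow-free, and not necessarily identical to my actual implementation) looks like:

```julia
# Minimal sketches of the modular helpers the FFT assumes.
addmod(a::UInt64, b::UInt64, p::UInt64) = (a + b) % p       # safe while p < 2^63
submod(a::UInt64, b::UInt64, p::UInt64) = (a + p - b) % p   # avoid underflow
mulmod(a::UInt64, b::UInt64, p::UInt64) = UInt64((UInt128(a) * b) % p)  # widen to avoid overflow

# In-place bit-reversal permutation for a power-of-two length vector.
function bit_reverse_permutation!(v::Vector)
    n = length(v)
    bits = trailing_zeros(n)  # n is assumed to be a power of two
    for i in 0:n-1
        j = Int(bitreverse(UInt64(i)) >> (64 - bits))
        if i < j
            v[i+1], v[j+1] = v[j+1], v[i+1]
        end
    end
    return v
end
```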

function FFT_multiplication(a::RecursiveDensePolynomial, b::RecursiveDensePolynomial)
    n = next_power_of_two(length(a.coeffs) + length(b.coeffs) - 1)
    n_inv = powermod(n, PRIME - 2, PRIME)

    omega = nth_root_of_unity(n, PRIME)
    omega_inv = powermod(omega, PRIME - 2, PRIME)

    # precompute omegas for both forward and inverse transforms
    omegas = [powermod(omega, i, PRIME) for i in 0:n-1]
    omegas_inv = [powermod(omega_inv, i, PRIME) for i in 0:n-1]

    # Pad coefficients to size n
    a_coeffs = vcat(a.coeffs, zeros(UInt64, n - length(a.coeffs)))
    b_coeffs = vcat(b.coeffs, zeros(UInt64, n - length(b.coeffs)))

    fft!(a_coeffs, omegas, PRIME)
    fft!(b_coeffs, omegas, PRIME)

    for i in 1:n
        @inbounds a_coeffs[i] = mulmod(a_coeffs[i], b_coeffs[i], PRIME)
    end
    result_coeffs = a_coeffs # Use a_coeffs as the result vector

    # inverse fft
    fft!(result_coeffs, omegas_inv, PRIME)

    for i in 1:n
        @inbounds result_coeffs[i] = mulmod(result_coeffs[i], n_inv, PRIME)
    end

    last_nonzero = findlast(x -> x != 0, result_coeffs)
    if isnothing(last_nonzero)
        return RecursiveDensePolynomial(a.variable, [0])
    end
    resize!(result_coeffs, last_nonzero)

    return RecursiveDensePolynomial(a.variable, result_coeffs)
end
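For reference, the remaining pieces the function assumes (PRIME, nth_root_of_unity, next_power_of_two) can be sketched like this; the specific prime and generator here are just an illustration, any NTT-friendly prime of the form c·2^k + 1 with a known primitive root works:

```julia
# An NTT-friendly prime: 469762049 = 7 * 2^26 + 1, primitive root 3.
# (Chosen for the sketch; not necessarily the prime used above.)
const PRIME = UInt64(469762049)
const PRIMITIVE_ROOT = UInt64(3)

# Smallest power of two >= x.
next_power_of_two(x::Integer) = nextpow(2, x)

# omega = g^((p-1)/n) has multiplicative order exactly n when n divides p - 1
# and g is a primitive root mod p.
function nth_root_of_unity(n::Integer, p::UInt64)
    @assert (p - 1) % n == 0 "n must divide p - 1"
    return powermod(PRIMITIVE_ROOT, div(p - 1, n), p)
end
```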

I’m curious if anyone knows what causes large jumps in execution time like this and whether there is any way of preventing them.

The coeffs field is a vector with an abstract element type. The compiler can’t optimize any use of it, so every multiplication, addition, etc. goes through dynamic dispatch and boxes its results, which incurs memory allocations; once there are enough of them, garbage collection kicks in.
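You can measure the effect directly: summing through a Vector{Any} boxes every intermediate result and allocates per element, while the same loop over a Vector{Int} allocates essentially nothing (a quick sketch):

```julia
const xs_any = Any[i for i in 1:10_000]  # abstractly typed container
const xs_int = Int[i for i in 1:10_000]  # concretely typed container

function sum_loop(v)
    s = 0
    for x in v
        s += x
    end
    return s
end

sum_loop(xs_any); sum_loop(xs_int)  # warm up so compilation isn't measured

a_any = @allocated sum_loop(xs_any)  # allocates per iteration (boxing)
a_int = @allocated sum_loop(xs_int)  # (near) zero allocations
```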

There’s a section in Performance Tips · The Julia Language about avoiding containers (e.g. vectors) with abstract element types. That said, I see no easy way out without restructuring the algorithm, since it depends on coefficients being either integers or polynomials.

Perhaps letting the coeffs field be a vector of polynomials and using constant polynomials instead of numbers will do it?
Like:

mutable struct RecursiveDensePolynomial
    variable::String
    coeffs::Vector{RecursiveDensePolynomial}
    isconst::Bool
    constvalue::Int
end

And check for isconst and use constvalue to terminate the recursion.
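A rough sketch of how the terminating case could look with that layout (the constructor helpers and the evaluate function are hypothetical names, just for illustration):

```julia
mutable struct RecursiveDensePolynomial
    variable::String
    coeffs::Vector{RecursiveDensePolynomial}  # concretely typed now
    isconst::Bool
    constvalue::Int
end

# Hypothetical convenience constructors for the sketch.
constpoly(v::Int) = RecursiveDensePolynomial("", RecursiveDensePolynomial[], true, v)
poly(var::String, coeffs::Vector{RecursiveDensePolynomial}) =
    RecursiveDensePolynomial(var, coeffs, false, 0)

# Recursion bottoms out on the isconst flag instead of on ::Int leaves.
function evaluate(p::RecursiveDensePolynomial, env::Dict{String,Int})
    p.isconst && return p.constvalue
    x = env[p.variable]
    s = 0
    for (i, c) in enumerate(p.coeffs)
        s += evaluate(c, env) * x^(i - 1)
    end
    return s
end
```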


Alternatively use LightSumTypes.jl to define a combined type for numbers and polynomials and use that to type the vector.
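Roughly like this (from memory of the README, so double-check the exact API): @sumtype wraps the variants in a single concrete type, so a vector of it is concretely typed, and you unwrap with variant. The toy variant types below are placeholders; the real case would hold integers and polynomials:

```julia
using LightSumTypes

struct Leaf
    value::Int
end
struct Node
    children::Vector{Int}  # stand-in; the real case would hold polynomials
end

# One concrete type covering both variants; Vector{Coef} is then
# concretely typed, unlike Vector{Any}.
@sumtype Coef(Leaf, Node)

c = Coef(Leaf(42))
v = variant(c)  # unwraps to the underlying Leaf
```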
