Memory-free multivariate polynomials

I was looking for a package that allows higher-order coordinate transforms. The idea would be an interface somewhat similar to Base.evalpoly, but capable of handling multiple dimensions. The use case is non-linear deformations (warps) with a finite number of parameters.

Maybe I am missing something, but it seems like the ecosystem around MultivariatePolynomials.jl is not ideal for this use case, since being free of per-call allocations is absolutely essential for this task.

For this reason we ended up writing something based on a recursive dispatch structure. Is it worth releasing EvalMultiPoly.jl? I hope it is useful.

FastChebInterp.jl supports this (expressing multivariate polynomials in a Chebyshev basis, which is much better than the monomial basis if the degree is not small). Its polynomial evaluation is allocation-free.
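A minimal sketch of how this might look with FastChebInterp.jl (the function `f`, the grid order, and the domain here are made up for illustration):

```julia
using FastChebInterp

f(x) = sin(x[1] + 2cos(x[2]))        # some smooth function of two variables
lb, ub = [0.0, 0.0], [1.0, 1.0]      # lower/upper corners of the domain
pts = chebpoints((16, 16), lb, ub)   # tensor grid of Chebyshev points
p = chebinterp(f.(pts), lb, ub)      # construction allocates, but only once
y = p([0.3, 0.7])                    # evaluation itself is allocation-free
```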

Or did you want the polynomial construction to be allocation free too? In the multivariate case that corresponds to a quite small degree.
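To make the "quite small degree" point concrete: the number of monomials of total degree ≤ d in n variables is binomial(n + d, d), which grows quickly, so keeping the whole construction stack-allocated is only cheap for small degrees. A quick check:

```julia
# Count of monomials of total degree ≤ d in n variables.
nmono(n, d) = binomial(n + d, d)

nmono(2, 2)    # 6 — e.g. 1, x₁, x₂, x₁x₂, x₁², x₂²
nmono(3, 10)   # 286 — already far too many coefficients for a stack tuple
```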


Couldn’t you also call evalpoly with StaticArrays.jl coefficients, recursively?
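A minimal sketch of that idea: write p(x, y) = Σᵢ (Σⱼ cᵢⱼ yʲ) xⁱ, i.e. an evalpoly in x whose coefficients are themselves evalpolys in y (the coefficient values here are made up):

```julia
using StaticArrays

# Nested coefficients: coeffs[i+1][j+1] is the coefficient of x^i * y^j.
coeffs = SA[SA[1.0, 2.0], SA[3.0, 4.0]]   # p(x, y) = 1 + 2y + 3x + 4xy

# Inner evalpolys in y produce the coefficients of the outer evalpoly in x;
# with SVectors everything stays on the stack, so there are no heap allocations.
p(x, y, coeffs) = evalpoly(x, map(c -> evalpoly(y, c), coeffs))

p(2.0, 3.0, coeffs)   # 1 + 2*3 + (3 + 4*3)*2 = 37.0
```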

Thanks for this useful comment. I have zero experience with Chebyshev polynomials but recently got alerted to their role in sampling at the edge of band-limited signals.
I will try to use them for the interpolation task mentioned in the last lines of the README to see how they perform. Yes, it is the evaluation that needs to be allocation-free, since it is applied to millions of values via broadcasting.

I don’t quite understand how you would do this. Probably I am missing the math/idea of how to express a multivariate polynomial as a recursive application of univariate polynomials. Even if this works, what would be the advantage? Less combinatorial explosion?

AFAIK the TypedPolynomials.jl implementation of MultivariatePolynomials.jl should satisfy this, no?

Probably I just don't understand how to use it properly. Here is my attempt:

using TypedPolynomials
@polyvar x[1:2] # assign x to a tuple of variables x1, x2
sz = (100, 100)
cids = CartesianIndices(sz)

coeff = [1f0,2f0,3f0,4f0,5f0,6f0]
ap(cs) = cs[1] + cs[2]*x[1] + cs[3]*x[2] + cs[4]*x[1]*x[2] + cs[5]*x[1]^2 + cs[6]*x[2]^2
apply(ci, cs) = ap(cs)(x[1]=>ci[1], x[2]=>ci[2])
applied = apply.(cids, Ref(coeff));
@time applied .= apply.(cids, Ref(coeff));
#  0.026198 seconds (270.00 k allocations: 15.717 MiB)

How can I automatically generate the expression for ap(cs) rather than typing it by hand?
And also: how can I avoid the allocations, which happen at the last line? Since this is an in-place assignment, it should be (essentially) allocation-free, which is not the case.

… looking at the package, I did not see where I could generate such a polynomial and use it for the coordinate transform I have in mind.
Could you maybe give a quick example of how you would apply a Chebyshev polynomial to a CartesianIndices object, resulting in transformed coordinates?

The way I understand that package, it rather uses such polynomials to interpolate data (e.g. instead of linear or B-spline interpolation), and not for describing a (non-linear, morphing) coordinate transform?

@nsajko any comments? Is the above correct, or did I misunderstand how to use the package?

The first thing to notice in your example is that you’re using x, a global variable, in your functions, which obviously harms performance. See the Performance tips page in the Manual: Performance Tips · The Julia Language
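For comparison, a sketch of the same six-coefficient quadratic evaluated purely numerically, passing the coordinates in as arguments instead of substituting into a symbolic polynomial; this sidesteps both the non-const global and the per-call allocations:

```julia
# Same monomial structure as the ap(cs) above, but as a plain numeric function.
ap(cs, x1, x2) = cs[1] + cs[2]*x1 + cs[3]*x2 + cs[4]*x1*x2 + cs[5]*x1^2 + cs[6]*x2^2
apply(ci, cs) = ap(cs, ci[1], ci[2])

coeff = (1f0, 2f0, 3f0, 4f0, 5f0, 6f0)   # a Tuple keeps the coefficients off the heap
apply(CartesianIndex(2, 3), coeff)       # 1 + 4 + 9 + 24 + 20 + 54 = 112.0f0
```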

Does this help?

julia> using TypedPolynomials

julia> function f(a, b)
           mapreduce(*, +, a, b)
       end
f (generic function with 1 method)

julia> @polyvar x[1:2]
(x₁, x₂)

julia> coefs = map(Float64, 1:6)
1.0:1.0:6.0

julia> monoms = (true, x[1], x[2], x[1]*x[2], x[1]^2, x[2]^2)
(true, x₁, x₂, x₁x₂, x₁², x₂²)

julia> f(coefs, monoms)
1.0 + 3.0x₂ + 2.0x₁ + 6.0x₂² + 4.0x₁x₂ + 5.0x₁²

Instead of mapreduce you could also try using polynomial, in case it suits you better.

SpectralKit.jl supports allocation-free evaluation of polynomials on Smolyak (sparse) bases. A special case of that is the full tensor basis, but maybe the code is not as fast for that as a specialized implementation (I did not check).

Currently it has coordinate-wise transformations to various domains, but that could be generalized, open an issue if you need it.


Thanks for the code. My point is that the package, with the function evalmultipoly, generates the “monoms = …” expression under the hood, whereas in the code you show, you wrote it out explicitly.

Thanks for this comment. Did you try using these polynomials for transforming CartesianIndices? It would be nice to see what such transforms look like when applied to an image of a grid, and whether they are indeed allocation-free when used in this way.

Sorry, I am not sure I understand the question. You mean evaluate the polynomials on NTuple{N,Int}s? (which CartesianIndices can be converted to).

I imagine that just works using automatic promotion rules, but if not and you need it just open an issue.


Yes, I guess we mean the same. Here is the example code snippet to do this with EvalMultiPoly.jl, which is geared towards handling exactly this case:

julia> f = get_multi_poly(Val(2), Val(2); verbose=true)
julia> nc = get_num_multipoly_vars(Val(2), Val(2))
julia> cids = Tuple.(CartesianIndices((200,200)))
julia> cs = Tuple(rand(Float32, nc))
julia> f.(cids, Ref(cs))

The first line gets the multivariate polynomial of order two in two variables. The second line obtains the number of coefficients needed for such a polynomial. The third line gets the CartesianIndices for a 200×200 array, the fourth line creates some random coefficients, and the last line applies this polynomial transform to each CartesianIndex in that array, which is (apart from some constant overhead) allocation-free.

Do you always evaluate your polynomial for many points at once in practice? If so, there are maybe more efficient algorithms to use. Especially if you’re tabulating the polynomial for grid inputs.
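One classic trick for tabulating a polynomial on an equally spaced grid is forward differencing: after seeding d+1 direct evaluations, each further value of a degree-d polynomial costs only d additions. A univariate sketch (the multivariate grid case would apply this along each axis):

```julia
# Tabulate a degree-d polynomial p at 0, 1, 2, … using forward differences.
# After the seed, every step uses d additions — no multiplications, no allocations.
function tabulate!(out, p, d)
    diffs = [p(Float64(i)) for i in 0:d]   # p(0), …, p(d)
    for k in 1:d, j in d+1:-1:k+1          # turn them into Δ⁰p(0), …, Δᵈp(0)
        diffs[j] -= diffs[j-1]
    end
    out[1] = diffs[1]
    for i in 2:length(out)
        for k in 1:d                       # advance all differences by one step
            diffs[k] += diffs[k+1]
        end
        out[i] = diffs[1]
    end
    return out
end

out = tabulate!(Vector{Float64}(undef, 6), x -> 3 + 2x + x^2, 2)
# out == [3.0, 6.0, 11.0, 18.0, 27.0, 38.0]
```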

Looking into multivariate polynomials and writing this package was driven by the requirement to support non-linear coordinate transforms for image registration. We have 4 (fluorescence polarization) images which need to be accurately mapped onto a common coordinate frame. This mapping ideally needs to be found by some fitting procedure minimizing a loss function. We do this with the DataToFunctions.jl package and now plan to incorporate these polynomials as a dependency, but if something already exists that can easily be used, we may use that instead.

For efficiency: the package seems quite efficient to me and even seems to beat a StaticArrays.jl implementation of homogeneous coordinate transforms in terms of performance. What algorithms/packages would be your suggestion for this use case?

Did you mean that the polynomial evaluation could save a few multiplications by exploiting separability, as in SeparableFunctions.jl? I can imagine that this may add a further speedup, but probably only for polynomials of quite high order.
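For what it’s worth, on a full grid the monomials do factor per axis, so the whole tabulation can be written as two matrix products with per-axis Vandermonde matrices (a sketch with made-up coefficients):

```julia
# p(x, y) = Σ_ij C[i+1, j+1] * x^i * y^j, evaluated on the grid xs × ys:
# V[r, c] = p(xs[r], ys[c]) via V = Vx * C * Vy'.
xs, ys = 0.0:3.0, 0.0:2.0
C = [1.0 2.0; 3.0 4.0]                       # degree ≤ 1 in each variable
Vx = [x^i for x in xs, i in 0:size(C, 1)-1]  # per-axis Vandermonde matrices
Vy = [y^j for y in ys, j in 0:size(C, 2)-1]
V = Vx * C * Vy'                             # e.g. V[2, 2] == p(1.0, 1.0) == 10.0
```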

I see Wikipedia mentions using the FFT, though that is for univariate polynomials. There are also many papers on multivariate polynomial multipoint evaluation, but they seem to mostly concern themselves with finite fields, not with floating point.

But if your code is fast enough, then don’t bother with looking into it, of course.

@RainerHeintzmann you may find CoordRefSystems.jl useful. We rely on floating point arithmetic in most coordinate conversions though. It is not usually an issue if you need to georeference data with cm or mm precision.


SpectralKit.jl is not really intended for working with the coefficients of the polynomials directly (it uses Chebyshev bases). It is possible, but you have to iterate. Its intended use case is evaluating a linear combination with given coefficients, allocation-free. Eg

using SpectralKit, StaticArrays
dom = BoundedLinear(0, 200)
basis = smolyak_basis(Chebyshev, InteriorGrid(), SmolyakParameters(4, 2), Val(2)) ∘
    coordinate_transformations(dom, dom)
θ = randn(dimension(basis));     # random coefficients
linear_combination(basis, θ, (1, 1))

The last line should not allocate when inside a function. You can iterate the last argument on CartesianIndices, as in (continuing the example)

julia> using BenchmarkTools

julia> cids = CartesianIndices((200, 200))
CartesianIndices((200, 200))

julia> buffer = Vector{Float64}(undef, length(cids));

julia> function eval_at_θ!(buffer, basis, θ, cids)
           for (i, x) in enumerate(cids)
               buffer[i] = linear_combination(basis, θ, Tuple(x))
           end
           buffer
       end
eval_at_θ! (generic function with 2 methods)

julia> @ballocated eval_at_θ!($buffer, $basis, $θ, $cids)
0

Note that I am never collecting the CartesianIndices, since they iterate just fine on their own.

One caveat: using a Smolyak basis as a tensor basis (as above) is not ideal and is not what this package has been optimized for. If there is demand, I could introduce a TensorBasis; it is much simpler, as it does not have to jump through all the hoops of Smolyak indexing.
