Small integer programming problem

I have a seemingly simple integer programming problem that I need a hard-coded algorithm for. Having this would help solve an issue in base Julia—interpreting floating point ranges fully correctly, if you can believe it. I’d love some help with this problem from the integer programming experts here.

Problem: Given positive integers a, b, c, d with \gcd(a, b) = \gcd(c, d) = 1, what is the smallest possible n > 0 such that

n = ia + jb = lc + kd

with nonnegative integer coefficients, i, j, k, l.

It’s easy to put an upper bound on this. I also generally have an external upper bound and am only interested in solutions better than this threshold, which can further bound the search space.

It’s fairly easy to see that this is an integer programming problem, but I’m not going to call an ILP solver from Base to interpret floating-point ranges. Since the number of variables is tiny, I’m wondering about what a good approach to solving this might be. Brute force is ok, especially when the inputs are small, but occasionally the inputs might get big, so it would be nice to have some cleverness to avoid taking forever in those situations. I’ve started reading some ILP literature, but figure that’s a little silly when we have some many experts in our community who could help.

7 Likes

Silly idea based on your observation: ca + db = ac + bd, and the equation holds even if you divide it by any non-zero number q. You want integer coefficients, so q must be a factor of a, c, b, d. And n should be as small as possible, so you want the highest common factor q=\gcd(a,c,b,d). I don’t know if the solution i = c / q, j = d / q and l=a/q, k = b/ q is optimal, though.

… but q=1 because \gcd(a, b) = \gcd(c, d) = 1, nevermind :smiley:

1 Like

Here’s a potentially motivating example:

a = 5
b = 8
c = 6
d = 7

Minimized by n = 13 = a + b = c + d which has no common factors with any of the inputs. The coefficients and additions sabotage any approaches that are based on factorization. You can also have situations like

a = 2
b = 3
c = 7
d = 8

where the best solution is n = 7 = 2a + b = c + 0d — the coefficient of some values can be zero.

My math is a bit rusty, but could it help to look at

together with
Coin problem - Wikipedia ?

2 Likes

Ugh, I have a brute-force approach, but it does not go through all i,j,k,l. It goes through all n:

# Finds a natural solution x, y for the equation a * x + b * y = c
# or returns nothing
function solve(a, b, c)
    _, x0, y0 = gcdx(a, b)

    x0 *= c
    y0 *= c

    @assert x0 > 0 || y0 > 0

    if x0 < 0
        t = ceil(Int, -x0 / b)
        x = x0 + b * t
        y = y0 - a * t
    elseif y0 < 0
        t = ceil(Int, -y0 / a)
        x = x0 - b * t
        y = y0 + a * t
    else
        x = x0
        y = y0
    end

    if x ≥ 0 && y ≥ 0
        return x, y
    else
        return nothing
    end
end

function optimize(a, b, c, d)
    n_min = max(min(a, b), min(c, d))
    n_max = a * c + b * d

    for n in n_min:n_max
        sol_ab = solve(a, b, n)
        isnothing(sol_ab) && continue

        sol_cd = solve(c, d, n)
        isnothing(sol_cd) && continue

        i, j = sol_ab
        k, l = sol_cd

        return i, j, k, l
    end

    return nothing
end


a = 2
b = 3
c = 7
d = 8
sol = optimize(a, b, c, d)
if !isnothing(sol)
    i, j, k, l = sol
    n = i * a + j * b
    @info n
    println("Found coefficients for n=$(n): $(i)a + $(j)b = $(k)c + $(l)d")
else
    println("Could not find a solution")
end
2 Likes

I’ve looked at both extensively. Bézout’s identity doesn’t seem to help, but I’ve used it elsewhere. This is related to the coin problem and to numerical semigroups more generally. In the numerical semigroup literature, for coprime a and b, the set \langle a,b \rangle = a\mathbb N + b\mathbb N is the numerical semigroup generated by a and b. The coin problem asks what the largest integer not in the semigroup is—this number is known as the Frobenius number of the semigroup. For two generators, there is a closed form solution: ab - a - b. What we’re looking at here is the intersection of two numerical semigroups, i.e. \langle a,b \rangle \cap \langle c,d \rangle. It’s easy to check that the intersection of two numerical semigroups is also a numerical semigroup. The question we’re solving here is finding the multiplicity of \langle a,b \rangle \cap \langle c,d \rangle, which is just the name given to the smallest positive number in the set, which is also its smallest generator. Unfortunately, despite scouring the literature on numerical semigroups, I can’t find anything that helps. There doesn’t seem to be any straightforward way to find the multiplicity or any other generators of an intersection of semigroups. Which brings me back to solving this directly as a integer programming problem.

I know general ILP solvers are not the right solution for Base, but in case it’s helpful for checking solutions, a JuMP formulation is

using JuMP, HiGHS

function solve(a, b, c, d)
    model = Model(HiGHS.Optimizer)
    set_silent(model)
    @variables(model, begin
        i >= 0, Int
        j >= 0, Int
        k >= 0, Int
        l >= 0, Int
        n >= 1, Int
    end)
    @constraint(model, n == i * a + j * b)
    @constraint(model, n == l * c + k * d)
    @objective(model, Min, n)
    optimize!(model)
    is_solved_and_feasible(model) || error("Did not solve model")
    return (; n = value(n), i = value(i), j = value(j), k = value(k), l = value(l))
end

e.g.

julia> solve(2, 3, 7, 8)
(n = 7.0, i = 2.0, j = 1.0, k = 0.0, l = 1.0)

julia> solve(5, 8, 6, 7)
(n = 13.0, i = 1.0, j = 1.0, k = 1.0, l = 1.0)
9 Likes

GAP has functionality to compute semigroup intersections, so it may be worth investigating how they do it.

julia> using GAP
 ┌───────┐   GAP 4.13.1 of 2024-06-11
 │  GAP  │   https://www.gap-system.org
 └───────┘   Architecture: x86_64-pc-linux-gnu-julia1.10-64-kv9
 Configuration:  gmp 6.2.1, Julia GC, Julia 1.10.5, readline
 Loading the library and packages ...
 Packages:   AClib 1.3.2, Alnuth 3.2.1, AtlasRep 2.1.8, AutPGrp 1.11, CRISP 1.4.6, Cryst 4.1.27, 
             CrystCat 1.1.10, CTblLib 1.3.9, FactInt 1.6.3, FGA 1.5.0, GAPDoc 1.6.7, IRREDSOL 1.4.4, 
             JuliaInterface 0.12.0, LAGUNA 3.9.6, Polenta 1.3.10, Polycyclic 2.16, PrimGrp 3.4.4, 
             RadiRoot 2.9, ResClasses 4.7.3, SmallGrp 1.5.3, Sophus 1.27, SpinSym 1.5.2, StandardFF 1.0, 
             TomLib 1.2.11, TransGrp 3.6.5, utils 0.85
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'

julia> GAP.prompt()
gap> LoadPackage("numericalsgps");
----------------------------------------------------------------
Loading  NumericalSgps 1.3.1
For help, type: ?NumericalSgps: 
To gain profit from other packages, please refer to chapter
'External Packages' in the manual, or type: ?NumSgpsUse 
----------------------------------------------------------------
true
gap> S1 := NumericalSemigroup(5,8);
<Numerical semigroup with 2 generators>
gap> S2 := NumericalSemigroup(6,7);
<Numerical semigroup with 2 generators>
gap> S3 := Intersection(S1, S2);
<Numerical semigroup>
gap> Multiplicity(S3);
13

Update:
Looks like this is the code: numericalsgps/gap/basics2.gi at master · gap-packages/numericalsgps · GitHub
As I understand it they compute all elements below the Frobenius number for both semigroups and intersect them. I.e. basically brute force, so I guess not so much to find there after all.

2 Likes

For what it’s worth, I compared solutions of my brute force approach and the ILP provided by Eric. I randomly generated a, b, c, d \in \{ 1, \ldots, 10\,000 \}, obtained a solution from the ILP and my program and measured how long it took, compared the solutions and recorded if they differed.

After 500 experiments, I got the following results. ILP took 0.475s on average with standard deviation of 1.101s. My approach took 0.0004s on average with standard deviation of 0.0006s. I found one problem which resulted in different n: a=1985, b=1212, c=9941, d=7008. However, ILP gave higher (worse) n: 341420 = 172 * 1985 + 0 * 1212 = 28 * 9941 + 9 * 7008; my program gave 337994 = 166 * 1985 + 7 * 1212 = 34 * 9941 + 0 * 7008. I believe this could be due to the time limit I imposed on the ILP solver (20s).

Also, there can be multiple optimizers. For example:

# Format is:
# Problem
#    ILP solution of the form n = i * a + j * b = k * c + l * d
#    My solution of the same form
a=5 b=11 c=2381 d=982
  982  =  192 * 5 + 2 * 11   =   0 * 2381 + 1 * 982
  982  =  5 * 5 + 87 * 11   =   0 * 2381 + 1 * 982
a=1785 b=4418 c=17 d=20
  1785  =  1 * 1785 + 0 * 4418   =   105 * 17 + 0 * 20
  1785  =  1 * 1785 + 0 * 4418   =   5 * 17 + 85 * 20
a=8 b=35 c=401 d=3682
  401  =  37 * 8 + 3 * 35   =   1 * 401 + 0 * 3682
  401  =  2 * 8 + 11 * 35   =   1 * 401 + 0 * 3682
a=8945 b=6362 c=62 d=15
  6362  =  0 * 8945 + 1 * 6362   =   91 * 62 + 48 * 15
  6362  =  0 * 8945 + 1 * 6362   =   1 * 62 + 420 * 15
a=7122 b=3427 c=45 d=58
  3427  =  0 * 7122 + 1 * 3427   =   71 * 45 + 4 * 58
  3427  =  0 * 7122 + 1 * 3427   =   13 * 45 + 49 * 58
Code
using JuMP, HiGHS
using Statistics

function solve(x0, y0, a, b, c)
    @assert x0 > 0 || y0 > 0

    x0 = x0 * c
    y0 = y0 * c

    if x0 < 0
        t = ceil(Int, -x0 / b)
        x = x0 + b * t
        y = y0 - a * t
    elseif y0 < 0
        t = ceil(Int, -y0 / a)
        x = x0 - b * t
        y = y0 + a * t
    else
        x = x0
        y = y0
    end

    if x ≥ 0 && y ≥ 0
        return x, y
    else
        return nothing
    end
end

function solve_brute(a, b, c, d)
    _, i0, j0 = gcdx(a, b)
    _, k0, l0 = gcdx(c, d)


    n_min = max(min(a, b), min(c, d))
    n_max = a * c + b * d

    for n in n_min:n_max
        sol_ab = solve(i0, j0, a, b, n)
        isnothing(sol_ab) && continue

        sol_cd = solve(k0, l0, c, d, n)
        isnothing(sol_cd) && continue

        i, j = sol_ab
        k, l = sol_cd

        @assert a * i + b * j == n
        @assert c * k + d * l == n

        return (; n, i, j, k, l)
    end

    return nothing
end
function solve_ilp(a, b, c, d)
    model = Model(HiGHS.Optimizer)
    set_silent(model)
    set_time_limit_sec(model, 20)
    @variables(model, begin
        i >= 0, Int
        j >= 0, Int
        k >= 0, Int
        l >= 0, Int
        n >= 1, Int
    end)
    @constraint(model, n == i * a + j * b)
    @constraint(model, n == k * c + l * d)
    @objective(model, Min, n)
    optimize!(model)
    is_solved_and_feasible(model) || return nothing

    n = round(Int, value(n))
    i = round(Int, value(i))
    j = round(Int, value(j))
    k = round(Int, value(k))
    l = round(Int, value(l))

    @assert a * i + b * j == n
    @assert c * k + d * l == n

    return (; n, i, j, k ,l)
end

function generate_abcd(range)
    a, b, c, d = rand(range, 4)
    b += (a == b)
    d += (c == d)
    p = gcd(a, b)
    q = gcd(c, d)
    a, b = a ÷ p, b ÷ p
    c, d = c ÷ q, d ÷ q
    return a, b, c, d
end

function test(;m=10, high=10)
    times_ilp = Float64[]
    times_brute = Float64[]

    tricky_tasks = NTuple{12, Int64}[]
    wrong_tasks = NTuple{12, Int64}[]

    for i in 1:m
        a, b, c, d = generate_abcd(1:high)

        result_ilp = @timed solve_ilp(a, b, c, d)
        result_brute = @timed solve_brute(a, b, c, d)

        push!(times_ilp, result_ilp.time)
        push!(times_brute, result_brute.time)

        sol_ilp = result_ilp.value
        sol_brute = result_brute.value

        if sol_ilp != sol_brute
            if sol_ilp.n != sol_brute.n
                push!(wrong_tasks, (a, b, c, d,
                                    sol_ilp.i, sol_ilp.j, sol_ilp.k, sol_ilp.l,
                                    sol_brute.i, sol_brute.j, sol_brute.k, sol_brute.l))
            else
                push!(tricky_tasks, (a, b, c, d,
                                     sol_ilp.i, sol_ilp.j, sol_ilp.k, sol_ilp.l,
                                     sol_brute.i, sol_brute.j, sol_brute.k, sol_brute.l))
            end
        end

        if (i % 10) == 0
            println("$i tasks done")
        end
    end

    println("ILP times: ", round(mean(times_ilp), digits=4), " ± ", round(std(times_ilp), digits=4))
    println("BRT times: ", round(mean(times_brute), digits=4), " ± ", round(std(times_brute), digits=4))
    println()
    println("Wrong tasks:")
    for (a, b, c, d, ilp_i, ilp_j, ilp_k, ilp_l, brute_i, brute_j, brute_k, brute_l) in wrong_tasks
        ilp_n = ilp_i * a + ilp_j * b
        brute_n = brute_i * a + brute_j * b
        println("a=$a b=$b c=$c d=$d")
        println("  $ilp_n  =  $ilp_i * $a + $ilp_j * $b   =   $ilp_k * $c + $ilp_l * $d")
        println("  $brute_n  =  $brute_i * $a + $brute_j * $b   =   $brute_k * $c + $brute_l * $d")
    end
    println()
    println("Tricky tasks:")
    for (a, b, c, d, ilp_i, ilp_j, ilp_k, ilp_l, brute_i, brute_j, brute_k, brute_l) in tricky_tasks
        ilp_n = ilp_i * a + ilp_j * b
        brute_n = brute_i * a + brute_j * b
        println("a=$a b=$b c=$c d=$d")
        println("  $ilp_n  =  $ilp_i * $a + $ilp_j * $b   =   $ilp_k * $c + $ilp_l * $d")
        println("  $brute_n  =  $brute_i * $a + $brute_j * $b   =   $brute_k * $c + $brute_l * $d")
    end
end
4 Likes

Here’s another brute force solution by iteratively generating the semigroups until the smallest common element is found.

using DataStructures

function solve_heap(a, b, c, d)
    S1 = BinaryMinHeap([a, b])
    S2 = BinaryMinHeap([c, d])
    m1 = min(a, b)
    m2 = min(c, d)
    while m1 != m2
        if m1 < m2
            while first(S1) == m1
                pop!(S1)
            end
            push!(S1, m1 + a)
            push!(S1, m1 + b)
            m1 = first(S1)
        end
        if m2 < m1
            while first(S2) == m2
                pop!(S2)
            end
            push!(S2, m2 + c)
            push!(S2, m2 + d)
            m2 = first(S2)
        end
    end
    return m1
end

(Set can of course be used instead of heap, but is considerably slower.)
Edit: Fixed corner case.

3 Likes

There’s probably some connection to lattice reduction.

Here’s a first attempt using sagemath:

from sage.modules.free_module_integer import IntegerLattice

a = random_prime(2**14)
b = random_prime(2**14)
c = random_prime(2**14)
d = random_prime(2**14)
print(f"{a=} {b=} {c=} {d=}")

N = 1000 # some large enough multiplier
L = identity_matrix(4).augment(N * vector([a, b, -c, -d]))
print(L)
L = IntegerLattice(L)
v = L.shortest_vector() # may have negative coefficients
print(f"{v=}")
print(L)
B = L.basis()
er = 10
smallest_n = Infinity
# some terrible enumeration to (try to) eliminate negative coefficients
for x1 in range(-er, er+1):
    for x2 in range(-er, er+1):
        for x3 in range(-er, er+1):
            for x4 in range(-er, er+1):
                v = x1*B[0] + x2*B[1] + x3*B[2] + x4*B[3]
                l = v[0]*a + v[1] * b
                r = v[2]*c + v[3] * d
                if l != r:
                    continue
                n = l
                if n > 0 and all(x >= 0 for x in v[:4]) and n < smallest_n:
                    smallest_n = n
                    print(n, v)

which prints:

a=1231 b=10357 c=5051 d=2087
[       1        0        0        0  1231000]
[       0        1        0        0 10357000]
[       0        0        1        0 -5051000]
[       0        0        0        1 -2087000]
v=(-1, 4, 3, 12, 0)
Free module of degree 5 and rank 4 over Integer Ring
User basis matrix:
[   -1     4     3    12     0]
[  -19     3     4    -6     0]
[    5    15    39   -17     0]
[   -8    -1    -4     0 -1000]
221367 (163, 2, 43, 2, 0)
123100 (100, 0, 19, 13, 0)
72712 (17, 5, 2, 30, 0)

the last line shows the best solution that was found.

edit: in the reduced basis only the last row vector has an entry in the last column, it should be possible to skip it during enumeration (as an optimization). A sufficiently large multiplier would maybe ensure this? I’m not sure.

2 Likes

ia+jb=lc+kd has 2 symmetric solutions i=c,j=d,l=a,k=b or i=d,j=c,l=b,k=a, so maybe n_max = min(a*c+b*d, b*c+a*d) is more appropriate?

But since i,j,k,l can be 0, there are more easy symmetric solutions:

i=0,l=0, n=bd \\ i=0,k=0, n=bc\\ j=0,l=0, n=ad\\ j=0,k=0, n=ac

among which is a value lower than n_max above.
You start the search at n_min and would hit one of these easy symmetric solutions no matter how much larger n_max is, so I don’t expect this to help runtime at all.

Tried a toy example that would, but optimize(2, 3^23, 3, 2^45) == nothing. The solution should be n=6=3*2+0*3^{23}=2*3+0*2^{45} if I’m reading this right

Try the version from the benchmark (solve_brute). I guess I made some changes, which I have forgotten from yesterday :smile:

Ah I missed that details block, otherwise I would’ve just commented on the same lines from solve_brute. solve_brute(2, 3^23, 3, 2^45) == nothing as well.

I neglected to mention that the older solve(2, 3^23, 6) == (3, 0) and solve(3, 2^45, 6) == (2, 0), which is why I was confused that optimize resulted in nothing. It seems like it should’ve reached the solution n=6, but I haven’t tried debugging it.

It fails due to integer overflow from 3^23 * 2^45. But there’s no need for the upper bound, easier to just set it to typemax(Int).

1 Like

Switching to the symmetric n_max = b*c+a*d, now optimize(2, 3^23, 3, 2^45) == (3, 0, 2, 0). That’s not going to hold up in unit testing, though. The least known upper bound could definitely be left in a comment.

If you want a trivial upper bound from cross products, you can go with min(a, b) * min(c, d).

1 Like

I’ll summarize my approach so that it is more convenient for people to improve it, if someone is interested.

We are given positive integers a, b, c, d such that \gcd(a, b) = 1 and \gcd(c, d) = 1. We are trying to find smallest positive n>0 such that

n = i a + j b = k c + ld,

where i, j, k, l \in \mathbb{N} are non-negative variables.

My approach is based on finding natural solutions of the linear equations. Let’s first focus on the first one, n = ia + jb.

From the Bézout’s identity, we can find i_0, j_0 \in \mathbb{Z} such that

ai_0 + b j_0 = 1.

Multiplying by some n \in \mathbb{N}, we get

a (ni_0) + b(n j_0) = n.

Since either i_0 or j_0 is negative (unless a = b = 1), we don’t have a natural solution. Suppose i_0 is negative. We can write

a(n i_0 + b m) + b(n j_0 - am) = n,

where m \in \mathbb{N} is some number. The goal now is to find the smallest n \in \mathbb{N} such that there is some m \in \mathbb{N} that makes both ni_0 + bm and n j_0 - am non-negative. Formally:

n i_0 + bm \geq 0 \rightarrow m \geq -\frac{ni_0}{b}, \\ n j_0 - am \geq 0 \rightarrow m \leq \frac{nj_0}{a},

which gives together

n s \leq m \leq n t,

where s = -\frac{i_0}{b} and t = \frac{j_0}{a}. Now, if j_0 was negative, we would get s = -\frac{j_0}{a} and t = \frac{i_0}{b}.

So, our task transforms into finding the smallest n \in \mathbb{N} for which there is an m \in \mathbb{N} such that n s \leq m \leq n t. Subtract ns to get

0 \leq m - ns \leq n ( t - s ),

in which we set m = \lceil ns \rceil to satisfy the first inequality. It leads to finding the smallest n \in \mathbb{N} such that

\begin{aligned} \lceil ns \rceil - ns &\leq n ( t - s) \\ \lceil ns \rceil &\leq n t \end{aligned}

If we make the very same argument but for the equation n = kc + ld, we obtain

\lceil nu \rceil \leq nv,

where u = \begin{cases} -k_0 / d & \text{if } k_0 < 0 \\ -l_0 / c & \text{if } l_0 < 0 \end{cases} and v = \begin{cases} l_0 / c & \text{if } k_0 < 0 \\ k_0 / d & \text{if } l_0 < 0 \end{cases} with the Bézout’s coefficients k_0, l_0 such that c k_0 + d l_0 = 1.

So the algorithm is:

  1. Initialize n := 1
  2. Find smallest n_1 \geq n for which \lceil n_1 s \rceil \leq n_1 t holds
  3. Find smallest n_2 \geq n for which \lceil n_2 u \rceil \leq n_2 v holds
  4. Set n := \max\{ n_1, n_2 \}
  5. If n_1 \neq n_2, go to 2.
  6. Compute solution.

The solution can be computed as

\begin{aligned} i &= i_0 n - \text{sign}(i_0) b m_1, \\ j &= j_0 n + \text{sign}(i_0) a m_1, \\ k &= k_0 n - \text{sign}(k_0) d m_2, \\ l &= l_0 n + \text{sign}(k_0) c m_2, \end{aligned}

where m_1 = \lceil ns \rceil and m_2 = \lceil nu \rceil.

The problem with this approach is that for some inputs, we have to iterate over too many n before finding a match n_1 = n_2. We could improve that by initializing n with a value that is closer to the optimum. Another improvement could be in steps 2 and 3, where now I sequentially check if the condition holds (maybe we could do skip some values).

Code here:

Code
function next_lowest(u, v, n)
    while n <= typemax(Int)
        if n * v - ceil(n * u) ≥ 0
            return n
        end
        # Improvement: do larger steps
        n += 1
    end
    throw(ErrorException("not found"))
end

function solve(a, b, c, d)
    _, i0, j0 = gcdx(a, b)
    _, k0, l0 = gcdx(c, d)

    s = ifelse(i0 < 0, -i0 // b, -j0 // a)
    t = ifelse(i0 < 0, j0 // a, i0 // b)

    u = ifelse(k0 < 0, -k0 // d, -l0 // c)
    v = ifelse(k0 < 0, l0 // c, k0 // d)

    # Improvement: Initialize better
    n = 1

    while true
        n1 = next_lowest(s, t, n)
        n2 = next_lowest(u, v, n)

        n = max(n1, n2)

        if n1 == n2
            break
        end
    end

    println("Solved for n=$n")

    m1 = ceil(n * s)
    m2 = ceil(n * u)

    i = Int(i0 * n - sign(i0) * b * m1)
    j = Int(j0 * n + sign(i0) * a * m1)
    k = Int(k0 * n - sign(k0) * d * m2)
    l = Int(l0 * n + sign(k0) * c * m2)

    println("i=", i)
    println("j=", j)
    println("k=", k)
    println("l=", l)

end
8 Likes

is an “easy” upper bound lcm(a+b,c+d)*(a+b)?

or

n<= min(a,b)*min(c,d)


a<b and c<d

n <= (c,0)*(a,b) = (a,0)*(c,d) = a*c

I think moving this to general usage would be a good idea. ILP is likely neither an exclusive nor optimal way to solve the problem.