Integer and Rational Solutions of a Binary Quadratic Equation (5): Binary quadratic form

In this post, we are interested in finding all integer solutions of the following equation:

ax2+bxy+cy2=n.\begin{equation} ax^2 + bxy + cy^2 = n. %\label{eq:ibqf} \end{equation}

where acn0acn \neq 0, b24ac0b^2 - 4ac \neq 0. This equation can certainly be solved using methods from the generalized Pell equation, but here, we explore other methods.

The function ax2+bxy+cy2ax^2 + bxy + cy^2 is also called the binary quadratic form (BQF), which we denote as <a,b,c>\left< a,b,c \right> for brevity. The discriminant of <a,b,c>\left< a,b,c \right> is defined as Δ:=b24ac\Delta := b^2 - 4ac.

Theory of binary quadratic forms

Binary quadratic forms have been studied for centuries and many textbooks discuss them. The goal of this section is to provide a brief introduction. Advanced topics (e.g., composition) will not be discussed.

(Equivalence) We define an equivalence relation between BQFs: <a,b,c><a,b,c>\left< a,b,c \right> \sim \left< a',b',c' \right> if and only if there exists an MZ2×2M \in \ZZ^{2\times 2} such that detM=1\det M = 1 and

M(ab/2b/2c)M=(ab/2b/2c),\begin{equation} M^\T \begin{pmatrix} a & b/2 \\ b / 2 & c \\ \end{pmatrix} M = \begin{pmatrix} a' & b' / 2 \\ b' / 2 & c' \\ \end{pmatrix}, %\label{eq:equivalence} \end{equation}

The matrix determinants of both sides show that <a,b,c>\left< a,b,c \right> and <a,b,c>\left< a',b',c' \right> have the same discriminant.

(Representing numbers) One way to understand this equivalence is by examining the set of numbers it represents. We say <a,b,c>\left< a,b,c \right> represents nn if the equation ax2+bxy+cy2=nax^2 + bxy + cy^2 = n for some x,yZx, y \in \ZZ, and it properly represents nn if gcd(x,y)=1\gcd(x, y) = 1. If <a,b,c><a,b,c>\left< a,b,c \right> \sim \left< a',b',c' \right> and <a,b,c>\left< a',b',c' \right> represents nn by [x,y][x, y], then <a,b,c>\left< a,b,c \right> represents nn through M[x,y]M [x, y], which is a simple change of variables (under the constraint that detM=1\det M = 1).

(Positive definite) A BQF <a,b,c>\left< a,b,c \right> is positive definite if for any (x,y)0(x, y) \neq 0, ax2+bxy+cy2>0ax^2 + bxy + cy^2 > 0. An alternative definition states that a>0,c>0a > 0, c > 0 and Δ<0\Delta < 0.

(Lagrange-reduced form) A positive definite BQF <a,b,c>\left< a,b,c \right> is (Lagrange-)reduced if it meets both of the following conditions: (1) bac|b| \leq a \leq c, (2) if b=a|b| = a or a=ca = c, then b0b \geq 0. Every positive definite BQF is equivalent to a unique reduced BQF.

(Reduction) We have two rules to reduce a positive definite BQF, similar to the Euclidean algorithm:

<a,b,c><c,b,a>,M=(0110).\labelR.1\begin{equation} \left< a,b,c \right> \sim \left< c,-b,a \right>, \quad M = \begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}. \label{R.1} \tag{R.1} \end{equation}

and for any tZt \in \ZZ,

<a,b,c><a,b2at,at2bt+c>,M=(1t01),\labelR.2\begin{equation} \left< a,b,c \right> \sim \left<a, b - 2at, at^2 - bt + c\right>, \quad M = \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix}, \label{R.2} \tag{R.2} \end{equation}

A special case of \eqref{R.2} is <a,a,c><a,a,c>\left< a,a,c \right> \sim \left< a,-a,c \right> for t=1t = 1.

We can always apply \eqref{R.1} to ensure that aca \leq c, and \eqref{R.2} to ensure ba|b| \leq a by setting t=b2at = \left\lfloor \frac{b}{2a} \right\rceil. Both \eqref{R.1} and \eqref{R.2} reduces either aa or b|b| while keeping the other the same, so the process must terminate with a BQF <a,b,c>\left< a,b,c \right> where bac|b| \leq a \leq c. Finally, if a=ca = c or a=ba = |b|, we choose the one with b0b \geq 0.

See William Stein’s lecture notes and [Manguba-Glover18, Chapter 2] for more details.

Normalization

(Normalization) The equation is first normalized as follows:

  • Assume gcd(a,b,c)=1\gcd(a, b, c) = 1. Otherwise, we divide both sides by gcd(a,b,c)\gcd(a, b, c).
  • Assume gcd(a,n)=1\gcd(a, n) = 1. Otherwise, we can always find u,s,v,rZu, s, v, r \in \ZZ such that usvr=1us - vr = 1 and gcd(au2+buv+cv2,n)=1\gcd(au^2 + buv + cv^2, n) = 1, and then apply (urvs)\begin{pmatrix} u & r \\ v & s\end{pmatrix} to <a,b,c>\left< a,b,c \right>. See here for how to find (u,v)(u, v). The high level idea is that for any prime pp, it’s enough to try (1,0),(0,1),(1,1)(1, 0), (0, 1), (1, 1), and we use CRT to combine the results.

(Primitive solutions) We only care about the primitive solutions (x,y)(x, y) where g:=gcd(x,y)=1g := \gcd(x, y) = 1. If g1g \neq 1, we find the primitive solutions of ax2+bxy+cy2=n/g2ax^2 + bxy + cy^2 = n/g^2. It’s also clear that gcd(y,n)=1\gcd(y, n) = 1 for primitive solutions (x,y)(x, y).

Ellipse case (Δ<0\Delta < 0)

(This section is mostly taken from [Matthews15].)

We assume the BQF <a,b,c>\left< a,b,c \right> is positive definite. Otherwise, we negate a,b,ca, b, c and nn.

For a primitive solution (x,y)(x, y) to ax2+bxy+cy2=nax^2 + bxy + cy^2 = n, define λ:=xy1modn\lambda := x y^{-1} \bmod{|n|}, which certainly satisfies aλ2+bλ+c0(modn)a\lambda^2 + b\lambda + c \equiv 0 \pmod{|n|}. We write x=xnλyx = x'n - \lambda y, hence

ax2+bxy+cy2=n,ax2+bxy+cy2=1.ax^2 + bxy + cy^2 = n, \quad \Longleftrightarrow \quad a' x'^2 + b' x'y + c' y^2 = 1.

where

a=an,b=b2λa,c=aλ2+bλ+cn.a' = an, \quad b' = b - 2 \lambda a, \quad c' = \frac{a\lambda^2 + b\lambda + c}{n}.

Note that these two BQFs have the same discriminant.

The theorem below determines the minimum positive value of a reduced positive definite BQF:

([Matthews15, Lemma 0.1], [Manguba-Glover18, Lemma 1])

For a reduced positive definite BQF <a,b,c>\left< a, b, c \right>, the minimum positive value of ax2+bxy+cy2ax^2 + bxy + cy^2 for x,yZx, y \in \ZZ is aa, and is achieved when (x,y)(x, y) is:

  • (±1,0)(\pm 1, 0) if a<ca < c;
  • (±1,0),(0,±1)(\pm 1, 0), (0, \pm 1) if b<a=cb < a = c;
  • (±1,1)(\pm1, \mp1) if b=a=cb = a = c.

After reducing <an,b2λa,aλ2+bλ+cn>\left<an, b-2 \lambda a, \frac{a\lambda^2+b\lambda+c}{n} \right> to <a,b,c>\left< a', b', c' \right> using some MλM_{\lambda}, we can use the theorem above to determine the solutions of ax2+bxy+cy2=1a'x^2 + b'xy + c'y^2 = 1.

Here is an algorithm to solve the case:

def solve_bqf_neg_disc_primitive(a, b, c, n):
    """find primitive solutions to ax^2 + bxy + cy^2 = n.

    Assume gcd(a, b, c) = 1 and gcd(c, n) = 1, b^2 - 4ac < 0.
    """

    Δ = b*b - 4*a*c
    assert Δ < 0

    for λ in sqrtmod_all(Δ, n):
        λ = invmod(2*a, n) * (λ - b) % n

        (a2, b2, c2), M = bqf_neg_disc_reduce((a*n, b - 2*a*λ, (a*λ*λ + b*λ + c) // n))
        if a2 != 1:
            continue

        # transform back to the original equation
        M = [[n, -λ], [0, 1]] @ M

        if a2 < c2:
            sols = [[1, 0], [-1, 0]]
        elif b2 < a2:
            sols = [[1, 0], [-1, 0], [0, 1], [0, -1]]
        else:
            sols = [[1, -1], [-1, 1]]

        yield from (M @ s for s in sols)

Parabola case (Δ>0\Delta > 0)

(This section is mostly taken from [Matthews15] and [MRS17].)

Structure and transformation

(Equivalence relation) [Matthews23, Section 2] Let (x,y)(x^*, y^*) be the fundamental solution to x2Δy2=4x^2 - \Delta y^2 = 4, and (x,y)(x, y) be a solution to ax2+bxy+cy2=nax^2 + bxy + cy^2 = n, then any (x,y)(x', y') such that

(xy)=±Uk(xy),U:=(xby2cyayx+by2).\begin{pmatrix} x' \\ y' \\ \end{pmatrix} = \pm U^k \begin{pmatrix} x \\ y \end{pmatrix}, \quad U := \begin{pmatrix} \frac{x^* - by^*}{2} & -cy^* \\ ay^* & \frac{x + by^*}{2} \end{pmatrix}.

for kZk \in \ZZ is also a solution to ax2+bxy+cy2=nax^2 + bxy + cy^2 = n. Hence we can use the associated equivalence relation from the group action of a group {±Uk:kZ}\left\{ \pm U^k: k \in \ZZ \right\} on the set of all solutions, which is the Stolt equivalence.

(Partition of solution space) Similarly as the ellipse case, we define λ:=xy1modn\lambda := x y^{-1} \bmod{|n|} for a primitive solution (x,y)(x, y) and transform ax2+bxy+cy2=nax^2 + bxy + cy^2 = n to ax2+bxy+cy2=1a'x'^2 + b'x'y + c'y^2 = 1. One can also show ([Stolt57, Theorem 5]) that two primitive solutions are Stolt equivalent if and only if they have the same λ\lambda, implying that the number of Stolt equivalent classes is the finite.

Find Stolt fundamental solutions

Our goal in this subsection is to find the Stolt fundamental solution for a specific λ\lambda, which solves the equation

ax2+bxy+cy2=1,a' x'^2 + b' x'y + c'y^2 = 1,

with minimum non-negative yy (and maximum xx for tie-breaking).

With abuse of notation, we study the equation ax2+bxy+cy2=nax^2 + bxy + cy^2 = n where Δ>0\Delta > 0 is not a square, a>0a > 0 and n=±1n = \pm 1. Additionally, we require y>0y > 0. Since Δ1(mod4)\Delta \equiv 1 \pmod 4 and Δ\Delta is not a square, we have Δ5\Delta \geq 5.

We first present a very interesting result in [Serret66]:

Default

([Serret66]) Let 0<n<Δ/20 < n < \sqrt{ \Delta } / 2, and let (x,y)(x, y) be a primitive solution to ax2+bxy+cy2=nax^2 + bxy + cy^2 = n with y>0y > 0 where a>0a > 0. In addition, let ω,ω=(b±Δ)/2a\omega, \omega' = (-b \pm \sqrt{ \Delta }) / 2a be the roots of ar2+br+c=0ar^2 + br + c = 0 where ω<ω\omega < \omega'. Then p/qp/q is a convergent to either ω\omega or ω\omega'.

[MRS17] gives a simple proof: Note a(x/yω)(x/yω)=n/y2a(x / y - \omega)(x / y - \omega') = n / y^2. If x/y>ωx / y > \omega',

xyω=na(x/yω)y2<Δ/2a(ωω)y2=12y2,\frac{x}{y} - \omega' = \frac{n}{a (x / y - \omega) y^2} < \frac{\sqrt{ \Delta } / 2}{a(\omega' - \omega )y^2} = \frac{1}{2y^2},

then by Dirichlet’s approximation theorem, we conclude that x/yx / y is a convergent to ω\omega'. Similarly, if x/y<ωx / y < \omega, it’s a convergent to ω\omega.

However, the theorem is no longer true for negative nn. [Pavone86] generalized the theorem for

n=μ:=min(x,y)Z2(x,y)(0,0)ax2+bxy+cy2,|n| = \mu :=\min\limits_{\substack{(x, y) \in \ZZ^2 \\ (x, y) \neq (0, 0)}} |ax^2 + bxy + cy^2|,

where he proved that Serret’s theorem still holds unless Δ=5\Delta = 5, n<0n < 0, in which case x/yx / y can also be a special semiconvergent of both ω\omega and ω\omega'. [MRS17] further generalizes [Pavone86]‘s result, allowing n<Δ/2|n| < \sqrt{ \Delta } / 2.

Anyway, [Pavone86]‘s result is enough for us since we only care about the case n=±1n = \pm 1. One way to find the fundamental solution is to examine the candidates from the following categories to see if they solve the equation, and pick the most fundamental one:

  1. Check the smallest convergent of ω\omega before the first period ends;
  2. Check the smallest convergent of ω\omega' before the first period ends;
  3. If Δ=5\Delta = 5 and n<0n < 0, also check the special semi-convergent.

[Matthews02] also designed an algorithm based on [Pavone86], but added many optimizations.

References

General:

  • Robertson09: Robertson, J., 2009. Computing in quadratic orders. at jpr2718. org.

Ellipse case (Δ<0\Delta < 0):

  • Matthews15: Matthews, K.R., 2015. On a transformation of Lagrange [online]
  • Matthews14: Matthews, K.R., 2014. Lagrange’s Algorithm Revisited: Solving at2+btu+cu2=nat^2+ btu+ cu^2= n in the case of negative discriminant. Journal of Integer Sequences, 17(11).
    • It uses continued fraction and is more complicated. Also need to handle a few exceptional cases.
  • Manguba-Glover18: Manguba-Glover, K., 2018. Reduction Theory of Binary Quadratic Forms (Doctoral dissertation, University of Hawai ‘i at Manoa).

Parabola case (Δ>0\Delta > 0):

  • Matthews23: Matthews, K., 2023. Finding the Fundamental Solutions of ax2+bxy+cy2=nax^2+ bxy+ cy^2= n.
    • Notes on Stolt equivalence.
  • MR21: Matthews, K.R. and Robertson, J.P., 2021. On solving a binary quadratic Diophantine equation. Rocky Mountain Journal of Mathematics, 51(4), pp.1369-1385.
    • Unfortunately it’s paywalled so I haven’t read it yet. My guess is that it’s pretty similar to [Matthews23].
  • Matthews02: Matthews, K., 2002. The diophantine equation ax2+bxy+cy2=Nax^ 2+ bxy+ cy^ 2= N, D=b24ac>0D= b^ 2-4ac>0. Journal de théorie des nombres de Bordeaux, 14(1), pp.257-270.
    • Continued fraction-based algorithm.
  • MRS15: Matthews, K.R., Robertson, J.P. and Srinivasan, A., 2015. On fundamental solutions of binary quadratic form equations.
    • This bounds the scale of the Stolt fundamental solutions.
  • MRS17: Matthews, K.R., Robertson, J.P. and Srinivasan, A., 2017. Extending Theorems of Serret and Pavone. J. Integer Seq., 20(10), pp.17-10.
  • [Serret66] : Serret, J.A., 1866. Cours d’algèbre supérieure (Vol. 1). Gauthier Villars.
  • Pavone86: Pavone, M., 1986. A Remark on a Theorem of Serret. Journal of Number Theory, 23(2), pp.268-278.
  • Stolt57: Stolt, B., 1957. On a Diophantine equation of the second degree. Arkiv för Matematik, 3(4), pp.381-390.