Integer and Rational Solutions to a Binary Quadratic Equation (1): Legendre equation

In this post, we’re interested in finding any or all rational solutions of the following equation:

ax2+by2+c=0,ax^2 + by^2 + c = 0,

or alternatively, all integer solutions to the Legendre equation:

ax2+by2+cz2=0,\labeleq:legendre\begin{equation} ax^2 + by^2 + cz^2 = 0, \label{eq:legendre} % \tag{Legendre Equation} \end{equation}

where abc0abc \neq 0.

Legendre equation and normal form

(Normalization) We say the Legendre equation is in normal form if abcabc is square-free. The following transformation by Gauss reduces a Legendre equation to normal form: Write bc=α2s,ca=β2t,ab=γ2ubc = \alpha^2 s, ca=\beta^2 t, ab=\gamma^2 u for square-free integers s,t,us, t, u and integers α,β,γ\alpha, \beta, \gamma, then

aαβγ(x/α)2+bβγα(y/β)2+cγαβ(z/γ)2=0,\frac{a \alpha}{\beta \gamma} (x/\alpha)^2 + \frac{b \beta}{\gamma \alpha} (y/\beta)^2 + \frac{c \gamma}{\alpha \beta} (z/\gamma)^2 = 0,

and aαβγ,bβγα,cγαβ\frac{a \alpha}{\beta \gamma}, \frac{b \beta}{\gamma \alpha}, \frac{c\gamma}{\alpha \beta} are square-free and pair-wise coprime.

(Solubility) Legendre gave an elegant theorem to determine the solubility of an equation in normal form:

Solvability of Legendre Equations

Legendre Equation \eqref{eq

} in normal form has a non-trivial integer solution iff. a,b,ca, b, c have different sign, and bc,ca,ab-bc, -ca, -ab are quadratic residues modulo a,b,ca, b, c respectively.

The if direction follows by considering modulo a,b,ca, b, c individually.

Find a single non-trivial integer solution

Brute-force

Holtz proved that there exists a small non-trivial solution:

Small solution, Holzer

If Equation \eqref{eq

} in normal form has a non-trivial solution, then there exists a non-trivial solution (x^,y^,z^)(\hat x, \hat y, \hat z) such that:

x^2bc,y^2ca,z^2ab.\hat x^2 \leq |bc|, \quad \hat y^2 \leq |ca|, \quad \hat z^2 \leq |ab|.

Additionally, bc|bc| is a square if and only if bc=1|bc| = 1, since bcbc is square-free. The same holds for ca|ca| and ab|ab|.

See [Mordell, Chapter 7, Theorem 5], [CR03] and [CM97] for more details.

Legendre’s descent

(This subsection is mostly taken from [Aitken, Section 5].)

Apparently, ax2+by2+cz2=0ax^2 + by^2 + cz^2 = 0 is equivalent to (ac)x2+(bc)y2=(cz)2(-ac) x^2 + (-bc) y^2 = (cz)^2. Legendre’s descent is based on the following observation:

ax2+by2=z2a(ux+z)2+b(by)2=(uz+ax)2.ax^2 + by^2 = z^2 \quad \Longleftrightarrow \quad a ( ux + z)^2 + b' (b y)^2 = (uz + ax)^2.

for u2=a+bbu^2 = a + b'b. It allows us to reduce the problem ax2+by2=z2ax^2 + by^2 = z^2 to another problem ax2+by2=z2ax^2 + b' y^2 = z^2. When bmax(2,a)|b| \geq \max(2, |a|), the other problem is smaller as we can always choose ub/2|u| \leq |b|/2 so b(u2+a)/bb/4+1<b|b'| \leq (u^2 + |a|) / |b| \leq |b|/4+1 < |b|.

The algorithm runs as follows:

# given a, b, find (x, y, z) such that a x^2 + b y^2 = z^2
def solve(a, b):
    assert is_squarefree(a) and is_squarefree(b)
    assert a >= 0 or b >= 0

    if abs(a) > abs(b):
        y, x, z = solve(b, a)
        return (x, y, z)

	if a == 0: return (1, 0, 0)
	if b == 1: return (0, 1, 1)
	if a == 1: return (1, 0, 1)

    # find u such that u^2 == a (mod b) and -b / 2 <= u < b / 2
    # WHAT IF u DOES NOT EXIST?
    u = sqrtmod(a, b)
    u = min(u, b - u)
    b2 = (u * u - a) // b

    # b2 = sqf_b2 * sq_b2^2 where sqf_b2 is square-free
    sqf_b2, sq_b2 = squarefree(b2)

    x2, y2, z2 = solve(a, sqf_b2)
    return (
	    u * x2 + z2,
	    sqf_b2 * sq_b2 * y2,
	    u * z2 + a * x2,
    )

The key point of the algorithm is that uu must exist during all recursions. It’s proved that if ax2+by2+cz2=0ax^2 + by^2 + cz^2 = 0 is in normal form, uu always exists when calling solve(ac,bc)\operatorname*{solve}(-ac, -bc).

See [CR03, Section 2.3], [Aitken] and [Hillgarter] for more details. [CR03] also optimizes the method but it’s beyond the scope of the post.

Lattice-based algorithm

(This section is taken from [CR03, Section 2.6].)

Another idea arises when going deeper in Legendre’s theorem.

We first discuss the case where any of ab,bc,ca|ab|, |bc|, |ca| is 1. W.L.O.G. we assume ab=1|ab| = 1.

  • If aba \neq b, then (x,y,z)=(1,1,0)(x, y, z) = (1, 1, 0) is a non-trivial solution.
  • If a=ba = b, we can use Cornacchia’s algorithm to solve x2+y2=c/ax^2 + y^2 = -c / a, and (x,y,1)(x, y, 1) is a non-trivial solution.

Now we consider the general case. We define a set of points LZ3\mathcal{L} \subset \ZZ^3:

L={(x,y,z): byλaz(moda),czλbx(modb),axλcy(modc),ax2+by2+cz20(mod2abc)},\begin{aligned} \mathcal{L} = \{ (x, y, z): \ & by \equiv \lambda_a z \pmod{|a|}, \\ & cz \equiv \lambda_b x \pmod{|b|}, \\ & ax \equiv \lambda_c y \pmod{|c|},\\ & a x^2 + b y^2 + c z^2 \equiv 0 \pmod{|2abc|} \}, \end{aligned}

where λa\lambda_a is an arbitrary solution to λ2bc(moda)\lambda^2 \equiv -bc \pmod{|a|}, similarly for λb\lambda_b and λc\lambda_c. Let (xˉ,yˉ,zˉ)(\bar x, \bar y, \bar z) be the non-zero point in L\mathcal{L} minimizing ax2+by2+cz2|a| x^2 + |b| y^2 + |c| z^2. It can be proved that there is a small solution (x^,y^,z^)(\hat x, \hat y, \hat z) in L\mathcal{L}, which can be used to bound axˉ2+byˉ2+czˉ2|a\bar x^2 + b\bar y^2 + c \bar z^2|:

axˉ2+byˉ2+czˉ2ax^2+by^2+cz^2<2abc.\left |a \bar x^2 + b \bar y^2 + c \bar z^2 \right| \leq |a| \hat x^2 + |b| \hat y^2 + |c| \hat z^2 < 2|abc|.

so we conclude that axˉ2+byˉ2+czˉ2=0a \bar x^2 + b \bar y^2 + c \bar z^2 = 0 and the minimizer itself satisfies Holzer’s condition.

The final step is to determine the minimizer. [CR03, Section 2.6, p.17] shows that L\mathcal{L} is a lattice, that is, L={wB:wZ3}\mathcal{L} = \{\mathbf{w} \mathbf{B}: \mathbf{w} \in \ZZ^3 \} for a (row-based) basis BZ3×3\mathbf{B} \in \ZZ^{3 \times 3}. It also gives an explicit expression for a basis B\mathbf{B}. Moreover, the shortest vector in a 3D lattice can be found efficiently using LLL, as long as we have an arbitrary basis B\mathbf{B} for L\mathcal{L}:

[CR03, Lemma 2.7, p.16]

Let L={wB:wZ3}\mathcal{L} = \{\mathbf{w} \mathbf{B}: \mathbf{w} \in \ZZ^3\} be a 3D lattice where B\mathbf{B} is LLL-reduced. Then the shortest vector in L\mathcal{L} is in the form of wB\mathbf{w}\mathbf{B} where w{1,0,1}3\mathbf{w} \in \{-1, 0, 1\}^3.

Here is the pseudocode:

def solve(a, b, c):
    assert (abs(a) == 1) + (abs(b) == 1) + (abs(c) == 1) < 2

	f = lambda v: sum(v * v * [a, b, c])

    def get_B():
        λ_a = sqrtmod(-b * c, a)
        λ_b = sqrtmod(-a * b, c)
        λ_c = sqrtmod(-a * c, b)

        u, a_2 = invmod(b, c), invmod(a, b * c)
        v, b_2 = (1 - u * b) // c, (1 - a * a_2) // (b * c)

        α = b_2 * c * λ_a % a
        β = u * a_2 * b * λ_b % (b * c)
        γ = v * a_2 * c * λ_c % (b * c)

        # the basis of {(x, y, z): a x^2 + b y^2 + c z^2 = 0 mod abc}
        B1 = np.array([
            [b * c, 0, 0],
            [a * β, a, 0],
            [α * β + γ, α, 1],
        ])

        # the basis of {(x, y, z): a x^2 + b y^2 + c z^2 = 0 mod 2abc}
        eps = lambda v: f(v) // (a * b * c) % 2
        r = next(r for r in B1 if eps(r) != 0)
        return np.array([v if eps(v) == 0 else v + r for v in B])

    B = LLL(get_B(), dot=lambda u, v: sum(u * v * np.abs([a, b, c])))

    for w in product([-1, 0, 1], [-1, 0, 1], [-1, 0, 1]):
        if w != (0, 0, 0) and f(v := w @ B) == 0:
            return v

    # unreachable!

Check out [CR03, Section 2.6] and [CM97] for more details. [CM97] uses a different lattice and [CR03] improves upon it.

A single non-trivial rational solution to xQx=0\mathbf{x}^\top Q \mathbf{x} = 0 with QZd×dQ \in \ZZ^{d \times d}

Assume detQ0\det Q \neq 0 and QQ is symmetric. We here only list a few references below:

  • [Simon05] removes the need of diagonalization, which often bloats the size of coefficients. It gives an algorithm where d=3d = 3. It works by reducing QQ repeatedly until detQ=±1\det Q = \pm 1, and then solving the case where detQ=±1\det Q = \pm 1.
  • [Simon06] extends [Simon05] and gives an algorithm for d>3d > 3. However, I have not read the paper.

All non-trivial solutions from a single non-trivial solution

Given a single non-trivial solution x^Qd\mathbf{\hat x} \in \QQ^d to a quadratic equation F(x)=0F(\mathbf{x}) = 0, we parametrize xQd\mathbf{x} \in \QQ^d by x=x^tu\mathbf{x} = \mathbf{\hat x} - t\mathbf{u}, where uZd,tQ\mathbf{u} \in \ZZ^d, t \in \QQ and u\mathbf{u} is primitive (i.e., gcd(u)=1\gcd(\mathbf{u}) = 1).

Since F(x^)=0F(\mathbf{\hat x}) = 0, we solve tt in F(x^tu)=0F(\mathbf{\hat x} - t\mathbf{u}) = 0 using the second-order Taylor expansion of FF at x^\mathbf{\hat x}:

F(x^)tuF(x^)+t22u2F(x^)u=0t=uF(x^)12u2F(x^)u.F(\mathbf{\hat x}) - t \mathbf{u}^\T \nabla F(\mathbf{\hat x}) + \frac{t^2}{2} \mathbf{u}^\T \nabla^2 F(\mathbf{\hat x}) \mathbf{u} = 0 \quad \Longrightarrow \quad t = \frac{\mathbf{u}^\T \nabla F(\mathbf{\hat x})}{\frac{1}{2} \mathbf{u}^\T \nabla^2 F(\mathbf{\hat x}) \mathbf{u}}.

Since u\mathbf{u} and u-\mathbf{u} produce the same solution, we require that u\mathbf{u} is lexicographically larger than u-\mathbf{u}. Note this parametrization also generates the initial solution x^\mathbf{\hat x} when uF(x^)=0\mathbf{u}^\top \nabla F(\mathbf{\hat x}) = 0, i.e., u\mathbf{u} is tangent to the curve.

Connection to integer solutions of a homogenous ternary quadratic equation

A homogeneous polynomial F(x)F(\mathbf{x}) of degree dd satisfies F(λx)=λdF(x)F(\lambda \mathbf{x}) = \lambda^d F(\mathbf{x}), so every rational solution can be normalized to a primitive integer solution. Furthermore, any primitive integer solution x\mathbf{x} with x10\mathbf{x}_1 \neq 0 is equivalent to a rational solution of a heterogeneous equation F([1,x])=0F([1, \mathbf{x'}]) = 0 for xQd1\mathbf{x}' \in \QQ^{d-1}.

When the goal is to enumerate integral solutions, a good strategy is to first find a non-trivial rational solution to F(x)=0F(\mathbf{x}) = 0 and then apply the parametrization trick to a heterogeneous equation to avoid duplicates.

Consider the following example. We consider the homogenous equation ax2+bxy+cy2=dz2ax^2+bxy+cy^2=d z^2 in Z3\ZZ^3 with a non-trivial primitive solution (x0,y0,z0)(x_0, y_0, z_0) where z00z_0 \neq 0, which is equivalent rational solutions to the heterogenous equation ax2+bxy+cy2d=0a x^2 + b xy + c y^2 - d = 0 with a non-trivial solution x^=(x0/z0,y0/z0)\mathbf{\hat x} = (x_0 / z_0, y_0 / z_0). Write u=[u,v]\mathbf{u} = [u, v] and apply the results above:

uF(x^)=z01(u(2ax0+by0)+v(bx0+2cy0)),12u2F(x^)u=au2+buv+cv2.\begin{aligned} \mathbf{u}^\top \nabla F(\mathbf{\hat x}) & = z_0^{-1}(u (2ax_0 + b y_0) + v(b x_0 + 2c y_0) ), \\ \frac{1}{2} \mathbf{u}^\T \nabla^2 F(\mathbf{\hat x}) \mathbf{u} & = a u^2 + b uv + c v^2. \\ \end{aligned}

After simplification, we have

(xyz)=Ar,A:=(ax0+by02cy0cx0ay02ax0cy0+bx0az0bz0cz0),r:=(u2uvv2),\begin{pmatrix} x \\ y \\ z \end{pmatrix} = A \mathbf{r}, \quad A := \begin{pmatrix} ax_0 + by_0 & 2cy_0 & -cx_0 \\ -ay_0 & 2ax_0 & c y_0 + b x_0 \\ a z_0 & b z_0 & c z_0 \\ \end{pmatrix}, \quad \mathbf{r} := \begin{pmatrix} u^2 \\ uv \\ v^2 \end{pmatrix},

which are exactly the Equations (8-10) in Wolfram MathWorld. The equations might not generate primitive solutions directly, but they do produce all primitive tuples after normalization.

To use the equations to generate bounded primitive solutions, the common factor g:=gcd(Ar)g := \gcd(A \mathbf{r}) needs to be determined. It can be shown that gg divides detA\det A by writing r\mathbf{r} using ArA \mathbf{r}:

r=A1Ar=\adjAdetAAr,\mathbf{r} = A^{-1} A \mathbf{r} = \frac{\adj A}{\det A} A \mathbf{r},

and observing that if gg doesn’t divide detA\det A, gcd(r)\gcd(\mathbf{r}) can’t be 1.

In our case, detA=Δdz03\det A = -\Delta d z_0^3 where Δ=b24ac\Delta = b^2 - 4ac. The bound can be slightly improved to Δdz02\Delta d z_0^2 by finding the LCM of denominators in A1A^{-1}:

A1=1Δdz02(Δx0+cTcS2cdz0aScTbdz0aTΔy0+aS2adz0),{S:=2cy0+bx0,T:=2ax0+by0.A^{-1} = \frac{1}{\Delta d z_0^2} \begin{pmatrix} \Delta x_0 + cT & cS & -2cdz_0 \\ -aS & -cT & b d z_0 \\ aT & \Delta y_0 + a S & -2adz_0 \end{pmatrix}, \quad \begin{cases} S := 2cy_0+bx_0, \\ T := 2ax_0+by_0. \end{cases}

If b=0b = 0, the denominator can be further reduced to 2acdz022acdz_0^2 instead of 4acdz024acdz_0^2.

References

  • Rathnayake13: Rathnayake T., 2013. Solving Ternary Quadratic forms by descent method [online]
  • [Mordell]: Mordell, L.J., 1969. Diophantine Equations: Diophantine Equations (Vol. 30). Academic press.
  • CR03: Cremona, J. and Rusin, D., 2003. Efficient solution of rational conics. Mathematics of Computation, 72(243), pp.1417-1441.
  • Simon05: Simon, D., 2005. Solving quadratic equations using reduced unimodular quadratic forms. Mathematics of Computation, 74(251), pp.1531-1543.
  • Simon06: Simon, D., Quadratic equations in dimensions 4, 5 and more. Preprint, 2005 [online]
  • CM97: Cochrane, T. and Mitchell, P., 1998. Small solutions of the Legendre equation. Journal of Number Theory, 70(1), pp.62-66.
  • Aitken: Wayne Aitken, Legrendre’s Theorem, Legrange’s Descent
  • Hillgarter: Hillgarter, E., 1996. Rational points on conics. na.
  • [Holzer](Minimal Solutions of Diophantine Equations): Holzer, L., 1950. Minimal solutions of diophantine equations. Canadian Journal of Mathematics, 2, pp.238-244.
    • It’s weird that I can’t find the first name of Holzer…