Extended Lucas theorem

Mar 25, 2023 4 min read CS/Math > Number Theory

中文版本请见这里。

$\newcommand{\stirlingI}[2]{\genfrac{[}{]}{0pt}{}{#1}{#2}}$ Lucas theorem is quite well known for compute $\binom{n}{m} \pmod{p}$ :

Let $p$ be a prime, $n = u_1 p + u_2$ ， $m = v_2 p + v_2$ where $0 \leq u_2, v_2 < p$ , then we have
$\binom{n}{m} \equiv \binom{u_1}{v_1} \binom{u_2}{v_2} \pmod{p}.$

However, Lucas theorem only works for prime $p$ . For composites CRT can rescue us, but we still have to solve the hard case where $p$ is a prime power, which this post mainly discusses. The post borrows a lot from a post by prabowo (although I guess he also references min_25’s blogpost), while having its own novelties (mostly the upper bound).

Andrew Granville also has a paper on Extended Lucas theroem. However, there are too many cases in the paper and the overall time complexity is also high. I’d never write it again after I did it once.

Analysis on $n!$

Define $E(n) = \max_k \{p^k | n!\}$ as the number of trailing 0’s of the base- $p$ representation of $n!$ . The efficient algorithm for computing $E(n)$ is trivial. Now we define

(n!)_p = \prod_{i=1, p \nmid i}^n i,

that is, the product of integers in $[n]$ which are not multiples of $p$ . Now, we have

n! = p^{E(n)} \prod_{k=0} \left( \left\lfloor n/p^k \right\rfloor !\right)_p,

and our task is to compute $(n!)_p \bmod p^e$ .

Suppose $n = up + v$ where $0 \leq v < m$ . We may have two terms for $(n!)_p$ :

(n!)_p = \left( \prod_{i=0}^{u-1} \prod_{j=1}^{p-1} (ip + j) \right) \left( \prod_{i=1}^v (up + j) \right).

Note that both terms have a structure of $\prod_{j=1}^r (ip + j)$ , which can be seen as a polynomial of $p$ . What’s more interesting is that we only care about the polynomial modulo $p^e$ . More precisely, notice that

\prod_{i=0}^{k-1} (n + i) = \sum_{i=0}^k \stirlingI{k}{i} n^i,

where $\stirlingI{k}{i}$ is the Stirling number of the first kind satisfying

\stirlingI{n + 1}{k} = n \stirlingI{n}{k} + \stirlingI{n}{k-1}, \quad \stirlingI{0}{0} = 1, \stirlingI{0}{n} = \stirlingI{n}{0} = 0.

Therefore, we may represent $(n!)_p$ by

\begin{aligned} (n!)_p & = \left( \prod_{i=0}^{u-1} \prod_{j=1}^{p-1} (ip + j) \right) \left( \prod_{i=1}^v (up + j) \right) \\ & = \left( \prod_{i=0}^{u-1} \sum_{k=1}^{p} \stirlingI{p}{k} (ip)^{k-1} \right) \left( \sum_{k=1}^{v} \stirlingI{v+1}{k} (up)^{k-1} \right) \\ & \equiv \left( \prod_{i=0}^{u-1} \sum_{k=1}^{e} \stirlingI{p}{k} (ip)^{k-1} \right) \left( \sum_{k=1}^{e} \stirlingI{v+1}{k} (up)^{k-1} \right) \pmod{p^e} \\ & = \stirlingI{p}{1}^u \left( \prod_{i=0}^{u-1} \sum_{k=1}^{e} \frac{\stirlingI{p}{k}}{\stirlingI{p}{1}} (ip)^{k-1} \right) \left( \sum_{k=1}^{e} \stirlingI{v+1}{k} (up)^{k-1} \right). \end{aligned}

We further define

f_{p, e}(u) := \prod_{i=0}^{u-1} \sum_{k=1}^{e} \frac{\stirlingI{p}{k}}{\stirlingI{p}{1}} (ip)^{k-1} \bmod p^e = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{e-1} \frac{\stirlingI{p}{k+1}}{\stirlingI{p}{1}} (ip)^k \right) \bmod{p^e}.

At this point, we have three terms for $(n!)_p$ : $\stirlingI{p}{1}^u$ ， $f_{p, e}(u)$ and $\sum_{k=1}^{e} \stirlingI{v+1}{k} (up)^{k-1}$ . The first term and the last term are straightforward to compute, so we’ll focus on the middle term ( $f_{p, e}(u)$ ).

Here comes a useful observation: $f_{p, e}(u)$ is polynomial in $u$ .

Prove $f_{p, e}(u)$ is polynomial in $u$

We here prove a more general statement:

Lemma: Suppose for any $1 \leq k < e$ , $w_k(u)$ is a polynomial in $u$ , and
$g(x, u) = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{e-1} w_k(u) x^k \right) \mod x^e,$
then $g$ is a polynomial in $u$ with a bounded degree
$\deg_u g(x, u) \leq \lambda (e-1), \quad \lambda := \max_{1 \leq k < e} \frac{\deg w_k + 1}{k}.$

Proof: For $0 \leq k < e$ , let $h_k(u)$ be the coefficient of $x^k$ in $g(x, u)$ . We will prove that $h_k(u)$ is polynomial in $u$ with degree $\leq \lambda k$ . With the property of $h_k(u)$ ahead, we may simple conclude

g(x, u) = \sum_{k=0}^{e-1} h_k(u) x^k \quad \Longrightarrow \quad \deg_u g \leq \max_{0 \leq k < e} \deg h_k \leq \lambda (e-1).

Here we use induction on $k$ :

When $k = 0$ , $h_k(u) = 1$ .
Suppose the statement holds for $0 < k < e$ , we have $h_k(u) = h_{k-1}(u-1) + \sum_{t=0}^{k-1} h_t(u-1) w_{k-t}(u - 1) = h_{k-1}(0) + \sum_{t=0}^{k-1} \sum_{u' \geq 0}^{u-1} h_t(u') w_{k-t}(u').$ As $h_t(u') w_{k-t}(u')$ is polynomial in $u'$ with degree $\leq \deg h_t + \deg w_{k-t}$ , $h_k$ is also polynomial in $u$ with degree $\deg h_k \leq \max_{0 \leq t < k} \left( \deg h_t + \deg w_{k-t} + 1 \right),$ where the additional constant of 1 comes from summation over $u'$ . The definition of $\lambda$ and induction give $\deg h_k \leq \max_{0 \leq t < k} \left( \lambda t + \lambda (k - t) \right) = \lambda k.$ Indeed the recursion on $\deg h_k$ is just a Knapsack problem, so a trivial upper bound is just the item with largest density.

QED.

Bound the degree of $f_{p, e}(u)$

We still have one question to answer: given that $f_{p, e}(u)$ is a polynomial, how do we bound its degree? We’d like to use the lemma above, but the problem, how do we define $w_k(u)$ ?

If we use the original definition, we may have

f_{p, e}(u) = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{e-1}\frac{\stirlingI{p}{k+1}}{\stirlingI{p}{1}} (ip)^k \right) \mod p^e,

that is, $1 \leq k < e$ , $w_k(i) = \frac{\stirlingI{p}{k+1}}{\stirlingI{p}{1}} i^k$ . This definition leads to

\lambda = \max_{1 \leq k < e} \frac{\deg w_k + 1}{k} = \max_{1 \leq k < e} \frac{k + 1}{k} = 2, \quad \Longrightarrow \quad \deg_u f_{p, e}(u) \leq 2 (e - 1).

However, prabowo hints that: when $p$ is prime and $1 < k < p$ , $p$ divides $\stirlingI{p}{k}$ .

We first consider the case of $e < p$ , all relevant $\stirlingI{p}{k+1}$ is a multiple of $p$ , and we can extract the common factor:

\begin{aligned} f_{p, e} & = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{e-1}\frac{\stirlingI{p}{k+1}}{\stirlingI{p}{1}} (ip)^k \right) \mod p^e \\ & = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{e-1}\frac{\stirlingI{p}{k+1} / p}{\stirlingI{p}{1}} i^k p^{k+1} \right) \mod p^e. \end{aligned}

Hence, we use $w_k(i) = \frac{\stirlingI{p}{k} / p}{\stirlingI{p}{1}} i^{k-1}$ for $k > 1$ and $w_1(i) = 0$ . The $\lambda$ here can be made smaller:

\lambda = \max_{1 \leq k < e} \frac{\deg w_k + 1}{k} = 1, \quad \Longrightarrow \quad \deg_u f_{p, e}(u) \leq e - 1.

(It’s not rigorous here as we can’t divide a number by $p$ under $\bmod p^e$ , although we may do it in $\mathbb{R}$ . The division here is only used for proof and will not be use in code.)

The other case is $e \geq p$ , and we have one more Stirling number to discuss: $\stirlingI{p}{p} = 1$ . We may re-organize the equations:

\begin{aligned} f_{p, e} & = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{e-1}\frac{\stirlingI{p}{k+1}}{\stirlingI{p}{1}} (ip)^k \right) \mod p^e \\ & = \prod_{i=0}^{u-1} \left( 1 + \sum_{k=1}^{p-2}\frac{\stirlingI{p}{k + 1} / p}{\stirlingI{p}{1}} i^k p^{k+1} + \frac{1}{\stirlingI{p}{1}} (ip)^{p-1} \right) \mod p^e. \end{aligned}

Hence we define for $1 < k < p$ ， $w_k(i) = \frac{\stirlingI{p}{k} / p}{\stirlingI{p}{1}} i^k$ ，and $w_{p-1} = \frac{\stirlingI{p}{p-1}/p}{\stirlingI{p}{1}} {i^{p-2}}+ \frac{1}{\stirlingI{p}{1}} i^{p-1}$ . In this case, $\lambda$ is

\lambda = \max_{1 \leq k < e} \frac{\deg w_k + 1}{k} = \frac{p}{p - 1}, \quad \Longrightarrow \quad \deg_u f_{p, e}(u) \leq \frac{p}{p - 1} (e - 1).

As $p \geq 2$ , this bound is always better than $2(e-1)$ .

Overall, we have

\deg_u f_{p, e}(u) \leq d^* := \left\lfloor \frac{p (e - 1)}{p-1} \right \rfloor.

How nicely it captures both cases!

The divisibility of the Stirling numbers

The proof above makes use of a special property of Stirling numbers of the first kind, which we’ll give a short proof here:

For prime $p$ and $1 < k < p$ , $p$ divides $\stirlingI{p}{k}$ .

Proof: Consider a polynomial $f(x) = \prod\limits_{i=1}^{p-1} (x+i) \in \mathbb{F}_p[x]$ . The Stirling numbers of the first kind satisfy

f(x) = \prod_{i=1}^{p-1} (x + i) = \sum_{i=1}^p \stirlingI{p}{i} x^{i-1}.

Note that $f(0) = (p-1)! \equiv -1 \pmod{p}$ , as well as $f(x) = 0$ for any $x \neq 0$ . On the other hand, the polynomial $g(x) = x^{p-1} - 1$ has identical values as $f$ in these $p$ points, and both $f$ and $g$ have degree of $p - 1$ , thus $f = g$ . Now we compare the coefficients of $f$ and $g$ :

\stirlingI{p}{i} \equiv \begin{cases} 1, & i = p, \\ 0, & 1 < i < p, \\ -1, & i = 1.\end{cases} \pmod{p}

Algorithm for computing $(n!)_p \bmod p^e$

For $0 \leq v \leq p, 0 \leq k \leq e$ , preprocess $\stirlingI{v}{k}$ with overall complexity $O(pe)$ ;
Preprocess $f_{p, e}$ . As a polynomial with degree of $d^*$ requires $d^* + 1$ points to determine, we may simply brute-force $f_{p, e}(u)$ for $0 \leq u \leq d^*$ and use Lagrangian interpolation. The complexity is $O(e^2)$ .
For each query, we simply use the following equation: $(n!)_p \equiv \stirlingI{p}{1}^u f_{p, e}(u) \left( \sum_{k=1}^{e} \stirlingI{v+1}{k} (up)^{k-1} \right) \pmod{p^e},$ with time complexity of $O(e)$ per query.

If $d^* \geq p$ , we may be careful in Lagrange interpolation. The following equation can be helpful,

f_{p, e}(u) = \sum_{i=0}^{d^*} f_{p, e}(i) \prod_{j \neq i} \frac{u - j}{i - j} = \sum_{i=0}^{d^*} f_{p, e}(i) \binom{u}{j}\binom{u-j-1}{i-j} (-1)^{i-j},

which means that it must be an integer. However, $\prod_{j \neq i} \frac{u - j}{i - j}$ might be a little tricky as $i - j$ might not be invertible under $\bmod{p^e}$ … Anyway we have to represent all numbers in the form of $r \equiv \frac{n}{p^a} \pmod {p^e}$ and do multiplication/division later…

For reference code, prabowo provides a C++ version while I can share my own Python version. Unfortunately, min_25’s Python version was lost.

Hexo

When I wrote the blog post for the first time, I use commands like \newcommand{\aaa}[1]{#1} while Hexo recognize # as SwigTags, which leads to a rendering issue. The solution is quite simple: just add

disableNunjucks: true

to the front-matter. Please refer to this issue (there is a typo in disableNunjucks in the comment though) and this issue.

Analysis on n!n!n!

Prove fp,e(u)f_{p, e}(u)fp,e​(u) is polynomial in uuu

Bound the degree of fp,e(u)f_{p, e}(u)fp,e​(u)