Formalizing Theory of Impartial Games: Sprague-Grundy Theorem

This is a continuation of this previous post covering a formalization of impartial games. In this post, I will define the game of Nim, and use it to state and prove the Sprague-Grundy theorem.

Defining Nim

I chose to define Nim in terms of the BinNat.N type rather than nat. The main reason is that N is a binary representation of nonnegative integers, and thus it is much more natural choice for working with bitwise operations (in our case, we are mainly interested in using N.lxor and related theorems). On the other hand, in some cases, I did find it more useful to work with nat, but I decided it was not worth the hassle to constantly convert to and from the different representations.

The game of Nim is defined as an instance of the impartial_game record:

Definition nums_lt : N -> list N :=
  (N.recursion [] (fun n l => n :: l)).

Definition Nim (n : N) : impartial_game.
  refine {|
          position := N;
          start := n;
          moves := fun x => nums_lt x;
          finite_game := _
        |}.
  (* Proof obligation omitted. *)
Defined.

The type of positions is N, and at each state $s$ , a player can move to any state $s' < s$ . Similar to the lemma we proved for a sum of two games, it is useful to have a lemma that describes what moves are valid:

Corollary nim_moves_spec :
  forall {m} (n n' : N),
    valid_move (Nim m`` n' n <-> n' < n.

Let $N_x$ be the Nim game that starts with $x$ stones. We will show that $N_x \oplus N_y \equiv N_{x \oplus y}$ , i.e., the sum of two Nim games with $x$ and $y$ stones is equivalent to the Nim game with $x \oplus y$ stones. This might seem a bit mysterious, so let's try to gain some intuition to why this might be the case. Although we did not discuss it much in the previous post, there are some simple results about the sum of games that follow from "mirroring" arguments.

For example, the sum of a game with itself (that is, $A \oplus A$ ) is losing and is equivalent to the zero game. This is because if a player makes a move on the left game, the other player can make the same move on the right board, and vice versa. This gives a winning strategy for the second player, because it guarantees that she will always have an available move, and thus she can never lose.

In the game of Nim, this corresponds to the fact that $x \oplus x = 0$ for any $x$ . In addition, the game $N_0$ , which is in bijection with the zero game, serves as the identity for the sum of two games. Adding a zero game to a sum does not change the sum at all, since neither player can make any moves on it.

Let's prove the following lemma, which states a sum of three Nim games with $x, y, z$ stones is losing if and only if $x \oplus y \oplus z = 0$ :

Theorem nim_sum_losing_3 :
  forall x y z,
    losing_state ((Nim x) ~+~ (Nim y) ~+~ (Nim z)) (x, y, z)
      <-> (N.lxor (N.lxor x y) z) = 0.

We prove this by well-founded induction on the moves in the game (Nim x) ~+~ (Nim y) ~+~ (Nim z). Let's focus on the backwards direction first. Suppose $x \oplus y \oplus z = 0$ . We will show that this is a losing state. This is because whatever move we make will cause the xor to be nonzero, and therefore the resulting state will be winning by the inductive hypothesis. Let's assume without loss of generality that our move is $x' < x$ . Then the new xor is

$x' \oplus y \oplus z = x' \oplus x \oplus (x \oplus y \oplus z) = x' \oplus x \neq 0.$

This requires some casework and a proof that $x \oplus y = 0 \iff x = y$ , but otherwise this direction is not too bad.

The forwards direction, however, was substantially more difficult to prove. Here, we assume that the game $N_x \oplus N_y \oplus N_z$ is losing, and we want to show that $x \oplus y \oplus z = 0$ . Equivalently, if $x \oplus y \oplus z = p > 0$ , then we wish to show that $N_x \oplus N_y \oplus N_z$ is winning. Thus we need to find a move that transitions to a losing state. By our inductive hypothesis, it suffices to find a state $(x', y', z')$ such that $x' \oplus y' \oplus z' = 0$ . Let $k$ be the index of the highest set bit in $p$ (i.e., $k = \lfloor \log_2 p \rfloor$ ), and without loss of generality assume the index of the highest bit of $x$ is also $k$ (at least one of $x, y, z$ must have the same highest bit). Then our winning move will be $x \to x \oplus p$ . We can show that

$x \oplus p < x$ , since the $k$ 'th bit of $x \oplus p$ is $0$ , while the $k$ 'th bit of $x$ is $1$ , so $x \oplus p$ is strictly smaller than $x$ . Therefore this is a valid move.
$(x \oplus p) \oplus y \oplus z = x \oplus (x \oplus y \oplus z) \oplus y \oplus z = 0$ . Therefore $(x \oplus p, y, z)$ is a losing state.

Formalizing the proof of this first bullet point turned out to be quite challenging. The main difficulty I faced was that there are not many lemmas that relate the < relation to bitwise operations. The central lemma I ended up needing was the following:

Lemma xor_lt : forall A B,
    (B <> 0) ->
    N.testbit A (N.log2 B) = true ->
    (N.lxor A B < A).

I'll spare you the particulars of the proof, but the high level approach was to use the fact that if N.log2 A is greater than N.log2 B, that means we can subtract off 2^n from both A and N.lxor A B, decreasing the index of the highest one bit. We show that subtracting 2^n does not affect any other bits by relating it to N.ldiff. We can continue until the highest one bit is N.log2 B, at which point we can use N.log2_lt_cancel to finish off the proof.

Finally, we can show that the sum of two Nim games is equivalent to a single Nim game:

Theorem nim_sum_equiv :
  forall x y, (Nim x) ~+~ (Nim y) == Nim (N.lxor x y).
  intros.
  unfold equiv.
  apply nim_sum_losing_3.
  simpl.
  XorSolver.boolgroup_rw N.lxor 0.
Qed.

The nim_sum_equiv proof uses the tactic boolgroup_rw, a custom reflection tactic I created to discharge xor equalities, similar to ring or similar tactics to normalize polynomials. Although interesting, understanding how it works is a bit out of scope for this post. As a reminder you can view the completed development here.

By induction, it follows easily that that the sum of any number of Nim games is equivalent to a single Nim game.

Sprague-Grundy Theorem

Let's consider a modification to the game on Nim. If a game has $s$ stones, in addition to removing some number of stones, one is also allowed to perform "adding moves", i.e., adding some positive number of stones in such a way that the game remains acyclic. One example would be allowing both players to add one stone, but such a move can only be performed twice in total. This description is purposely generic--we aren't concerned with the particular rules of adding stones, as we will show that this has no effect on the original game of Nim.

The reason is another mirroring argument: if one player makes an adding move and transitions to some state $s' > s$ , then the other player can simply make the move $s' \to s$ , reverting back to the original pile. Because the game must be acyclic, at some point, the first player will no longer be able to play any adding moves, and she will have to remove some number of stones like in regular Nim. So, because any adding move can immediately be undone, a winning strategy cannot depend on using any adding moves. Therefore both players' optimal strategy is not affected in this modified version of Nim.

It turns out that any impartial game can be described as a version of Nim with adding moves. Let $moves(s)$ be the set of states that we can transition to from state $s$ . Let's inductively assume that for any state $s' \in moves(s)$ , the state $s$ is equivalent to the Nim game with $g(s')$ stones. We call $g(s')$ the Grundy number or nimber associated with state $s'$ . To define the Grundy number of state $s$ , we would like to pick a number $n$ such that for any number $m < n$ , there is a state $s' \in moves(s)$ with $g(s') = m$ , and there is not state $s' \in moves(s)$ with $g(s') = n$ . This corresponds to fact that in a Nim pile with $n$ stones, we can transition to any state with strictly fewer stones, but we can't transition to a state with the same number of stones.¹

There may other states with $g(s') > n$ , but these correspond to adding moves in Nim, and as we have seen, we aren't concerned with these as they won't affect the outcome of the game. This precisely defines what $n$ should be: the mex (minimum excludant) of the set $\{ g(s') : s' \in moves(s) \}$ . For example, if our set of Grundy numbers we can transition to is $\{ 0, 1, 3, 4, 7\}$ , the mex would be $2$ . This also gives a simple procedure to compute the Grundy number of any state $s$ : first, recursively compute the Grundy of all states $s' \in moves(s)$ , then set

$g(s) = \operatorname{mex}(\{ g(s') : s' \in moves(s) \}).$

The formalization of this section is quite pleasant and straightforward, especially compared to the mess that is dealing with bitwise operations. We define the mex function by first defining its specification, then writing a procedure that loops from $0$ up to the maximum of the list and returns the first number does not appear in the list.

Section Mex.
  Variable l : list N.
  Definition is_mex (m : N) :=
    (forall n, n < m -> In n l) /\ ~ In m l.

 Definition mex_defn : { m | is_mex m }.
 Proof.
   (* Proof omitted. *)
 Defined.

 Definition mex : N := let (m, _) := mex_defn in m.

 Lemma mex_lt : (forall n, n < mex -> In n l).
   unfold mex.
   destruct mex_defn as [? [? ?]].
   auto.
 Qed.

 Lemma mex_neq :  ~ In (mex) l.
   unfold mex.
   destruct mex_defn as [? [? ?]].
   auto.
 Qed.

Next we define the grundy number of a state, which is simple to define using well-founded recursion.

Section Grundy.
  Variable g : impartial_game.

  Definition grundy : position g -> N :=
    Fix (finite_game g)
      (fun _ => N)
      (fun x F => mex (map_In (moves g x) (fun y P => F y P))).

  Lemma grundy_moves_valid :
    forall s s', valid_move g s' s -> In (grundy s') (map grundy (moves g s)).
      intros.
      unfold valid_move in H.
      apply (in_map grundy) in H.
      assumption.
  Qed.

  Lemma grundy_moves : forall s s',
      valid_move g s' s ->
      grundy s' < grundy s \/ grundy s < grundy s'.
    intros.
    destruct (N.lt_trichotomy (grundy s') (grundy s)) as [? | [? | ?]]; auto.
    apply grundy_moves_valid in H.
    rewrite H0 in H.
    rewrite grundy_unfold in H.
    apply mex_neq in H.
    contradiction.
  Qed.

  Lemma grundy_moves_lt : forall n s,
      n < grundy s ->
      exists s', valid_move g s' s /\ grundy s' = n.
    intros.
    rewrite grundy_unfold in H.
    apply mex_lt in H.
    apply in_map_iff in H as [s' [? ?]].
    exists s'; auto.
  Qed.

Finally, we prove the Sprague-Grundy theorem, which is stated as follows:

Definition grundy_game : N := grundy (start g).

Theorem sg_theorem : g == Nim (grundy_game).

The proof does not make explicit reference to a modified game of Nim with adding moves, but it uses the same underlying principles. Recall that we defined that two games are equivalent if their sum is losing. So, we show that the game g ~+~ (Nim (grundy (start g))) is losing. It suffices to show that no matter what move is made, there is a move that transitions into a losing state. Suppose our current state has Grundy number $n$ . Then for any $m < n$ , we can transition to a state with Grundy number $m$ , and we cannot transition to a state with Grundy number $n$ . There are three types of moves available in our game g ~+~ (Nim (grundy (start g))):

If we make a move on the right (Nim) game to a state $m < n$ , we can make a corresponding move on the left game, since $n$ is defined as the mex of all the Grundy numbers we could transition to.
If we make a move on the left game to a state with Grundy number $m < n$ , then we can make a corresponding move on the right game.
If we make a move on the left game to a state with Grundy number $m > n$ , then we can "undo" this move by moving back to a state with Grundy number $n$ .²

We show that the sum of two impartial games is equivalent to a single Nim game. This is a direct corollary of our two main theorems we have proved.

Theorem sg_sum : forall g h, g ~+~ h == Nim (N.lxor (grundy_game g) (grundy_game h)).
  intros.
  rewrite (sg_theorem g) at 1.
  rewrite (sg_theorem h) at 1.
  apply nim_sum_equiv.
Qed.

There is certainly more to say about impartial games and combinatorial game theory in general, but that's all I have planned for now. I hope you enjoyed!

Note that in our modified version of Nim, a player can add or remove stones, but can never leave the pile untouched. Indeed, such a move would break the previous proof, as such a move would not always be able to be undone. ↩
This wouldn't be our original state, since our game must acyclic, but it would be some other state with Grundy number $n$ . ↩