Symmetry-Based Reasoning

The Role of Symmetry in Motivated Proofs

Posted by Matei Mandache on February 17, 2023 · 24 mins read

Introduction

Symmetry is a concept that appears time and time again throughout mathematics. In this blog post, I make the case that symmetry is useful in automated theorem proving, and that symmetry-based reasoning may need to be incorporated into the definition of a motivated proof. For the purposes of this document, we define a symmetry as follows:

Definition A Symmetry is a pair $(S_1, S_2)$ of subsets of current or previous problem states, together with an assignment of the variables appearing in $S_1$ to expressions involving the variables in $S_2$ such that the assignment transforms $S_1$ into $S_2$.

As always, it is useful to look at examples to get a better idea of what is meant.

Example 1 Suppose we have arbitrary values $x$ and $y$ and our problem state is as follows:

$$\begin{aligned} & x > y \vee y > x \\ \hline & \exists z, (x > z > y) \vee (y > z > x) \end{aligned}$$

Then there is a symmetry where $S_1$ and $S_2$ are the whole problem state, and the assignment of variables maps $x$ to $y$ and $y$ to $x$. As we will see, this symmetry can lead to motivating "without loss of generality" statements.

Example 2 Suppose we are given functions $f, g : \mathbb{N} \to \mathbb{N}$ and positive integers $m, n$ with the following problem state:

$$\begin{aligned} & f(m) \neq f(n) \\ & f(m+1) = f(n+1) \\ & \forall k \in \mathbb{N}, f(k+1) = g(f(k)) \\ \hline & \neg \text{Surjective}(f) \end{aligned}$$

At some point later we might have the problem state:

$$\begin{aligned} & f(m) \neq f(n) \\ & f(m+1) = f(n+1) \\ & f(m+2) = f(n+2) \\ & \forall k \in \mathbb{N}, f(k+1) = g(f(k)) \\ \hline & \neg \text{Surjective}(f) \end{aligned}$$

Then we can take $S_1$ to be the assumptions $f(m+1) = f(n+1)$ and $\forall k \in \mathbb{N}, f(k+1) = g(f(k))$ from the first problem state, and $S_2$ to be the assumptions $f(m+2) = f(n+2)$ and $\forall k \in \mathbb{N}, f(k+1) = g(f(k))$ in the second problem state. For our mapping, we send $n$ to $n+1$ and $m$ to $m+1$, and leave $f$ and $g$ fixed. As we will see, this kind of symmetry can lead to motivating proofs by induction.

There are three areas I've identified where symmetries come in handy:

  • Arguing "without loss of generality"
  • Generalising a lemma
  • Discovering proofs by induction.

The rest of this document is dedicated to exploring these three areas, as well as giving some details on how symmetries may be found algorithmically.

Without loss of generality

The idea is that whenever the prover can split into cases, for example by splitting a disjunctive hypothesis or a conjunctive target, the prover will check if there is a symmetry that means that any proof of one case can be transformed into a proof of the other case. We call such a symmetry an Implication Symmetry. Here's a definition:

Definition An Implication Symmetry is a Symmetry where we take $S_1$ to be one subgoal (including hypotheses) and $S_2$ to be another subgoal (including hypotheses), or another subgoal with some hypotheses removed.

We note that any proof of the subgoal $S_1$ can be turned into a proof of the subgoal $S_2$ by applying the reassignment of variables given by the symmetry. In Example 1, splitting the disjunctive hypothesis produces the state

$$\begin{aligned} & x > y \\ \hline & \exists z, (x > z > y) \vee (y > z > x) \\ & \\ & y > x \\ \hline & \exists z, (x > z > y) \vee (y > z > x) \end{aligned}$$

The two subgoals are related by the symmetry that interchanges $x$ and $y$, so we may without loss of generality prove only the first subgoal.
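To make the check concrete, here is a minimal Python sketch, not taken from any existing system: propositions are encoded as nested tuples, the symmetry's assignment is a dictionary, and the canon helper (an assumption of the sketch) normalises the order of disjuncts, since the swapped target agrees with the original only up to commutativity of $\vee$.

def apply_subst(expr, subst):
    # Apply a variable-to-expression assignment throughout a syntax tree.
    if isinstance(expr, str):
        return subst.get(expr, expr)
    return tuple(apply_subst(e, subst) for e in expr)

def canon(expr):
    # Sort the arguments of disjunctions so that comparison of propositions
    # ignores the order of disjuncts.
    if isinstance(expr, str):
        return expr
    args = tuple(canon(e) for e in expr[1:])
    if expr[0] == "or":
        args = tuple(sorted(args, key=repr))
    return (expr[0],) + args

target = ("exists", "z", ("or", (">", "x", "z", "y"), (">", "y", "z", "x")))
subgoal_1 = {("hyp", (">", "x", "y")), ("tgt", target)}
subgoal_2 = {("hyp", (">", "y", "x")), ("tgt", target)}

swap = {"x": "y", "y": "x"}   # the Implication Symmetry interchanging x and y
image = {canon(apply_subst(p, swap)) for p in subgoal_1}
print(image == {canon(p) for p in subgoal_2})   # True: a proof of one subgoal transfers to the other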

Note on metavariables

Metavariables can block us from performing this reasoning. For example, consider the following state, where $x$ and $y$ are variables taking values in $\mathbb R$ and $z$ is a metavariable to be chosen later:

$$\begin{aligned} & x \neq 0 \\ \hline & xz = 1 \\ & \\ & y \neq 0 \\ \hline & yz = 1 \end{aligned}$$

We clearly can't always choose $z$ so that both goals are satisfied, even though each subgoal can be satisfied individually. It may seem that the two subgoals correspond to each other via a symmetry, but we can't say "without loss of generality" here. The solution to this problem I've found so far is to ban Implication Symmetries from containing metavariables. One way around this is to avoid creating metavariables unless necessary, by keeping the relevant quantifiers present (e.g. maintaining an existential quantifier in front of a target).

Generalising a lemma

In order for the content of this section to be as widely applicable as possible, it helps to work with a more general definition of symmetry:

Definition A Predicate-Included Symmetry is a pair $(S_1, S_2)$ of subsets of current or previous problem states, together with an assignment of the variables, constants and predicates appearing in $S_1$ to expressions involving the variables, constants and predicates in $S_2$ such that the assignment transforms $S_1$ into $S_2$.

Note this is the same as the definition of symmetry given previously, except that we include constants and predicates as well as variables. We will see in due course why this is useful. For now, we need to make a couple more definitions.

Definition Given two problem states $P_1$ and $P_2$ such that $P_1$ is a previous problem state of $P_2$, and a subset $Q_2$ of $P_2$, we define the used propositions of $Q_2$ to be the subset of $P_1$ that was used in the deduction of the statements in $Q_2$, together with any library results that were also used.

Definition For a subset $Q_2$ of a problem state $P_2$ that has a previous problem state $P_1$, we define a Generalising Symmetry to be a Predicate-Included Symmetry that sends the used propositions of $Q_2$ into a subset of $P_1$, or a subset of $P_1$ with library results added. We define the Result of the Generalising Symmetry to be the image of $Q_2$ under the assignment given by the symmetry.

The idea behind this definition is that we can re-run the proof of $Q_2$ from its used propositions after applying the symmetry, to deduce the Result. We may therefore modify $P_2$ by replacing $Q_2$ with either:

  • the Result of the symmetry, or
  • the union of $Q_2$ and the Result of the symmetry,

and either replacement is a sound move.

Now that we've covered the theory, let's look at an example. Suppose we are working in topology, and we have a state with the hypothesis $$ \forall A, \mathbf{Cl}\mathbf{Int}\mathbf{Cl} A \subseteq \mathbf{Cl} A $$

We examine our proof of this hypothesis to see which propositions from the initial problem state, and which library results, were used. In this case the used propositions are just the following library results:

  • $\forall A, \mathbf{Int} A \subseteq A$
  • $\forall A, B, A \subseteq B \Rightarrow \mathbf{Cl} A \subseteq \mathbf{Cl} B$
  • $\forall A, \mathbf{Cl} \mathbf{Cl} A = \mathbf{Cl} A$

We notice that the Predicate-Included Symmetry interchanging $\mathbf{Cl}$ with $\mathbf{Int}$ and $\subseteq$ with $\supseteq$ transforms them into the following three statements:

  • $\forall A, \mathbf{Cl} A \supseteq A$
  • $\forall A, B, A \supseteq B \Rightarrow \mathbf{Int} A \supseteq \mathbf{Int} B$
  • $\forall A, \mathbf{Int} \mathbf{Int} A = \mathbf{Int} A$

These statements are themselves all in the library (or equivalent to something in the library), and so we can apply the same symmetry to the hypothesis $\forall A, \mathbf{Cl}\mathbf{Int}\mathbf{Cl} A \subseteq \mathbf{Cl} A$ to deduce the Result of the symmetry:

$$ \forall A, \mathbf{Int} \mathbf{Cl} \mathbf{Int} A \supseteq \mathbf{Int} A. $$ We can now add this as an assumption in our state.
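To see how such a symmetry could be applied mechanically, here is a small Python sketch (my own illustration, not part of any implementation described here): the substitution is applied throughout the syntax tree, now renaming predicates and operators as well as variables. The names Cl, Int, subseteq and supseteq are just an illustrative encoding of the symbols above.

def apply_pred_subst(expr, subst):
    # Apply a Predicate-Included Symmetry by renaming predicates and operators
    # (as well as variables) throughout a syntax tree.
    if isinstance(expr, str):
        return subst.get(expr, expr)
    return tuple(apply_pred_subst(e, subst) for e in expr)

# forall A, Cl(Int(Cl(A))) ⊆ Cl(A)
hypothesis = ("forall", "A", ("subseteq", ("Cl", ("Int", ("Cl", "A"))), ("Cl", "A")))

# The symmetry interchanging Cl with Int and ⊆ with ⊇
duality = {"Cl": "Int", "Int": "Cl", "subseteq": "supseteq", "supseteq": "subseteq"}

print(apply_pred_subst(hypothesis, duality))
# ('forall', 'A', ('supseteq', ('Int', ('Cl', ('Int', 'A'))), ('Int', 'A')))
# i.e. forall A, Int(Cl(Int(A))) ⊇ Int(A), the Result of the symmetry.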

Induction

The topic of when proofs by induction can be applied is interesting and intricate. There are different approaches one can take to motivating a proof by induction. One is to say, for example "we're working with integers so there's no harm in trying to prove the statement by induction, as that gives us an extra hypothesis/hypotheses". This is not the approach I will take in this document. My idea is that an algorithm could notice certain kinds of symmetries, and these symmetries indicate when a proof by induction would be fruitful.

With that said, I feel there are two separate kinds of symmetry that can result in a proof by induction. One is associated with forwards reasoning, while the other is associated with backwards reasoning.

Forwards reasoning

The special feature of this kind of induction is that it generates a result that may not appear in the context. Rather, the result proved is a consequence of the symmetry noticed. This is both a positive and a negative: it is good for exploring the space of consequences of the assumed hypotheses, but bad as it may produce a useless result.

Definition A Forward-Induction Symmetry is a non-identity symmetry between subsets of hypotheses $S_1$ of $P_1$ and $S_2$ of $P_2$ such that $P_1$ is a previous problem state of $P_2$ and $S_1$ contains the used hypotheses (but not necessarily the used library results) of $S_2$. We also require the variables appearing in $S_1$ and $S_2$ to be the same. We call the reassignment $T$ of variables in $S_1$ to expressions involving variables in $S_2$ the Mapping of the symmetry. The non-identity condition is tantamount to requiring $T$ to be non-trivial.

The idea behind this definition is that $S_1$ implies $T(S_1)$, or in other words, it implies $S_2$. By applying $T$ we can also deduce that $T(S_1)$ implies $T(T(S_1))$, $T(T(S_1))$ implies $T(T(T(S_1)))$ and so on by induction (this is where we need the variables used to be the same). Then, since we know $S_1$, we can deduce $\forall n \in \mathbb{N}, T^n(S_1)$ by another induction. So we may replace $S_1$ by $\forall n \in \mathbb{N}, T^n(S_1)$ in $P_1$ to strengthen our problem state in a sound manner.

I will now go into an example of this technique in action. Recall our example from earlier: we take $P_1$ to be the problem state

$$\begin{aligned} & f(m) \neq f(n) \\ & f(m+1) = f(n+1) \\ & \forall k \in \mathbb{N}, f(k+1) = g(f(k)) \\ \hline & \neg \text{Surjective}(f) \end{aligned}$$

and $P_2$ to be the problem state

$$\begin{aligned} & f(m) \neq f(n) \\ & f(m+1) = f(n+1) \\ & f(m+2) = f(n+2) \\ & \forall k \in \mathbb{N}, f(k+1) = g(f(k)) \\ \hline & \neg \text{Surjective}(f) \end{aligned}$$

To find an interesting symmetry, we want $S_2$ to contain the hypothesis $f(m+2) = f(n+2)$. This has $f(m+1) = f(n+1)$ and $\forall k \in \mathbb{N}, f(k+1) = g(f(k))$ as used hypotheses, so $S_1$ must contain them (the hypothesis $f(m+2) = f(n+2)$ was obtained by applying $g$ to both sides of $f(m+1) = f(n+1)$ and then using the other hypothesis to simplify). We choose to map $f(m+1) = f(n+1)$ to $f(m+2) = f(n+2)$ under our mapping, which means the mapping $T$ has $T(m) = m+1$ and $T(n) = n+1$. The hypothesis $\forall k \in \mathbb{N}, f(k+1) = g(f(k))$ doesn't include any free variables, so it stays fixed under $T$ and so we include it in $S_2$ (note that this hypothesis appears in both problem states). This is likely to be a common pattern when constructing Forward-Induction Symmetries.

We conclude $\forall k \in \mathbb{N}, T^k(S_1)$ holds, which implies $\forall k \in \mathbb{N}, f(m+k+1) = f(n+k+1)$. Note there is a step here: we matched the iterated "+1" operation with the inductive definition of addition, and deduced that $T^k(f(m+1) = f(n+1)) = (f(m + k + 1) = f(n + k + 1))$. This was quite easy in this case, but in general finding an expression for $T^k(S_1)$ may be more difficult.
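For illustration, here is a small Python sketch of iterating the Mapping $T$ on the syntax tree of the hypothesis, again using a nested-tuple encoding with "+1" written as a successor operation (both of which are assumptions of the sketch). It enumerates $T(S_1), T^2(S_1), \dots$; recognising the closed form $f(m+k+1) = f(n+k+1)$ is the separate matching step just described.

def apply_subst(expr, subst):
    # Apply a variable-to-expression assignment throughout a syntax tree.
    if isinstance(expr, str):
        return subst.get(expr, expr)
    return tuple(apply_subst(e, subst) for e in expr)

# The hypothesis f(m+1) = f(n+1), with "+1" encoded as ("succ", ·)
hyp = ("eq", ("f", ("succ", "m")), ("f", ("succ", "n")))

# The Mapping T sends m to m+1 and n to n+1
T = {"m": ("succ", "m"), "n": ("succ", "n")}

current = hyp
for k in range(1, 4):
    current = apply_subst(current, T)
    print(k, current)
# k = 1 prints f(m+2) = f(n+2), k = 2 prints f(m+3) = f(n+3), and so on.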

Backwards reasoning

The main distinction between this type of induction and the one discussed above is not so much the kind of reasoning used, but rather the fact that the second state encountered is in some sense "smaller" than the first, and so we can prove the target by some form of induction where we assume the smaller cases.

Definition Suppose we have a subgoal (i.e. target with associated hypotheses) $S_1$ of a problem state $P_1$, and in some subsequent state $P_2$ the subgoal $S_1$ has been replaced by a collection of subgoals $S_{2, 1}, S_{2, 2}, \dots, S_{2, n}$. We also specify subsets $S^\prime_{2, i} \subseteq S_{2, i}$ such that each $S^\prime_{2, i}$ is obtained by forgetting some hypotheses. A Backward-Induction Symmetry is a collection of symmetries, one from $S_1$ to each of the $S^\prime_{2, i}$. As before, we define the Mappings $T_i$ to be the reassignment of variables that maps $S_1$ into $S^\prime_{2, i}$.

Given a Backward-Induction Symmetry, let $X$ be the set of all possible combinations of values of the variables appearing in $S_1$. Suppose we have a well-founded partial order relation $<$ on $X$. It is sufficient to prove that each mapping $T_i$ satisfies the property $$ \forall x \in X, T_i(x) < x. $$ This is because we can then prove the goal $S_1$ using well-founded induction with this order. We have already shown that the $S'_{2, i}$ are sufficient to prove $S_1$, and because of the symmetries this means we can prove the induction step.

Hence we may replace the target of $S_{2, i}$ with $T_i(x) < x$ for all $i$, in order to simplify our problem in a sound way.

Note

In the above, we work with an arbitrary well-founded order relation. A mathematician can often spot the order relation that would work for a given problem, but a little thought is required to see how a machine would do this. One approach is to have some standard well-founded order relations in the library, and try a few of them in turn. Another approach would be to treat the order relation as a metavariable.
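To give an idea of the first approach, here is a purely illustrative Python sketch (not a decision procedure): it keeps a tiny library of candidate measures and discards those that fail to decrease under the Mapping on sampled points satisfying the hypotheses. The toy mapping and the measures are made up for the example, and sampling is only a heuristic filter: the decrease condition for any surviving candidate would still have to be proved.

from itertools import product

def surviving_measures(T, hyps, measures, samples):
    # Keep the measures that strictly decrease under T on every sampled point
    # that satisfies the hypotheses.
    return [name for name, f in measures
            if all(f(*T(*x)) < f(*x) for x in samples if hyps(*x))]

# Toy mapping T(a, b) = (b, a - 1) with hypotheses a > 0 and b > 0 (made up
# purely to illustrate the filtering).
T = lambda a, b: (b, a - 1)
hyps = lambda a, b: a > 0 and b > 0
measures = [
    ("sum of variables", lambda a, b: a + b),
    ("maximum",          lambda a, b: max(a, b)),
    ("first variable",   lambda a, b: a),
]
samples = list(product(range(1, 20), repeat=2))
print(surviving_measures(T, hyps, measures, samples))   # ['sum of variables']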

Example

There is quite a lot going on here, and so it would be useful to look at a couple of examples. The first example is from the proof of the irrationality of $\sqrt{2}$. The problem state $P_1$ would be

$$\begin{aligned} & n > 0 \\ & m > 0 \\ & n^2 = 2m^2 \\ \hline & \text{False} \end{aligned}$$

where $n$ and $m$ are integers. After a bit of number theory, the state is reduced to the following:

$$\begin{aligned} & n > 0 \\ & m > 0 \\ & k > 0 \\ & n = 2k \\ & n^2 = 2m^2 \\ & m^2 = 2k^2 \\ \hline & \text{False} \end{aligned}$$

In this case we have only one $S_{2, i}$. We take the following subset of $S_{2, 1}$ to be $S'_{2, 1}$:

$$\begin{aligned} & m > 0 \\ & k > 0 \\ & m^2 = 2k^2 \\ \hline & \text{False} \end{aligned}$$

Clearly our mapping $T_1$ is defined by $T_1(n) = m$ and $T_1(m) = k$. There are several choices that work for the order. For simplicity we use the order $(a,b) < (a', b') \iff a + b < a' + b'$. It is not hard to see that this is well-founded (we know $a > 0$ and $b > 0$, so it follows from the well-foundedness of the standard order on $\mathbb{N}$). So our problem state can be transformed into the following:

$$\begin{aligned} & n > 0 \\ & m > 0 \\ & k > 0 \\ & n = 2k \\ & n^2 = 2m^2 \\ & m^2 = 2k^2 \\ \hline & m + k < n + m \end{aligned}$$

which is significantly easier than what we started with.

Another example

It is also useful to see an example where there are multiple $S_{2, i}$. Consider the following problem:

$$\begin{aligned} & n \ge 0 \\ & 0 \le m \le n \\ \hline & {n\choose m} \le 2^n \end{aligned}$$

where $m$ and $n$ are integers. We take $S_1$ as the above state. By splitting into cases according to whether $m=0$, $m=n$, or $0 < m < n$, and solving the trivial cases, our problem reduces to

$$\begin{aligned} & n > 0 \\ & 0 < m < n \\ \hline & {n\choose m} \le 2^n \end{aligned}$$

Applying the identity ${n\choose m} = {n-1\choose m-1} + {n-1\choose m}$ from the library reduces this to

$$\begin{aligned} & n > 0 \\ & 0 < m < n \\ \hline & {n-1\choose m-1} + {n-1\choose m} \le 2^n \end{aligned}$$

It is now sufficient to bound ${n-1\choose m-1}$ and ${n-1\choose m}$ separately. One could imagine an algorithm introducing metavariables for the bounds to be proved, which are then both instantiated to $2^{n-1}$ for symmetry reasons (i.e. we want to obtain states with a Backward-Induction Symmetry). After basic manipulations, we get $S_{2, 1}$ and $S_{2, 2}$ as follows:

$$\begin{aligned} & n-1 \ge 0 \\ & 0 \le m-1 \le n-1 \\ \hline & {n-1\choose m-1}\le 2^{n-1} \end{aligned}$$

and

$$\begin{aligned} & n-1 \ge 0 \\ & 0 \le m \le n-1 \\ \hline & {n-1\choose m} \le 2^{n-1}. \end{aligned}$$

We now have a Backward-Induction Symmetry with the following mappings: $T_1(n) = n-1$, $T_1(m) = m-1$, $T_2(n) = n-1$ and $T_2(m) = m$. As before, there are several well-founded orders that work; we could take, for instance, the lexicographical order. The proof then follows immediately from the fact that $n-1 < n$.

Algorithm for finding symmetries

Suppose we want to find symmetries in which a given subset of the state takes the role of $S_1$, and $S_2$ is contained in some given collection $T_2$ of goals and associated hypotheses. We find such symmetries via a recursive algorithm, written in Python-style pseudo-code below.

def find_symmetries(S_1, T_2):
    # Start the recursion with an empty partial assignment of variables.
    return find_symmetries_aux(S_1, T_2, {})

def find_symmetries_aux(S, T, partial_symmetry):
    if not S:
        # Every proposition of S has been matched: the accumulated assignment
        # is a complete symmetry.
        return [partial_symmetry]
    a = get_proposition(S)              # choose a proposition of S to match next
    symmetries = []
    for match in get_matches(a, T):     # candidate assignments sending a into T
        if compatible(match, partial_symmetry):
            symmetries.append(
                find_symmetries_aux(S - {a}, T, merge(match, partial_symmetry)))
    return concat(symmetries)           # flatten the list of lists of symmetries

The algorithm returns a list of all possible assignments of variables which constitute a symmetry from $S_1$ into a subset of $T_2$.

The get_proposition function selects a proposition in $S$. This can be either a target or a hypothesis. The algorithm doesn't specify how this should be chosen, so we could imagine different variants of the algorithm choosing propositions in different orders. Heuristics could be included here to improve performance.

The get_matches function searches through the propositions in $T$ for those that could match $a$, by checking whether the syntax tree of $a$ is contained in the syntax tree of an element of $T$, where the containment is a containment of rooted trees. For example, the expression $x + yz$ would match the expression $c + bc^5$. Once this containment is found, it induces a mapping of variables that appear in $a$ into expressions involving variables that appear in $T$. In our example, the mapping could send $x$ to $c$, $y$ to $b$ and $z$ to $c^5$. The match only counts if the resulting mapping is well-defined.
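As an illustration, here is a minimal Python sketch of this matching step, using the same nested-tuple encoding of syntax trees as in the earlier sketches. The explicit variables parameter and the restriction to matching variables only (rather than constants or predicates) are simplifications of mine.

def match_at_root(pattern, target, variables, assignment):
    # A variable leaf of the pattern may be bound to any subtree of the target,
    # provided it is not already bound to something different.
    if isinstance(pattern, str) and pattern in variables:
        if pattern in assignment and assignment[pattern] != target:
            return None                       # clashing binding: mapping not well-defined
        return {**assignment, pattern: target}
    # Non-variable leaves and operator symbols must agree exactly.
    if isinstance(pattern, str) or isinstance(target, str):
        return assignment if pattern == target else None
    if len(pattern) != len(target):
        return None
    # Recurse over the children of the two rooted trees.
    for p, t in zip(pattern, target):
        assignment = match_at_root(p, t, variables, assignment)
        if assignment is None:
            return None
    return assignment

def get_matches(a, T, variables):
    # All variable assignments sending the syntax tree of a onto an element of T.
    matches = []
    for prop in T:
        m = match_at_root(a, prop, variables, {})
        if m is not None:
            matches.append(m)
    return matches

# The example from the text: x + y*z matched against c + b*c^5.
a = ("+", "x", ("*", "y", "z"))
T = [("+", "c", ("*", "b", ("^", "c", "5")))]
print(get_matches(a, T, variables={"x", "y", "z"}))
# [{'x': 'c', 'y': 'b', 'z': ('^', 'c', '5')}]

Here the match is attempted at the root of each proposition in $T$; a fuller implementation would also handle bound variables, constants and predicate symbols.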

The compatible function checks if the mapping given by a match is compatible with a given partial symmetry. If they are compatible, we can then call the merge function to create a new partial symmetry by adding the mapping given by the match to the previous partial symmetry.
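Under the dictionary representation of assignments used in these sketches, compatible and merge could be as simple as the following (again just an illustration):

def compatible(match, partial_symmetry):
    # Two partial assignments are compatible when they agree on every shared variable.
    return all(partial_symmetry[v] == e
               for v, e in match.items() if v in partial_symmetry)

def merge(match, partial_symmetry):
    # Merging compatible assignments is just taking their union.
    return {**partial_symmetry, **match}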

Finally, the concat function turns a list of lists into a single list in the obvious way.