BDD-ALGORITHM

summary of the BDD algorithm in ACL2
```Major Section:  BDD
```

The BDD algorithm in ACL2 uses a combination of manipulation of `IF` terms and unconditional rewriting. In this discussion we begin with some relevant mathematical theory. This is followed by a description of how ACL2 does BDDs, including concluding discussions of soundness, completeness, and efficiency.

We recommend that you read the other documentation about BDDs in ACL2 before reading the rather technical material that follows. See BDD.

Here is an outline of our presentation. Readers who want a user perspective, without undue mathematical theory, may wish to skip to Part (B), referring to Part (A) only on occasion if necessary.

(A) Mathematical Considerations

(A1) BDD term order

(A2) BDD-constructors and BDD terms, and their connection with aborting the BDD algorithm

(A3) Canonical BDD terms

(A4) A theorem stating the equivalence of provable and syntactic equality for canonical BDD terms

(B) Algorithmic Considerations

(B1) BDD rules (rules used by the rewriting portion of the ACL2 BDD algorithm)

(B2) Terms ``known to be Boolean''

(B3) An ``IF-lifting'' operation used by the algorithm, as well as an iterative version of that operation

(B4) The ACL2 BDD algorithm

(B5) Soundness and Completeness of the ACL2 BDD algorithm

(B6) Efficiency considerations

(A) Mathematical Considerations

(A1) BDD term order

Our BDD algorithm creates a total ``BDD term order'' on ACL2 terms, on the fly. We use this order in our discussions below of IF-lifting and of canonical BDD terms, and in the algorithm's use of commutativity. The particular order is unimportant, except that we guarantee (for purposes of commutative functions) that constants are smaller in this order than non-constants.

(A2) BDD-constructors (assumed to be `'(cons)`) and BDD terms

We take as given a list of function symbols that we call the ``BDD-constructors.'' By default, the only BDD-constructor is `cons`, although it is legal to specify any list of function symbols as the BDD-constructors, either by using the acl2-defaults-table (see acl2-defaults-table) or by supplying a `:BDD-CONSTRUCTORS` hint (see hints). Warning: this capability is largely untested and may produce undesirable results. Henceforth, except when explicitly stated to the contrary, we assume that BDD-constructors is `'(cons)`.

Roughly speaking, a BDD term is the sort of term produced by our BDD algorithm, namely a tree with all `cons` nodes lying above all non-`CONS` nodes. More formally, a term is said to be a BDD term if it contains no subterm of either of the following forms, where `f` is not `CONS`.

```(f ... (CONS ...) ...)

(f ... 'x ...)  ; where (consp x) = t
```

We will see that whenever the BDD algorithm attempts to create a term that is not a BDD term, it aborts instead. Thus, whenever the algorithm completes without aborting, it creates a BDD term.

(A3) Canonical BDD terms

We can strengthen the notion of ``BDD term'' to a notion of ``canonical BDD term'' by imposing the following additional requirements, for every subterm of the form `(IF x y z)`:

(a) `x` is a variable, and it precedes (in the BDD term order) every variable occurring in `y` or `z`;

(b) `y` and `z` are syntactically distinct; and,

(c) it is not the case that `y` is `t` and `z` is `nil`.

We claim that it follows easily from our description of the BDD algorithm that every term it creates is a canonical BDD term, assuming that the variables occurring in all such terms are treated by the algorithm as being Boolean (see (B2) below) and that the terms contain no function symbols other than `IF` and `CONS`. Thus, under those assumptions the following theorem shows that the BDD algorithm never creates distinct terms that are provably equal, a property that is useful for completeness and efficiency (as we explain in (B5) and (B6) below).

(A4) Provably equal canonical BDD terms are identical

We believe that the following theorem and proof are routine extensions of a standard result and proof to terms that allow calls of `CONS`.

Theorem. Suppose that `t1` and `t2` are canonical BDD terms that contain no function symbols other than `IF` and `CONS`. Also suppose that `(EQUAL t1 t2)` is a theorem. Then `t1` and `t2` are syntactically identical.

Proof of theorem: By induction on the total number of symbols occurring in these two terms. First suppose that at least one term is a variable; without loss of generality let it be `t1`. We must prove that `t2` is syntactically the same as `t1`. Now it is clearly consistent that `(EQUAL t1 t2)` is false if `t2` is a call of `CONS` (to see this, simply let `t1` be an value that is not a `CONSP`). Similarly, `t2` cannot be a constant or a variable other than `t1`. The remaining possibility to rule out is that `t2` is of the form `(IF t3 t4 t5)`, since by assumption its function symbol must be `IF` or `CONS` and we have already handled the latter case. Since `t2` is canonical, we know that `t3` is a variable. Since `(EQUAL t1 t2)` is provable, i.e.,

```(EQUAL t1 (if t3 t4 t5))
```
is provable, it follows that we may substitute either `t` or `nil` for `t3` into this equality to obtain two new provable equalities. First, suppose that `t1` and `t3` are distinct variables. Then these substitutions show that `t1` is provably equal to both `t4` and `t5` (since `t3` does not occur in `t4` or `t5` by property (a) above, as `t2` is canonical), and hence `t4` and `t5` are provably equal to each other, which implies by the inductive hypothesis that they are the same term -- and this contradicts the assumption that `t2` is canonical (property (b)). Therefore `t1` and `t3` are the same variable, i.e., the equality displayed above is actually `(EQUAL t1 (if t1 t4 t5))`. Substituting `t` and then `nil` for `t1` into this provable equality lets us prove `(EQUAL t t4)` and `(EQUAL nil t5)`, which by the inductive hypothesis implies that `t4` is (syntactically) the term `t` and `t5` is `nil`. That is, `t2` is `(IF t1 t nil)`, which contradicts the assumption that `t2` is canonical (property (c)).

Next, suppose that at least one term is a call of `IF`. Our first observation is that the other term is also a call of `IF`. For if the other is a call of `CONS`, then they cannot be provably equal, because the former has no function symbols other than `IF` and hence is Boolean when all its variables are assigned Boolean values. Also, if the other is a constant, then both branches of the `IF` term are provably equal to that constant and hence these branches are syntactically identical by the inductive hypothesis, contradicting property (b). Hence, we may assume for this case that both terms are calls of `IF`; let us write them as follows.

```t0:  (IF t1 t2 t3)
u0:  (IF u1 u2 u3)
```
Note that `t1` and `u1` are variables, by property (a) of canonical BDD terms. First we claim that `t1` does not strictly precede `u1` in the BDD term order. For suppose `t1` does strictly precede `u1`. Then property (a) of canonical BDD terms guarantees that `t1` does not occur in `u0`. Hence, an argument much like one used above shows that `u0` is provably equal to both `t2` (substituting `t` for `t1`) and `t3` (substituting `nil` for `t1`), and hence `t2` and `t3` are provably equal. That implies that they are identical terms, by the inductive hypothesis, which then contradicts property (b) for `t0`. Similarly, `u1` does not strictly precede `t1` in the BDD term order. Therefore, `t1` and `u1` are the same variable. By substituting `t` for this variable we see that `t2` and `u2` are provably equal, and hence they are equal by the inductive hypothesis. Similarly, by substituting `nil` for `t1` (and `u1`) we see that `t3` and `u3` are provably, hence syntactically, equal.

We have covered all cases in which at least one term is a variable or at least one term is a call of `IF`. If both terms are constants, then provable and syntactic equality are clearly equivalent. Finally, then, we may assume that one term is a call of `CONS` and the other is a constant or a call of `CONS`. The constant case is similar to the `CONS` case if the constant is a `CONSP`, so we omit it; while if the constant is not a `CONSP` then it is not provably equal to a call of `CONS`; in fact it is provably not equal!

So, we are left with a final case, in which canonical BDD terms `(CONS t1 t2)` and `(CONS u1 u2)` are provably equal, and we want to show that `t1` and `u1` are syntactically equal as are `t2` and `u2`. These conclusions are easy consequences of the inductive hypothesis, since the ACL2 axiom `CONS-EQUAL` (which you can inspect using `:``PE`) shows that equality of the given terms is equivalent to the conjunction of `(EQUAL t1 t2)` and `(EQUAL u1 u2)`. Q.E.D.

(B) Algorithmic Considerations

(B1) BDD rules

A rule of class `:``rewrite` (see rule-classes) is said to be a ``BDD rewrite rule'' if and only if it satisfies the following criteria. (1) The rule is enabled. (2) Its equivalence relation is `equal`. (3) It has no hypotheses. (4) Its `:``loop-stopper` field is `nil`, i.e., it is not a permutative rule. (5) All variables occurring in the rule occur in its left-hand side (i.e., there are no ``free variables''; see rewrite). A rule of class `:``definition` (see rule-classes) is said to be a ``BDD definition rule'' if it satisfies all the criteria above (except (4), which does not apply), and moreover the top function symbol of the left-hand side was not recursively (or mutually recursively) defined. Technical point: Note that this additional criterion is independent of whether or not the indicated function symbol actually occurs in the right-hand side of the rule.

Both BDD rewrite rules and BDD definition rules are said to be ``BDD rules.''

(B2) Terms ''known to be Boolean''

We apply the BDD algorithm in the context of a top-level goal to prove, namely, the goal at which the `:BDD` hint is attached. As we run the BDD algorithm, we allow ourselves to say that a set of terms is ``known to be Boolean'' if we can verify that the goal is provable from the assumption that at least one of the terms is not Boolean. Equivalently, we allow ourselves to say that a set of terms is ``known to be Boolean'' if we can verify that the original goal is provably equivalent to the assertion that if all terms in the set are Boolean, then the goal holds. The notion ``known to be Boolean'' is conservative in the sense that there are generally sets of terms for which the above equivalent criteria hold and yet the sets of terms are not noted as as being ``known to be Boolean.'' However, ACL2 uses a number of tricks, including type-set reasoning and analysis of the structure of the top-level goal, to attempt to establish that a sufficiently inclusive set of terms is known to be Boolean.

From a practical standpoint, the algorithm determines a set of terms known to be Boolean; we allow ourselves to say that each term in this set is ``known to be Boolean.'' The algorithm assumes that these terms are indeed Boolean, and can make use of that assumption. For example, if `t1` is known to be Boolean then the algorithm simplifies `(IF t1 t nil)` to `t1`; see (iv) in the discussion immediately below.

(B3) IF-lifting and the IF-lifting-for-IF loop

Suppose that one has a term of the form `(f ... (IF test x y) ...)`, where `f` is a function symbol other than `CONS`. Then we say that ``IF-lifting'' `test` ``from'' this term produces the following term, which is provably equal to the given term.

```(if test
(f ... x ...)  ; resulting true branch
(f ... y ...)) ; resulting false branch
```
Here, we replace each argument of `f` of the form `(IF test .. ..)`, for the same `test`, in the same way. In this case we say that ``IF-lifting applies to'' the given term, ``yielding the test'' `test` and with the ``resulting two branches'' displayed above. Whenever we apply IF-lifting, we do so for the available `test` that is least in the BDD term order (see (A1) above).

We consider arguments `v` of `f` that are ``known to be Boolean'' (see above) to be replaced by `(IF v t nil)` for the purposes of IF-lifting, i.e., before IF-lifting is applied.

There is one special case, however, for IF-lifting. Suppose that the given term is of the form `(IF v y z)` where `v` is a variable and is the test to be lifted out (i.e., it is least in the BDD term order among the potential tests). Moroever, suppose that neither `y` nor `z` is of the form `(IF v W1 W2)` for that same `v`. Then IF-lifting does not apply to the given term.

We may now describe the IF-lifting-for-IF loop, which applies to terms of the form `(IF test tbr fbr)` where the algorithm has already produced `test`, `tbr`, and `fbr`. First, if `test` is `nil` then we return `fbr`, while if `test` is a non-`nil` constant or a call of `CONS` then we return `tbr`. Otherwise, we see if IF-lifting applies. If IF-lifting does not apply, then we return `(IF test tbr fbr)`. Otherwise, we apply IF-lifting to obtain a term of the form `(IF x y z)`, by lifting out the appropriate test. Now we recursively apply the IF-lifting-for-IF loop to the term `(IF x y z)`, unless any of the following special cases apply.

(i) If `y` and `z` are the same term, then return `y`.

(ii) Otherwise, if `x` and `z` are the same term, then replace `z` by `nil` before recursively applying IF-lifting-for-IF.

(iii) Otherwise, if `x` and `y` are the same term and `y` is known to be Boolean, then replace `y` by `t` before recursively applying IF-lifting-for-IF.

(iv) If `z` is `nil` and either `x` and `y` are the same term or `x` is ``known to be Boolean'' and `y` is `t`, then return `x`.

NOTE: When a variable `x` is known to be Boolean, it is easy to see that the form `(IF x t nil)` is always reduced to `x` by this algorithm.

(B4) The ACL2 BDD algorithm

We are now ready to present the BDD algorithm for ACL2. It is given an ACL2 term, `x`, as well as an association list `va` that maps variables to terms, including all variables occurring in `x`. We maintain the invariant that whenever a variable is mapped by `va` to a term, that term has already been constructed by the algorithm, except: initially `va` maps every variable occurring in the top-level term to itself. The algorithm proceeds as follows. We implicitly ordain that whenever the BDD algorithm attempts to create a term that is not a BDD term (as defined above in (A2)), it aborts instead. Thus, whenever the algorithm completes without aborting, it creates a BDD term.

If `x` is a variable, return the result of looking it up in `va`.

If `x` is a constant, return `x`.

If `x` is of the form `(IF test tbr fbr)`, then first run the algorithm on `test` with the given `va` to obtain `test'`. If `test'` is `nil`, then return the result `fbr'` of running the algorithm on `fbr` with the given `va`. If `test'` is a constant other than `nil`, or is a call of `CONS`, then return the result `tbr'` of running the algorithm on `tbr` with the given `va`. If `tbr` is identical to `fbr`, return `tbr`. Otherwise, return the result of applying the IF-lifting-for-IF loop (described above) to the term `(IF test' tbr' fbr')`.

If `x` is of the form `(IF* test tbr fbr)`, then compute the result exactly as though `IF` were used rather than `IF*`, except that if `test'` is not a constant or a call of `CONS` (see paragraph above), then abort the BDD computation. Informally, the tests of `IF*` terms are expected to ``resolve.'' NOTE: This description shows how `IF*` can be used to implement conditional rewriting in the BDD algorithm.

If `x` is a `LAMBDA` expression `((LAMBDA vars body) . args)` (which often corresponds to a `LET` term; see let), then first form an alist `va'` by binding each `v` in `vars` to the result of running the algorithm on the corresponding member of `args`, with the current alist `va`. Then, return the result of the algorithm on `body` in the alist `va'`.

Otherwise, `x` is of the form `(f x1 x2 ... xn)`, where `f` is a function symbol other than `IF` or `IF*`. In that case, let `xi'` be the result of running the algorithm on `xi`, for `i` from 1 to `n`, using the given alist `va`. First there are a few special cases. If `f` is `EQUAL` then we return `t` if `x1'` is syntactically identical to `x2'` (where this test is very fast; see (B6) below); we return `x1'` if it is known to be Boolean and `x2'` is `t`; and similarly, we return `x2'` if it is known to be Boolean and `x1'` is `t`. Next, if each `xi'` is a constant and the `:``executable-counterpart` of `f` is enabled, then the result is obtained by computation. Next, if `f` is `BOOLEANP` and `x1'` is known to be Boolean, `t` is returned. Otherwise, we proceed as follows, first possibly swapping the arguments if they are out of (the BDD term) order and if `f` is known to be commutative (see below). If a BDD rewrite rule (as defined above) matches the term `(f x1'... xn')`, then the most recently stored such rule is applied. If there is no such match and `f` is a BDD-constructor, then we return `(f x1'... xn')`. Otherwise, if a BDD definition rule matches this term, then the most recently stored such rule (which will usually be the original definition for most users) is applied. If none of the above applies and neither does IF-lifting, then we return `(f x1'... xn')`. Otherwise we apply IF-lifting to `(f x1'... xn')` to obtain a term `(IF test tbr fbr)`; but we aren't done yet. Rather, we run the BDD algorithm (using the same alist) on `tbr` and `fbr` to obtain terms `tbr'` and `fbr'`, and we return `(IF test tbr' fbr')` unless `tbr'` is syntactically identical to `fbr'`, in which case we return `tbr'`.

When is it the case that, as said above, ```f` is known to be commutative''? This happens when an enabled rewrite rule is of the form `(EQUAL (f X Y) (f Y X))`. Regarding swapping the arguments in that case: recall that we may assume very little about the BDD term order, essentially only that we swap the two arguments when the second is a constant and the first is not, for example, in `(+ x 1)`. Other than that situation, one cannot expect to predict accurately when the arguments of commutative operators will be swapped.

(B5) Soundness and Completeness of the ACL2 BDD algorithm

Roughly speaking, ``soundness'' means that the BDD algorithm should give correct answers, and ``completeness'' means that it should be powerful enough to prove all true facts. Let us make the soundness claim a little more precise, and then we'll address completeness under suitable hypotheses.

Claim (Soundness). If the ACL2 BDD algorithm runs to completion on an input term `t0`, then it produces a result that is provably equal to `t0`.

We leave the proof of this claim to the reader. The basic idea is simply to check that each step of the algorithm preserves the meaning of the term under the bindings in the given alist.

Let us start our discussion of completeness by recalling the theorem proved above in (A4).

Theorem. Suppose that `t1` and `t2` are canonical BDD terms that contain no function symbols other than `IF` and `CONS`. Also suppose that `(EQUAL t1 t2)` is a theorem. Then `t1` and `t2` are syntactically identical.

Below we show how this theorem implies the following completeness property of the ACL2 BDD algorithm. We continue to assume that `CONS` is the only BDD-constructor.

Claim (Completeness). Suppose that `t1` and `t2` are provably equal terms, under the assumption that all their variables are known to be Boolean. Assume further that under this same assumption, top-level runs of the ACL2 BDD algorithm on these terms return terms that contain only the function symbols `IF` and `CONS`. Then the algorithm returns the same term for both `t1` and `t2`, and the algorithm reduces `(EQUAL t1 t2)` to `t`.

Why is this claim true? First, notice that the second part of the conclusion follows immediately from the first, by definition of the algorithm. Next, notice that the terms `u1` and `u2` obtained by running the algorithm on `t1` and `t2`, respectively, are provably equal to `t1` and `t2`, respectively, by the Soundness Claim. It follows that `u1` and `u2` are provably equal to each other. Since these terms contain no function symbols other than `IF` or `CONS`, by hypothesis, the Claim now follows from the Theorem above together with the following lemma.

Lemma. Suppose that the result of running the ACL2 BDD algorithm on a top-level term `t0` is a term `u0` that contains only the function symbols `IF` and `CONS`, where all variables of `t0` are known to be Boolean. Then `u0` is a canonical BDD term.

Proof: left to the reader. Simply follow the definition of the algorithm, with a separate argument for the IF-lifting-for-IF loop.

Finally, let us remark on the assumptions of the Completeness Claim above. The assumption that all variables are known to be Boolean is often true; in fact, the system uses the forward-chaining rule `boolean-listp-forward` (you can see it using `:``pe`) to try to establish this assumption, if your theorem has a form such as the following.

```(let ((x (list x0 x1 ...))
(y (list y0 y1 ...)))
(implies (and (boolean-listp x)
(boolean-listp y))
...))
```
Moreover, the `:BDD` hint can be used to force the prover to abort if it cannot check that the indicated variables are known to be Boolean; see hints.

Finally, consider the effect in practice of the assumption that the terms resulting from application of the algorithm contain calls of `IF` and `CONS` only. Typical use of BDDs in ACL2 takes place in a theory (see theories) in which all relevant non-recursive function symbols are enabled and all recursive function symbols possess enabled BDD rewrite rules that tell them how open up. For example, such a rule may say how to expand on a given function call's argument that has the form `(CONS a x)`, while another may say how to expand when that argument is `nil`). (See for example the rules `append-cons` and `append-nil` in the documentation for `IF*`.) We leave it to future work to formulate a theorem that guarantees that the BDD algorithm produces terms containing calls only of `IF` and `CONS` assuming a suitably ``complete'' collection of rewrite rules.

(B6) Efficiency considerations

Following Bryant's algorithm, we use a graph representation of terms created by the BDD algorithm's computation. This representation enjoys some important properties.

(Time efficiency) The test for syntactic equality of BDD terms is very fast.

(Space efficiency) Equal BDD data structures are stored identically in memory.

Implementation note. The representation actually uses a sort of hash table for BDD terms that is implemented as an ACL2 1-dimensional array. See arrays. In addition, we use a second such hash table to avoid recomputing the result of applying a function symbol to the result of running the algorithm on its arguments. We believe that these uses of hash tables are standard. They are also discussed in Moore's paper on BDDs; see bdd for the reference.