Formal Verification of Asynchronous Data-Path Circuits

by

David T. Weih

B.Sc., University of British Columbia, 1992

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE STUDIES
(Department of Computer Science)

We accept this thesis as conforming
to the required standard

The University of British Columbia
April 1996
© David T. Weih, 1996
In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science

The University of British Columbia
Vancouver, Canada

Date April 26, 1996
Abstract

Asynchronous designs are typically modelled with non-deterministic next-state relations. When a deterministic model is available, specialised verification techniques can be used to automatically verify relatively large designs. This thesis demonstrates that verification techniques developed for deterministic models can be applied to speed-independent self-timed circuits. We introduce local formulas, and show that the validity of a local formula is independent of the order in which the components of a speed-independent circuit perform their operations. Local formulas provide a natural way to specify the input/output behaviour of a circuit and are well-suited for specifying data-paths. We demonstrate the practicality of our approach by describing the verification of three designs: a FIFO, an adder, and a vector multiplier chip.
Contents

Abstract ii

Contents iii

List of Figures v

Acknowledgements vi

Dedication viii

1 Introduction 1

1.1 Formal Verification Techniques 2

1.1.1 Theorem Proving 3

1.1.2 OBDD Based Model Checking 4

1.1.3 Symbolic Trajectory Evaluation 5

1.1.4 The Voss Prover 6

1.2 Automatic Verification of Asynchronous Data-Path Circuits 7

1.3 Related Work 8

1.4 Overview of Thesis 10
# 2 Theory

2.1 The Circuit Model ........................................... 12
2.2 Definitions .................................................. 16
2.3 A Specification for the FIFO ............................... 23
2.4 Properties of Speed-Independent Circuits .................. 25
2.5 Discussion of Hypotheses ................................... 30

# 3 Local and Global Formulas

3.1 Local Formulas .............................................. 33
3.2 Global Formulas and Period Functions ....................... 35
3.3 Unspecifiable Behaviour .................................... 39

# 4 Implementation

4.1 VHDL Circuit Descriptions .................................. 40
4.2 Symbolic Trajectory Evaluation ............................. 42
4.3 Model Checking for Speed-Independence .................... 46

# 5 Vector Multiplier

# 6 Conclusions

6.1 Future Work ................................................. 54

# Bibliography

# Appendix A Proofs

# Appendix B Source Code
List of Figures

2.1 A Self-Timed FIFO .................................................. 13
2.2 Splicing Traces ...................................................... 21

3.1 4 Bit Adders ......................................................... 33
3.2 Adder Environment .................................................. 34
3.3 Pipelined Asynchronous Adder ..................................... 37
3.4 Global Formula for Pipelined Adder ............................... 38
3.5 Global Formula for Flat Adder ..................................... 38

4.1 VHDL Code for Source Component .................................. 43
4.2 VHDL Code for Sink Component ................................... 44
4.3 VHDL Code for FIFO Cell .......................................... 45

5.1 Bit Slice of the MAS Block ......................................... 50

6.1 Vector Multiplier: Abstract State Machine ........................ 55
Acknowledgements

I owe a great debt to Mark Greenstreet, who was a tremendously helpful supervisor. Some of the material in this thesis was prepared in collaboration with Mark as a paper for *IEE Proceedings - Computer and Digital Techniques*. The theoretical framework presented in Chapter 2 owes its clarity to his suggestions.

I would also like to acknowledge the ISD group for making hardware verification a lively and social pursuit. In particular, Mike Donat and Mark Aagaard let me practice explaining my ideas to them, while Andy Martin essentially took a two day holiday from his own work to help me prove a theorem.

Finally, I would like to thank my family and other friends who supported me in many ways during my thesis. My parents and grandparents encouraged me and provided some very practical support. Mike Wilson and Jen Weih gave me with the opportunity to occasionally escape from work. And last but not least, deep thanks to Andrea, without whom nothing would have been the same.

David T. Weih
To my beloved wife-to-be,

Andrea Wadman
Chapter 1

Introduction

Many techniques have been developed for formal hardware verification. There has been a need for this variety because each method has strengths and weaknesses which restrict it to a particular subclass of verification tasks. The verification method described in this thesis has capabilities not found in existing techniques. In particular, this method allows automatic verification of asynchronous data-path circuits.

Synchronous and asynchronous designs differ primarily in the timing models used to describe their behaviour. In a pure synchronous design, computation is performed by combinational logic and storage is performed by clocked latches. In this framework, the state of the latches after a clock event is a deterministic function of the values held in the latches before the clock event. Specialised verification methods, such as symbolic trajectory evaluation (STE), exploit the fact that synchronous designs have deterministic next state functions which allow a symbolic "next state" operator to be implemented efficiently. This allows large synchronous designs to be formally verified with a high degree of automation [Bea93, Dar94].

Asynchronous designs, on the other hand, do not use a clock to determine the
timing of state changing operations. Because the timing relations between circuit components cannot be completely determined, asynchronous designs are usually modelled using non-deterministic next-state relations. Symbolic model checkers have been used to verify small designs; however, the state-space explosion problem prevents fully automated verification on the same scale that has been achieved for synchronous designs.

This thesis presents a formal framework in which asynchronous circuits can be verified using synchronous models. The key observation is that for speed-independent circuits without output choice the sequence of operations performed by any component depends only on the original state of the circuit and any input values applied. The exact relative ordering of component actions doesn’t matter. This motivates specifying the desired behaviour in terms of input and output sequences for each component, which is a natural specification for data-path circuits. Because the synchronous implementation of the circuit corresponds to choosing a specific ordering for component actions, verifying the input/output sequences of the components in a synchronous model guarantees that the asynchronous circuit will produce equivalent sequences.

1.1 Formal Verification Techniques

In order to set this work in context, a brief overview of the major categories of verification techniques is provided. Much of the information contained in this section is drawn from [Gup92] and [Seg92]. Most methods can be divided into two categories, theorem proving and model checking. Because of their relevance to this work, two particular methods are included in this discussion: symbolic trajectory evaluation (STE) which is based on an efficient OBDD-based model checker; and
the Voss Prover which combines STE with a theorem proving framework.

1.1.1 Theorem Proving

The process of verification by theorem proving typically proceeds by describing both the specification and the structure of the hardware in terms of formal logic, and then formally showing a relationship (such as implication or equivalence) between them. Automated theorem provers (such as HOL [MG93]) have been developed to assist the creation of the required proof. In order to grasp the size of this task, it is important to realize that a formal proof consists of a finite sequence of statements, where each statement is either one of the fundamental axioms provided by the theorem prover or can be derived from the previous statements by a primitive rule of inference. The collection of these axioms and inference rules is typically quite small (5 axioms and 8 inference rules for HOL) requiring a very large number of steps in order to prove theorems. For example, a proof showing that the combination of a NAND gate and an inverter is equivalent to an AND gate takes a sequence of 365 statements.

Also, while these theorem provers are partially automated, in order to prove theorems of any size, human guidance is required. This interaction may be required approximately once for every 1000 proof steps. Despite the large multiplier, we can see from the number of lines required for even a trivial example that considerable human assistance is required for larger proofs. A further drawback is that providing this assistance requires that users possess a high level of training in formal logic.

However, theorem provers are extremely powerful, and there are many examples of their use in verifying both hardware and software systems [OR+95]. Theorem proving has two main advantages which help make it a powerful technique. First, being based on a general mathematical logic (e.g. higher order logic in HOL), theo-
rem provers can be used in a wide variety of different situations. Second, it is well suited to hierarchical methods. This is extremely important when dealing with large systems and systems (such as hardware) which contain various levels with different levels of detail (for example, verifying an arithmetic unit made up of functional blocks which are made up of gates).

1.1.2 OBDD Based Model Checking

Model checking is a technique which has strengths and weaknesses that are almost exactly the reverse of those of theorem proving. Model checking proceeds by describing the system as a finite state machine. That is, for a system with \( d \) boolean valued signals it creates the state space \( 2^d \) and a state transition relation \( R \subseteq 2^d \times 2^d \).

Given an initial state predicate, \( Q_0 \), and a desired property, \( p \), verification proceeds by exhaustively searching the states reachable from \( Q_0 \) under \( R \) for states where \( p \) is false. Thus, it provides a verification of the behaviour of the system described by the model provided, as opposed to theorem proving which shows functional or structural equivalence between the specification and implementation, both of which are described in the same language. One advantage of the model checking approach is that a single specification may be used to verify different implementations with the same intended functionality simply by repeating the decision procedure. This can be contrasted with theorem proving where the proof will typically need to be reformulated to match the new implementation.

This process typically has a completely automatic decision procedure, allowing users with little understanding of the underlying theory to use the system to verify properties. It can also provide counterexamples to failed assertions, which is a significant aid to designers whose fundamental goal is to create a correct system.
There are significant drawbacks to model checking. The first is that the specifications are restricted to being simple enumerations of the desired properties, and it can be difficult to determine whether the specification completely captures the desired properties of the system. The second problem is the fact that the complete model must be described before the system may be verified. The state is "described" rather than constructed, because symbolic techniques using OBDDs [Bry86] have been developed which represent the model and its transition relation as functions from sets of states to sets of states which allow the verification of systems with large state spaces (for example [BC+94] with more than $10^{120}$ states for a synchronous pipelined ALU and $10^{60}$ states for an asynchronous stack). However, the state space must still be described in some manner, which limits the size of systems that can be verified. However, model checking is well-suited to verifying control circuitry, which will typically have a complex state relation but relatively few states.

1.1.3 Symbolic Trajectory Evaluation

Symbolic trajectory evaluation (STE) is a model checking based technique which traces its roots back to symbolic simulation. The inclusion of OBDDs significantly increased the power of symbolic simulation while retaining the ability to model low-level properties such as bounded delays and charge sharing. To reduce the impact of large internal state, a third logic value $X$ is added to indicate an unknown value for a node. This allows the system state to be placed in a lattice, and efficient algorithms have been developed for verifying large designs in this framework. In this method, the stimulus to the system and the required behaviour must be described in a restricted form of temporal logic. STE is a process for showing that the trace
of states simulated implies the trace of states described by the consequent.

This is a relatively efficient verification method, allowing the verification of large circuits, such as a 32-bit pipelined RISC processor [Dar94]. Like model checking, the specifications are typically behavioural, allowing reuse of specifications for different implementations, and the user does not need to have a complete understanding of the details of the internal model in order to use the tool effectively.

However, to realize efficient symbolic simulation STE depends on the existence of a deterministic next state function. It is also limited by the properties of the underlying OBDDs. For example, there is no variable ordering which allows multiplication of bit vectors to be represented by a diagram of a size which is less than exponential in the number of bits. This is a "hard" limit in the capabilities of STE, since (like model checking) it works directly on the circuits states and does not lend itself to hierarchical methods. Furthermore, the lattice structure which allows efficient verification also restricts the class of formulas which can be checked.

1.1.4 The Voss Prover

The Voss Prover combines the powerful low-level capabilities of STE with a simple theorem prover [HS95]. This allows STE results to be combined together to show the correctness of circuits too large for STE to verify on its own.

Theorem proving is limited by the extremely small steps which it must take in deriving proofs and STE is limited by the OBDDs which form its underlying representation. The Voss Prover solves these problems by using the verification results provided by trajectory evaluation as components which are manipulated to form proofs. A fully programmable script language is provided for writing proofs. This language includes abstract representations for data (such as integers) which
free the proof from the boolean representation provided by trajectory evaluation, and allows human intervention to take place at an extremely high level. As opposed to HOL, where each proof step must either be an axiom or the application of an inference rule, the Voss Prover has specialised rules for composing STE results built in, as well as the ability to include limited domain knowledge. This makes proofs shorter and easier for the operator to understand. An example is the verification of a pipelined IEEE 64-bit floating-point multiplier [AS95].

1.2 Automatic Verification of Asynchronous Data-Path Circuits

The goal of this research was to develop a method for automatically verifying asynchronous data-path circuits. With this in mind, let us review some key properties of the above verification techniques.

Theorem Proving

Despite its strengths, theorem proving is not suitable for our purposes because it is not automatic. There are partially-automated theorem provers, but they still require significant manual effort by users with considerable mathematical sophistication.

Model Checking

Model checking is automatic, but general model checking on its own it is not sufficient because its complexity can be exponential in the size of the circuit being verified. This causes a state-space explosion for data-path circuits which typically have a large number of state-holding nodes.
Symbolic Trajectory Evaluation

This specialised form of model checking takes advantage of a deterministic next-state function to allow efficient verification. It is still automatic, but cannot handle the non-determinism of an asynchronous circuit.

Voss Prover

While the Voss Prover is a powerful extension of STE in many respects, it still does not handle non-deterministic next state relations.

From the above summary we can see that theorem proving does not provide an automatic decision procedure, that we cannot use model checking on its own, because of the large state space required to describe data-path circuits, nor can we simply use trajectory evaluation or the Voss Prover, because the next state relation of an asynchronous circuit is non-deterministic.

For this work model checking was used to show speed-independence, which was used to justify transforming the asynchronous circuits into equivalent synchronous models. Then STE was used to verify the synchronous models. The theoretical justification for this transformation is presented in chapter 2. Although the implementation described here uses OBDD based model checking and STE, the theoretical framework presented could be applied to other methods as well.

1.3 Related Work

This work is motivated by the observation that the current state of a speed-independent circuit depends only on the initial state of the circuit, its inputs, and how many times each component has performed an action. Muller and Bartky [MB59] used similar ideas when they described circuits whose states could be
modelled on a semi-modular lattice. The handling of traces is influenced by Udding's thesis [Udd84], where he classifies systems based on the reordering of traces. While he uses this as a method to describe input and output choice, we use the equivalence of reorderings of traces to simplify verification. Using a synchronous model for an asynchronous circuit can be seen as an extreme example of the reduction of the next state relation described in [Pel93].

The circuit model presented excludes designs with arbiters, and we note that many published examples of asynchronous designs meet this requirement. The Caltech Asynchronous Processor [Mar89a] and the self-timed divider of Williams [Wil91] are other examples of self-timed designs that do not use arbiters. A notable exception to this is the Counter Flow Pipeline Processor [SSM94] in which arbiters are used to coordinate the interactions of adjacent pipeline stages. Another well-known published design which is not tractable by this technique is the micropipeline [Sut89], which combines delay-insensitive control circuitry with a bounded-delay data-path. A more general theoretical framework than the one presented in this paper would be required to verify either of these last two designs.

This work requires that designs be shown to be speed-independent before they can be verified. Here this step is performed using model checking and the composition of specifications [AL90], but there are several design methodologies which ensure that the circuit produced is speed-independent (or delay-insensitive, which is stronger).

The survey [Hau95] provides an overview of design methods for delay-insensitive circuits. Petri net-type methods [Mur89] are used as a basis for many of these methods. One is I-nets [MFR85], which can describe systems which are not delay-insensitive, but where delay-insensitivity can be shown by the Foam
Rubber Wrapper constraint [MFR85]. Another method is Signal Transition Graphs [CLW85], where various levels of restriction can be placed on the graphs in return for more efficient automatic synthesis of a circuit. Finally, Change Diagrams [KK+92] are similar to STGs, but avoid some of the restrictions while still maintaining efficient verification of design properties and circuit synthesis.

Martin has implemented a compilation technique [Mar89b] that uses a language similar to Communicating Sequential Processes and compiles it into speed-independent circuits. Van Berkel [vB92] uses the language Tangram in a similar manner. As opposed to Martin and Van Berkel who create their own speed-independent components as required, Brunvald and Sproull [BS89] use a subset of Occam (a language based on CSP) and provide a delay-insensitive module implementing each of the language constructs. Ebergen [Ebe91] uses traces to describe both the specification and the design, and can automatically reduce the specification to a design built up of small number of basic components (such as wire, fork, C-element, etc.). Lastly, Sparsø and Staunstrup [SS93] describe a method for designing multi-ring circuits from building blocks which are speed-independent by construction. A vector multiplier from their paper is verified as the major example in this thesis. A speed-independent circuit produced by any these techniques is amenable to verification using the method presented in this thesis.

1.4 Overview of Thesis

Chapter 2 describes the theoretical framework and results which justify the use of synchronous verification tools for asynchronous circuits. Chapter 3 provides a detailed discussion of the specification method used. Chapter 4 describes how the process was enacted for the examples, and Chapter 5 presents a large example circuit
which was verified to show the capabilities of this method. Chapter 6 provides conclusions and suggests possible areas for future work.
Chapter 2

Theory

This chapter introduces the theoretical underpinnings of this work. The key theorems are stated in section 2.4 along with sketches of their proofs. Detailed proofs are provided in appendix A. To motivate the theorems, a FIFO is used as a simple example. The circuit model is introduced in section 2.1, and the necessary definitions are laid out in section 2.2. These are used to give the specification for the FIFO in section 2.3. Finally, section 2.5 provides a brief discussion relating the hypotheses required in the theoretical results to circuit design.

2.1 The Circuit Model

We model a circuit as a set of nodes $V$, a state transition relation $R$, a set of circuit components $C$, a set of environment components $E$, and an initial state predicate $Q_0$. Each node corresponds to a boolean-valued signal. A state is an assignment of values to nodes; we write $S$ to denote the circuit’s state space, where $S = 2^V$. For $v$ in $V$ and $s$ in $S$, we write $v(s)$ to denote the value of node $v$ in state $s$. The components provide a decomposition of $V$ and $R$ described shortly.
Figure 2.1 depicts a simple four-stage FIFO that is used as an example throughout this chapter. The FIFO is one-bit wide and uses a dual-rail coding [Sei79], where the signal carries its own completion information. In this case, a single bit $x$ is encoded by a pair of wires $(x.t, x.f)$, with $(T,F)$ representing true, $(F,T)$ representing false, and $(F,F)$ as an extra value empty which is used to separate data values. Thus, the transition from $(F,F)$ to $(T,F)$ encodes the arrival of a true value. The acknowledgement part of the handshake cycle takes place on the request line $r$. This is raised when a stage has passed an empty value, and lowered after it has passed a data value.

This circuit has fifteen nodes, twelve circuit components, and two environment components. We assume that it is initialised to the all-empty state.

$$
V = \{x[i].t, x[i].f, x[i].r \mid i \in \{0...4\}\}
$$

$$
C = \{c1t, clf, n1, c2t, c2f, n2, c3t, c3f, n3, c4t, c4f, n4\}
$$

$$
E = \{src, snk\}
$$

$$
Q_0 = \bigwedge_{i=0...4} \neg x[i].t \land \neg x[i].f \land x[i].r
$$

Each component $c$ in $C \cup E$ has a set of inputs $I_c$ and a set of outputs $O_c$, where $I_c$ and $O_c$ are subsets of $V$. Furthermore, we assume that each node is connected to exactly one output. In other words, the outputs of the components in
The behaviour of component $c$ is given by the state transition relation $R_c$. To simplify the presentation, we exclude vacuous transitions: i.e. if $(s, s')$ is a state transition in $R_c$, then $s \neq s'$. Only outputs of $c$ change in a transition in $R_c$,

$$\forall(s, s') \in R_c. \forall v \in V. (v \in O_c) \lor (v(s) = v(s'))$$

Furthermore, the behaviour of a component depends only on the value of its inputs and outputs. If $s^1$ and $s^2$ are two states and $c$ is a component such that all inputs and outputs of $c$ have the same value in both $s^1$ and $s^2$, then if $c$ can perform an action in $s^1$ it can perform an equivalent action from $s^2$. More formally,

$$\forall s^1, s^2, s^3 \in S. \ ((\forall v \in I_c \cup O_c. v(s^1) = v(s^2)) \land (s^1, s^3) \in R_c)$$

$$\Rightarrow \exists s^4 \in S. ((s^2, s^4) \in R_c \land (\forall v \in O_c. v(s^3) = v(s^4)))$$

A component is enabled in state $s$ if there is another state $s'$ such that $(s, s')$ is in $R_c$.

$$enabled(c, s) \equiv \exists s'. (s, s') \in R_c$$

It follows from equation 2.3 that if there are two states, such that $c$ is enabled in one but not the other, then these states must differ in the value of some input or output of $c$.

$$\forall s, s' \in S. ((enabled(c, s) \land \neg enabled(c, s')) \Rightarrow (\exists v \in I_c \cup O_c. v(s) \neq v(s')))$$

Continuing the FIFO example, the input set, $I_{clt}$ of C-element c1t from figure 2.1 is $\{x[0].t, x[1].r\}$ and the output set, $O_{clt}$ is $\{x[1].t\}$. State transition
relations for components \( \text{clt}, \text{src}, \) and \( \text{snk}, \) are given below,

\[
R_{\text{clt}} = \{ (s^1, s^2) \in S \times S \mid \\
(x[1] \cdot r(s^1) = x[0] \cdot t(s^1)) \land (x[1] \cdot t(s^1) \neq x[0] \cdot t(s^1)) \\
\land (x[1] \cdot t(s^2) = x[0] \cdot t(s^1)) \\
\land \forall v \in (V - \{x[1] \cdot t\}). u(s^2) = v(s^1) \}
\]

\[
R_{\text{src}} = \{ (s^1, s^2) \in S \times S \mid \\
(x[0] \cdot t(s^1) \lor x[0] \cdot f(s^1)) \neq (x[0] \cdot r(s^1)) \\
\land (x[0] \cdot t(s^2) \lor x[0] \cdot f(s^2)) = (x[0] \cdot r(s^1)) \\
\land \neg(x[0] \cdot t(s^2) \land x[0] \cdot f(s^2)) \\
\land \forall v \in (V - \{x[0] \cdot t, x[0] \cdot f\}). u(s^2) = v(s^1) \}
\]

\[
R_{\text{snk}} = \{ (s^1, s^2) \in S \times S \mid \\
(x[4] \cdot r(s^1) = (x[4] \cdot t(s^1) \lor x[4] \cdot f(s^1))) \\
\land (x[4] \cdot r(s^2) \neq (x[4] \cdot t(s^1) \lor x[4] \cdot f(s^1))) \\
\land \forall v \in (V - \{x[4] \cdot r\}). u(s^2) = v(s^1) \}
\]

where \( S = 2^V \) with \( V \) from equation 2.1.

Given a set of nodes \( V \), a set of circuit components \( C \), and a set of environment components \( E \), we define the state transition relation for the circuit and its environment as the union of the state transition relations for each component. Let \( R \) denote this state transition relation.

\[
R = \bigcup_{c \in C \cup E} R_c \tag{2.6}
\]

If more than one component is enabled in state \( s \), then \( s \) can have several possible
successors. This non-determinism corresponds to unspecified delays for the components. Furthermore, each successor corresponds to the action of a single component. This gives an "interleaved semantics" for the circuit. Section 2.4 shows that allowing simultaneous actions of different components does not affect the properties which can be specified in local formulas.

To verify properties of a circuit, we consider assertions about reachable states. The set of reachable states is given by the circuit's initial state predicate \( Q_0 \) and its state transition relation \( R \). A trace is a sequence of states such that the first state in the sequence satisfies \( Q_0 \) and each pair of successive states in the sequence is in \( R \). We will write \( \text{traces}(R, Q_0) \) to denote the set of all traces which can be formed using the transition relation \( R \) and whose initial state satisfies the predicate \( Q_0 \). If \( t \) is a trace, we write \( t^0 \) to denote the initial (first) state of \( t \), and \( t^k \) to denote the \( k+1 \)st element of the sequence. A trace is either infinite or ends in a terminal state, where \( s \in S \) is terminal if there is no state \( s' \) such that \((s, s') \in R\). A state \( s \) is reachable if there is a trace \( t \) such that \( s \) appears in \( t \). The predicate \( \text{reachable}(R, Q_0, s) \) denotes that \( s \) is reachable by a circuit and environment with combined state transition relation \( R \), starting from a state satisfying \( Q_0 \).

### 2.2 Definitions

This thesis considers speed-independent circuits whose circuit components have no output choice and whose environment components have stable inputs. A circuit is speed-independent if once a component is enabled to change its output(s), it remains enabled to perform that change until it performs some action. A component has output choice if it can be enabled to perform one of two or more mutually exclusive actions. A component has stable inputs if once it is enabled, the values
of its input nodes remain unchanged until it performs an action. Discussion is restricted to circuits that are speed-independent, with circuit components that do not have output choice, and where the environment components have stable inputs. A circuit which has these properties will be called an eligible circuit.

**Definition 1 (Speed-Independence)** Given a circuit \((V, R, C, E, Q_0)\), the circuit and its environment are speed-independent iff

\[
\forall c \in C \cup E. \forall (s^1, s^2) \in R_c. \forall s^3 \in S.
((s^1, s^3) \in R - R_c) \land \text{reachable}(R, Q_0, s^1) \Rightarrow
(\exists s^4 \in S. ((s^3, s^4) \in R_c) \land (\forall v \in O_c. (v(s^2) = v(s^4))))
\]

**Definition 2 (Output Choice)** Given a circuit \((V, R, C, E, Q_0)\) and a component \(c \in C \cup E\), \(c\) has output choice iff

\[
\exists (s, s^1), (s, s^2) \in R_c. \text{reachable}(R, Q_0, s) \land (s^1 \neq s^2)
\]

**Definition 3 (Stable Inputs)** Given a circuit \((V, R, C, E, Q_0)\) and a component \(c \in C \cup E\), \(c\) has stable inputs iff

\[
\forall (s, s') \in R. (\text{reachable}(R, Q_0, s) \land \text{enabled}(c, s)) \Rightarrow
((s, s') \in R_c) \lor (\forall v \in I_c. v(s) = v(s'))
\]

Note that if \((s, s') \notin R_c\) then \(c\)'s outputs cannot change either, by the definition of outputs given in equation 2.2.

**Definition 4 (Eligible Circuits)** A circuit \((V, R, C, E, Q_0)\) is eligible iff it is speed-independent, no component in \(C\) has output choice, and every component in \(E\) has stable inputs.

Section 4.3 describes how the FIFO from figure 2.1 can be shown to be speed-independent and that all of its components have stable inputs. Only the src
component has output choice: from a state where \( x[0].t \) and \( x[0].f \) are false and \( x[0].r \) is true, the source may set either \( x[0].t \) or \( x[0].f \) to true, but not both (see equation 2.5). The FIFO is an example of an eligible circuit.

The next five definitions describe technical properties of traces that are needed for the proofs in section 2.4 and appendix A. A trace is weakly fair if every persistently enabled component eventually performs an action. The \( \text{occur} \) and \( \text{index} \) functions give the number and locations of occurrences of actions by a particular component in a trace. Two traces are choice consistent if each environment component outputs the same sequence of values given the same sequence of inputs. Splicing is an operation that combines two traces while preserving properties of local formulas and weak-fairness.

**Definition 5 (Weakly Fair Trace)** Given a circuit \((V, R, C, E, Q_0)\), a trace \( t \) is weakly fair iff

\[
\forall c \in C \cup E. \forall i \in N. \text{enabled}(c, t^i) \Rightarrow \exists j \in N. (j \geq i) \land ((t^i, t^{i+1}) \in R_c) \lor \neg \text{enabled}(c, t^j)
\]

Note that for a weakly fair trace of a speed-independent circuit, any enabled component eventually performs an action.

**Definition 6 (Occurrence of a Transition)** Given a component \( c \), a trace \( t \), and a natural number \( k \), \( \text{occur}(c, t, m) \) denotes the number of times that \( c \) occurs in \( t^0, \ldots, t^m \), formally defined as follows.

\[
\begin{align*}
\text{occur}(c, t, 0) & \equiv 0 \\
\text{occur}(c, t, m) & \equiv \text{occur}(c, t, m - 1) \quad \text{if } m > 0 \text{ and } \forall v \in O_c. v(t^{m-1}) = v(t^m) \\
\text{occur}(c, t, m) & \equiv \text{occur}(c, t, m - 1) + 1 \quad \text{if } m > 0 \text{ and } \exists v \in O_c. (v(t^{m-1}) \neq v(t^m))
\end{align*}
\]
Definition 7 (Index of a Transition) Given a component $c$, a trace $t$, and a natural number $k$, $\text{index}(c,t,k)$ denotes the location of the $k^{th}$ transition of component $c$ in trace $t$. That is,

$$\text{index}(c,t,k) = \min\{m \in \mathbb{N} \mid \text{occur}(c,t,m) = k\}$$

Let $t$ be a trace of a circuit that includes a component $c$. We write $t^1_c$ to denote the state immediately before the first time component $c$ effects a state transition in $t$ and $t^1_c.post$ to denote the state immediately after this transition. More generally, $t^k_c = t^{\text{index}(c,t,k)-1}_c$ and $t^k_c.post = t^{\text{index}(c,t,k)}$. Since the components' output sets partition $V$, there can be no change on any $v \in O_c$ from $\text{index}(c,t,k)$ to $\text{index}(c,t,k+1) - 1$. This means that assertions about $t^k_c.post$ can be expressed using $t^{k+1}_c$. As a result, assertions about a node's inputs at the time of its $k^{th}$ action are written about the state $t^k_c$ while assertions about the outputs of the $k^{th}$ action are commonly written about the state $t^{k+1}_c$. The notation $v^k_c(t)$ denotes $v(t^{k}_c)$, and $\text{valid}(v^k_c,t)$ indicates that component $c$ is enabled at least $k$ times in $t$.

Definition 8 (Choice Consistent Traces) Given an eligible circuit $(V,R,C,E,Q_0)$, let $c$ be a component in $E$ and let $t$ and $u$ be traces. The actions of $c$ are choice consistent in $t$ and $u$ (written $t \succeq_c u$) if, given the same
history of inputs, c produces the same outputs in both traces. More formally,

\[ t \equiv_c u \equiv (t^0 = u^0) \land \forall k \in N. \text{PreEquiv}(c, t, u, k) \Rightarrow \text{PostEquiv}(c, t, u, k) \]

where

\[
\text{PreEquiv}(c, t, u, k) \equiv \forall v \in I_c \cup O_c. \forall j \in \{1 \ldots k\}. \\
\quad \text{valid}(v^j_c, t) = \text{valid}(v^j_c, u) \\
\quad \land \quad \text{valid}(v^j_c, t) \Rightarrow (v(t^j_c) = v(u^j_c))
\]

\[
\text{PostEquiv}(c, t, u, k) \equiv \forall v \in O_c. \forall j \in \{1 \ldots k\}. \\
\quad (\text{valid}(v^j_c, t) \Rightarrow (v(t^j_c.post) = v(u^j_c.post)))
\]

If \( D \) is a set of components, \( t \equiv_D u \) denotes \( \forall d \in D. t \equiv_d u \). Consistency is an equivalence relation.

If \( t \) is a trace, each pair of successive states in \( t \) can be associated with the component that effected the transition. Given two choice consistent traces \( t \) and \( u \), \( \text{splice}(t, u, m) \) produces a sequence where the first \( m \) transitions are effected as in \( t \), the corresponding transitions are deleted from \( u \), and the remainder of \( u \) is appended to the prefix from \( t \). Theorem 1 will show that \( \text{splice}(t, u, m) \) is a trace. Figure 2.2 shows two traces and their first six splices (where \( c^i \) denotes the \( i^{th} \) action by component \( c \)).

**Definition 9 (Splicing Traces)** Given an eligible circuit \( (V, R, C, E, Q_0) \), let \( t \) and \( u \) be traces with \( t^0 = u^0 \) and \( t \equiv_E u \). Let \( m \) be a natural number such that \( t^m \) exists. The \( m^{th} \) splice of \( t \) and \( u \) is defined inductively. Let \( c_m \) be the component which makes its \( n^{th} \) action going from the state \( t^{m-1} \) to the state \( t^m \), and let \( i_m \) be the index of the point where \( c_m \) makes its \( n^{th} \) action in \( \text{splice}(t, u, m - 1) \), if \( c_m \) ever...
Figure 2.2: Splicing Traces

makes such an action. Formally,

\[(t^m - 1, t^m) \in R_{c_m}\]

\[n = \text{occur}(c_m, \text{splice}(t, u, m - 1), m - 1) + 1\]

\[i_m = \text{index}(c_m, \text{splice}(t, u, m - 1), n) \text{if it exists}\]

If \(i_m\) exists, then \(\text{splice}(t, u, m)\) is the sequence that satisfies the following equations

\[\text{splice}(t, u, 0) \equiv u\]

\[v(\text{splice}(t, u, m)^k) \equiv v(t^k) \quad \text{if } k < m \text{ and } v \in V\]

\[v(\text{splice}(t, u, m)^k) \equiv v(t^m) \quad \text{if } m \leq k < i_m, \text{ and } v \in O_{c_m}\]

\[v(\text{splice}(t, u, m)^k) \equiv v(\text{splice}(t, u, m - 1)^{k-1}) \quad \text{if } m \leq k < i_m, \text{ and } v \notin O_{c_m}\]

\[v(\text{splice}(t, u, m)^k) \equiv v(\text{splice}(t, u, m - 1)^k) \quad \text{if } i_m \leq k\]

if \(i_m\) does not exist then \(\text{splice}(t, u, m)\) is not defined.

The following property gives a sufficient condition for \(\text{splice}(t, u, m)\) to be defined, and it is shown in theorem 1 of section 2.4 that splices are traces.

**Property 1 (Splicing)** Given an eligible circuit \((V, R, C, E, Q_0)\), let \(t\) and \(u\) be traces with \(t^0 = u^0, t \equiv_E u\). If \(t\) is weakly fair, then \(\text{splice}(t, u, m)\) exists.
The combination of weak fairness and speed-independence ensures that \( i_m \) exists.

The next four definitions describe the properties of traces that are used for specification and verification. **Local formulas** refer to the values of the inputs and outputs of a component when the component performs its \( k^{th} \) action, specifying properties of bounded prefixes of traces. Two traces are **equivalent** if they have the same truth values for all local formulas. **Global formulas** are predicates over states. **Periodic global formulas** and **shifted local formulas** are used to specify safety properties of unbounded traces.

**Definition 10 (Local Formula)** A local formula is either
\[
\nu^k_c \quad \text{where } \nu \in I_c \cup O_c, \ k \in N, \ \text{and } c \text{ is a component with stable inputs}
\]
or
\[
\neg f \quad \text{where } f \text{ is a local formula}
\]
or
\[
f \wedge g \quad \text{where } f \text{ and } g \text{ are local formulas.}
\]

A local formula \( L \) is valid for trace \( t \) if \( \text{valid}(\nu^k_c, t) \) is true for each term of the form \( \nu^k_c \) in \( L \). If \( L \) is valid for trace \( t \), then it holds if the values of the nodes of the circuit in trace \( t \) satisfy the formula. The predicates \( \text{valid}(L, t) \) and \( L(t) \) denote that \( L \) is valid for trace \( t \) and that \( L \) holds for trace \( t \) respectively, and the notation \( \text{comp}(L) \) indicates the set of components which appear in the local formula. Finally, \( (R, Q_0) \models L \) denotes that \( L \) is valid and holds for every weakly fair trace of the circuit with state transition relation \( R \) and initial state predicate \( Q_0 \).

**Definition 11 (Equivalent Traces)** Two traces \( t \) and \( u \) with \( t^0 = u^0 \) are equivalent iff for all local formulas \( L \)
\[
(\text{valid}(L, t) = \text{valid}(L, u)) \land (\text{valid}(L, t) \Rightarrow (L(t) = L(u)))
\]
The notation $t \equiv u$ denotes that the traces $t$ and $u$ are equivalent. Because environment components of an eligible circuit have stable inputs, $t \equiv u \Rightarrow t \equiv_E u$.

**Definition 12 (Global Formula)** A global formula is a predicate over states.

**Definition 13 (Shifted Local Formula)** Let $L$ be a local formula and $\text{comp}(L)$ be the set of components that appear in terms of $L$. Let $i$ be a natural number and $p$ be a function from $\text{comp}(L)$ to the naturals. The shift of $L$ by $i$ times $p$ is obtained by replacing every term of the form $v_c^k$ in $L$ with $v_c^{i \cdot p(c) + k}$.

The notation $\text{shift}(L, p, i)$ denotes the shift of $L$ by $i$ times $p$.

**Definition 14 (Periodic Global Formula)** Let $(V, R, C, E, Q_0)$ be an eligible circuit. Let $G$ be a global formula, let $C' \subseteq C \cup E$ be a set of components, and let $p$ be a function from $C'$ to the natural numbers, with $p(c) > 0$ for all $c \in C'$. $G$ has period $p$ if only components in $C'$ are enabled in states that satisfy $G$, and for every trace whose initial state satisfies $G$ there is a choice consistent trace that returns to a state satisfying $G$ after each component $c$ in $C'$ has performed $p(c)$ actions. More formally, $G$ is periodic with period $p$ iff

$$\forall s \in S. \forall c \in (C \cup E) - C'. G(s) \Rightarrow \neg \text{enabled}(c, s)$$

$$\land \forall t \in \text{traces}(R, G). \exists u \in \text{traces}(R, G). \exists m \in N. \quad (t \equiv_E u) \land G(u^m) \land (\forall c \in C'. \text{occur}(c, u, m) = p(c))$$

The predicate $\text{periodic}(G, C', p)$ indicates that the global formula $G$ is periodic with period $p$ on the components in the set $C'$.

### 2.3 A Specification for the FIFO

Local formulas provide a natural way to specify the input/output behaviour of speed-independent circuits. For example, a natural specification for the FIFO shown in
figure 2.1 is that the infinite sequence of values output by the source must be the
same as the sequence received by the sink. This is expressed by the formula

\[ \forall i \in \mathbb{N}. ((x[0] . t)^i_{src}.post = x[4] . t^i_{snk}) \land ((x[0] . f)^i_{src}.post = x[4] . f^i_{snk}) \]

which is equivalent to (dropping the post notation)

\[ \forall i \in \mathbb{N}. (x[0] . t^{i+1}_{src} = x[4] . t^i_{snk}) \land (x[0] . f^{i+1}_{src} = x[4] . f^i_{snk}) \] (2.7)

Recall from the discussion after definition 7 that \( x[0] . t^i_{src} \) refers the value of \( x[0] . t \) immediately before the src component performs its \( i+1 \)st operation. This is the value that was output by the source at its \( i \)th operation. On the other hand, \( x[4] . t^i_{snk} \) is the value that was input to the snk component at its \( i \)th operation.

Local formulas can only specify properties with a finite number of terms. Periodic global formulas extend the result to infinite sequences. For the FIFO example, it is convenient to assume that the FIFO is initially reset to a state where all stages hold empty values (as in equation 2.1), and consider a period function that returns the FIFO to this state. Because the source alternates between sending empty and full values, the source must perform two operations per period, as must the sink. This motivates the following choices of \( G \) and \( p \).

\[
p(src) \equiv 2 \\
p(snk) \equiv 2 \\
G \equiv Q_0
\] (2.8)

where \( Q_0 \) is given in equation 2.1. Note that this specification does not say that the FIFO periodically returns to an empty state for all traces. It merely requires that there exists at least one trace for which the FIFO has this behaviour. Let \( L \) be the

24
local formula defined below.

$$L \equiv (x[0] . t^2_{src} = x[4] . t^1_{sink}) \land (x[0] . f^2_{src} = x[4] . f^1_{sink})$$
$$\land (x[0] . t^3_{src} = x[4] . t^2_{sink}) \land (x[0] . f^3_{src} = x[4] . f^2_{sink})$$

(2.9)

Intuitively, the first line states that the data is propagated correctly, and the second states that the empty is propagated correctly. Theorem 3 in the next section shows that establishing that \((R, G) \models L\) and that \(G\) has period \(P\) is sufficient to establish \(\forall i \in N. (R, G) \models shift(L, p, i)\), which for \(G, L,\) and \(p\) as defined above is equivalent to the specification given in formula 2.7. Since the formula \(L\) asserts that the first value output by the circuit is equal to the first value supplied, showing \(shift(L, p, i)\) shows that for any \(i\), the \(i^{th}\) value output by the circuit is equal to the \(i^{th}\) value supplied, which verifies correct FIFO behaviour.

### 2.4 Properties of Speed-Independent Circuits

The splice operation has two properties that provide the foundation for this work. First, the splice of two weakly fair traces is a weakly fair trace. Second, splicing preserves the values of local formulas. These properties motivate the following theorem.

**Theorem 1 (Splicing)** Given an eligible circuit \((V, R, C, E, Q_0)\), let \(t\) and \(u\) be weakly fair traces with \(m\) a natural number such that \(t^m\) exists. If \(t^0 = u^0\) and \(t \sqsupseteq E u\), then splice\((t, u, m)\) is a weakly fair trace and splice\((t, u, m) \equiv u\).

**Proof** (sketch, see appendix A for details): The key observation is that the actions of concurrently enabled components commute without changing the values of local formulas or affecting the property of weak fairness. Let \(\alpha\) be a trace, \(i\) be a natural
number, and $c$ and $d$ be components such that $(\alpha^i, \alpha^{i+1}) \in R_c$, $(\alpha^{i+1}, \alpha^{i+2}) \in R_d$, and $d$ is enabled in state $\alpha^i$. Let $\beta$ be the sequence such that

\[
\beta^k = \alpha^k \quad \text{if } k \neq i + 1
\]
\[
v(\beta^{i+1}) = v(\alpha^{i+2}) \quad \text{if } v \in O_d
\]
\[
v(\beta^{i+1}) = v(\alpha^i) \quad \text{if } v \notin O_d
\]

Note that any component that has output choice must be an environment component and therefore has stable inputs. This and speed-independence ensure that $(\beta^i, \beta^{i+1}) \in R_d$ and $(\beta^{i+1}, \beta^{i+2}) \in R_c$. Thus, $\beta$ is a trace. Given an arbitrary component $e$ with stable inputs, it is shown below that $v^k_e$ has the same validity and value in traces $\alpha$ and $\beta$ by case analysis on $\text{index}(e, \alpha, k)$, and that if $\alpha$ is weakly fair then $\beta$ is weakly fair as well.

The theorem is proven by induction on $m$. The base case is $m = 0$, which is discharged by noting that $\text{splice}(t, u, 0) = u$. For $m > 0$, note that the component that effects the transition from $t^{m-1}$ to $t^m$ must be enabled in state $\text{splice}(t, u, m - 1)^{m-1}$. By the induction hypothesis, $\text{splice}(t, u, m - 1)$ is weakly fair and $t \succeq_E \text{splice}(t, u, m - 1)$; thus, this component must eventually perform an action, and any environment choices must be identical in both traces. Therefore, the sequence $\text{splice}(t, u, m)$ can be obtained from trace $\text{splice}(t, u, m - 1)$ by a finite number of commuting operations, and the claims of the theorem follow from the properties of commuting.

\[\square\]

The key property of speed-independent circuits that we exploit is that if two traces for a circuit start in the same state and the environment provides the same inputs, then all components with stable inputs have identical input and output sequences in both traces. This equivalence is formalised by observing that both
traces satisfy the same local formulas.

**Theorem 2 (Equivalence of Traces)** Given an eligible circuit \((V, R, C, E, Q_0)\), if \(t\) and \(u\) are weakly fair traces with \(t^0 = u^0\) and \(t \equiv_E u\), then \(t \equiv u\).

**Proof:** Because every local formula \(L\) can be represented by terms of the form \(v_c^k\) and the operators \(-\) and \(\wedge\), if every term of the form \(v_c^k\) has the same validity and value in \(t\) and \(u\) then, by induction on the structure of \(L\), the theorem holds.

Let \(v_c^k\) be a term of \(L\), where \(c\) is a component with stable inputs. If \(v_c^k\) is valid in trace \(t\), let \(\alpha = \text{splice}(t, u, \text{index}(c, t, k))\). By the definition of splicing, \(v_c^k\) is valid in both \(t\) and \(\alpha\), and \(v_c^k(t) = v_c^k(\alpha)\). By theorem 1, \(v_c^k(\alpha) = v_c^k(u)\). This establishes \(v_c^k(t) = v_c^k(u)\) as required. A symmetric argument applies if \(v_c^k\) is valid in \(u\). Finally, the case where \(v_c^k\) is not valid in either \(t\) or \(u\) also satisfies the claim of the theorem.

\(\Box\)

Theorem 2 provides the basis for verifying specifications of the form \((R, Q_0) \models L\). If the circuit satisfies the hypotheses of theorem 2, it is only necessary to verify a single trace for each initial state and sequence of inputs from the environment to show that all possible traces starting from an allowable initial state and with suitable environment behaviour will satisfy \(L\). Chapter 4 describes how this can be shown using STE. However, a local formula only specifies properties of a bounded number of actions of each component. Infinite traces can be verified by using periodic global formulas and shifted local formulas (as defined in section 2.2).

**Theorem 3 (Recurring Local Formulas)** Given an eligible circuit \((V, R, C, E, Q_0)\), let \(G\) be a global formula and \(L\) be a local formula such that \(Q_0 \Rightarrow G\) and \((R, G) \models L\). Let \(p\) be a mapping from \(\text{comp}(L)\) to the natural numbers.
Then,

\[ \text{periodic}(G, \text{comp}(L), p) \Rightarrow \forall i \geq 0. (R, G) \models \text{shift}(L, p, i) \]

**Proof:** Let \( t \) be a trace of \( (R, Q_0) \). The proof proceeds by induction on \( i \), with induction hypothesis

\[(R, G) \models \text{shift}(L, p, i)\]

\[\land \forall t \in \text{traces}(R, G). \exists u \in \text{traces}(R, G). \exists m \in \mathbb{N}.
\]

\[(u^t_{\text{traces}(R, G)}) \land \forall c \in \text{comp}(L). \text{occur}(c, u, m) = (i + 1) \ast p(c)\]

\[\land \forall c \in (C \cup E) - \text{comp}(L). \neg \text{enabled}(c, u^m)\]

\[\land G(u^m))\]

The base case is \( i = 0 \). From \( \text{shift}(L, p, 0) = L \) and \( (R, G) \models L \) we see that

\[(R, G) \models \text{shift}(L, p, 0)\].

The second half of the conjunction follows from the definition of periodic global formulas. The induction step relies on properties of the subtrace \( u' = u^m, u^{m+1}, \ldots \) where \( m \) is provided by the induction hypothesis for \( i - 1 \). Since \( u' \) is in \( \text{traces}(R, G) \) we can see that \( L \) holds for \( u' \), and since \( \text{occur}(c, u, m) = i \ast p(c) \), this is equivalent to \( (R, G) \models \text{shift}(L, p, i) \). The second half of the conjunction follows from the definition of periodic global formulas as applied to \( u' \). This completes the proof.

\( \square \)

When using symbolic trajectory evaluation for verification, it is convenient to perform the actions of all enabled components as a single state transition. The next theorem shows that the validity and value of local formulas is preserved under
such an execution model. Let $R$ be a state transition relation, and let

$$R^+ = \{ (s^1, s^2) \in S \times S \mid$$

$$\forall c \in C \cup E.$$ 

$$\exists s' \in S. ((s^1, s') \in R_c) \land (\forall v \in O_c. v(s^2) = v(s'))$$

$$\lor \neg enabled(c, s^1) \land (\forall v \in O_c. v(s^2) = v(s^1))$$

$$\}$$

Transition $(s^1, s^2)$ is in $R^+$ iff $s^2$ is the state reached by performing an action of each component that is enabled in state $s^1$ in a single state transition. Note that the definition of occur does not require that components act one at a time.

**Theorem 4 (Combined Actions)** Given an eligible circuit $(V, R, C, E, Q_0)$, for all local formulas $L$,

$$(R, Q_0) \models L \iff (R^+, Q_0) \models L$$

**Proof** (sketch, see appendix A for details): Let $u$ be a trace of $(V, R^+, C, E, Q_0)$. We construct a trace $t$ of $(V, R, C, E, Q_0)$ such that $t \equiv u$. The key idea is that if $u$ includes a transition $(t^i, t^{i+1})$ where $n$ components perform actions, then we can effect the same change in $t$ with $n$ separate transitions. Because the original circuit is speed-independent, each component that was enabled in state $u^i$ is enabled when its turn arrives. Furthermore, because any component that occurs in a term of $L$ must have stable input, all input nodes of the component have the same value when the component performs its action in $t$ as in $u$. This guarantees that $L$ has the same value and validity in $t$ as in $u$.

To see the converse, consider a weakly-fair trace $t$ in $(V, R, C, E, Q_0)$. The proof proceeds by creating two traces, $u$ in $traces(R^+, Q_0)$ and $\alpha$ in $traces(R, Q_0)$, such that $\alpha$ and $u$ have the same relationship as the one described in the preceding.
paragraph for \( t \) and \( u \), and \( t \cong \alpha \). The traces are formed by considering the set \( D^1 \) of components which are enabled in \( t^0 \). The trace \( u \) is formed by the simultaneous action of all components in \( D^1 \), while \( \alpha \) is the trace formed by having them all as the first \( |D^1| \) actions in \( \alpha \). The process continues with the set \( D^2 \) of components enabled in \( u^1 \).

\( \square \)

2.5 Discussion of Hypotheses

While the requirements of the theorems in section 2.4 may seem technical and restrictive, many of them have natural intuitions and are satisfied by common design methods. One of the basic requirements is that the circuit to be verified be eligible. This means that the circuit must be speed-independent, that none of its components have output choice, and that all of its environment components have stable inputs. Speed-independence has been chosen as a requirement because it is a readily testable guarantee of freedom from timing hazards. One key point for the next two properties is that for the proof of theorem 1 to work every component must either have stable inputs or no output choice. The formalism also only permits the verification of local formulas which describe the inputs and outputs of components with stable inputs. This makes the choice to restrict components with output choice to the environment reasonable, as it allows us to represent arbitrary data values as inputs to the circuit and guarantees that we will be able to make assertions about any node which connects the circuit to its environment. On the other hand, it does prevent us from handling circuits which contain arbiters.

Now consider theorem 2. In many ways it provides the foundation of this work. Its main requirement is that the two traces \( t \) and \( u \) be choice equivalent.
Although the definition of of choice equivalent traces is somewhat complicated, the meaning is simply that the environment provides the same stimulus to the circuit in both traces. Theorem 3 requires that the global formula $G$ be **periodic**, which means that we could choose an order for components to act such that we arrive back at $G$. Thus, the theorem's conclusion is that we can build arbitrarily long representative traces by returning repeatedly to states which satisfy $G$. Finally, theorem 4 requires that we have a transition relation $R^+$ which contains transitions formed by simultaneously performing all enabled transitions in $R$. 
Chapter 3

Local and Global Formulas

The combined use of global and local formulas is essential to the power of this verification method. They have different expressive abilities, and understanding their function and use is required before applying this theory to practical problems. In this chapter, the two different asynchronous implementations of a 4-bit adder as shown in figure 3.1 are used to illustrate a discussion of the two types of formulas. The local formulas describing correct function are discussed in section 3.1 and the global formulas and period functions describing the adders' periodic behaviour are discussed in section 3.2. Finally, section 3.3 describes a form of behaviour which cannot be verified using this technique.

Both designs use the same dual-rail codes as the FIFO described in section 2.1 and are speed-independent. The design on the left calculates the carry for each input pair and carry-in only after all three inputs are valid. This satisfies the stable inputs condition. In order to create a pipeline and allow a possible throughput of one operand per component delay, the adding components are embedded along the diagonal in a block of buffers (FIFO cells). In the second (flat) adder, the carry unit
can produce a result as it receives two inputs of the same value, creating a data-dependent throughput. This is still speed-independent, even though the carry unit does not have stable inputs. This realizes a logarithmic average-case latency [Sei79]. Both adders are verified using the same environment, shown in figure 3.2.

### 3.1 Local Formulas

Local formulas are a form of modal logic specifically designed to describe invariant properties of speed-independent asynchronous circuits. Given the same initial state and inputs from the environment, a speed-independent circuit will produce the same sequence of values on the nodes attached to components with stable inputs. However, because the circuit is asynchronous, it is not possible to know the timing of these values. Local formulas are assertions about properties which are independent of the relative timing of components. If a local formula only refers to the inputs and outputs of a circuit, then it is also independent of the internal circuit structure. This allows the same specification to be used with two different implementations, such as the two adders described above.
Recall from definition 10 that local formulas are boolean expressions formed from terms of the form $v^k$, where $v$ is an input or output node of a component $c$ with stable inputs. The index $k$ denotes that we are describing the state of the node $v$ at the $k^{th}$ action of $c$. Since environment nodes are required to have the property of stable inputs, and they are the place where the actions of the system are apparent to the outside world, they are natural candidates for acting as the components $c$ in local formulas. Also recall that $v^k$, refers to the value of node $v$ at the instant before the $k^{th}$ action by $c$. Accordingly, the output on node $v$ for $c$'s first action is expressed as $v^2$. Finally, local formulas are typically used to describe a single step of a system's function, which in this case is a single set of operands.

With this in hand, we can describe the desired functionality of the 4-bit adder. For the sake of conciseness, $[A^2_{srcA3}, A^2_{srcA2}, A^2_{srcA1}, AO^2_{srcA0}]$ is used to repre-
sent the bit vector supplied by the source components for the \( a \) operand with their first action, and binary addition is used as shorthand for the logical notation using \( \wedge \) and \( \neg \).

\[
[S_{snks4}, S_{snks9}, S_{snks8}, S_{snks1}, S_{snks0}] \\
= [0, A_{srca3}^2, A_{srca2}^2, A_{srca1}^2, A_{srca0}^2] + [0, B_{srcb3}^2, B_{srcb2}^2, B_{srcb1}^2, B_{srcb0}^2] \\
+ [0, 0, 0, 0, \text{Cin}_{srccin}^2]
\]  

(3.1)

Since it refers only to nodes through their association with environment components, this specification does not reflect the internal structure of the adder. This allows the same specification to be used for any implementation of the adder.

### 3.2 Global Formulas and Period Functions

Global formulas serve as an "invariant", to which the system will return after one of the cycles described by the period function \( p \). These formulas are not genuine invariants, however, for a number of reasons. First, they are not expected to hold throughout the device's computation. Second, the asynchronous implementation is not required to return to a state where this formula holds. However we can think of the system as returning to the state \( G \) because if there is one trace for which the global formula recurs with period \( p \), then by the equivalence of traces, conclusions drawn for this trace can be applied to all traces with the same inputs from the environment.

Before further discussion of the conceptual basis and use of global formulas and periodic functions, let us recall their definitions. A global formula \( G \) is a predicate over the instantaneous state of the system, and the period function \( p \) is a function from a set \( C' \subset C \) to the counting numbers such that only components in
$C'$ are enabled in a state satisfying $G$ and for every trace whose initial state satisfies $G$ there is a choice consistent trace that returns to a state satisfying $G$ after each component $c$ in $C'$ has performed $p(c)$ actions. Also recall that the set $C'$ of components whose behaviour is described by $p$ must contain all of (and typically only) the components which appear in the local formulas. Thus, the period function $p$, like the local formula, should not change with different implementations of the device.

Most devices can be thought of as having clear cycles of operation. For the FIFO this is the insertion and removal of a single piece of data; for the adder it is the production of one sum from two operands. If we consider an environment which supplies one set of data and then waits until the outputs have been produced, we can see that the system returns to a particular state. This is the state which the global formulas describe. Note that this state may never reappear after the initial situation because the real environment may supply data before the outputs have been extracted. However, we know that we could have waited for the outputs to appear, which means that each set of data can be considered as if it were the first. As established by theorem 2, having a representative trace which exhibits correct behaviour ensures that all traces behave correctly. We consider the trace which returns to the state described by the global formula after each computation, and it serves as our representative trace.

There is a high level of interdependence between different nodes in asynchronous circuits, due to the handshaking protocols. As a result, any unspecified values in the initial state will quickly propagate through the entire system. Thus, the global formulas typically describe the state of every node, and are dependent on the implementation details. If we consider the two adders, both of their global formulas describe them in an empty state. However, since the pipelined adder sys-
Figure 3.3: Pipelined Asynchronous Adder
tem contains more components, its global formula will specify that they are empty, in addition to the components which it shares with the flat adder. In this case the global formulas are similar, but in general there is no requirement that they match on anything other than specifying the initial state of the nodes connected to the environment.

Figure 3.3 shows the internal structure of the pipelined adder. The flat adder has the same structure with all of the FIFO elements removed. For simplicity’s sake, consider a generic predicate empty which takes lists of components as its argument and is true if any components with a data output have their output set to the dual rail code for empty, any components with acknowledge lines have this signal set to show an empty state, and any internal state is set in a manner which does not enable any transitions. The global formulas for the two adders can then be expressed by figures 3.4 and 3.5 below.

\[
\text{empty \{srcci, srca0, srcb0, srca1, srcb1, srca2, srcb2, srca3, srcb3, fifo01a, fifo01b, fifo02a, fifo02b, fifo03a, fifo03b, fifo10a, fifo12a, fifo12b, fifo13a, fifo13b, fifo20a, fifo23a, fifo03s, fifo30s, fifo31s, fifo32s, maj0, maj1, maj2, maj3, vjoinv0, vjoinv1, vjoinv2, vjoinv3, xor0, xor1, xor2, xor13, snk0, snk1, snk2, snk3, snk4\}}
\]

Figure 3.4: Global Formula for Pipelined Adder

\[
\text{empty \{srcci, srca0, srcb0, srca1, srcb1, srca2, srcb2, srca3, srcb3, maj0, maj1, maj2, maj3, vjoinv0, vjoinv1, vjoinv2, vjoinv3, xor0, xor1, xor2, xor13, snk0, snk1, snk2, snk3, snk4\}}
\]

Figure 3.5: Global Formula for Flat Adder

The period function provides a connection between the global formula and
the local formula. Although it refers to the actions required to return to the global formula, which is implementation dependent, like the local formula it is implementation independent. Since it only needs to describe the periods of components described in the local formula, typically it will describe the number of actions of the environment components at the inputs and outputs of the circuit. This can be seen from the adder examples. The cycle of operation for the adders is receiving two bit arrays of values followed by empty values to separate them from any subsequent data, and producing another array representing the sum (also followed by empty values. Thus, for both adders the period function is 2 for all of the input and output components.

3.3 Unspecifiable Behaviour

This method allows us to specify and verify behaviours which can be described by formulas whose terms are the values of nodes immediately before the $k^{th}$ action of components. We can neither describe nor verify properties which correspond to the relative timing of the actions of distinct components. For example, structural properties of a circuit may maintain invariants (such as the fact that the $n+1^{st}$ input to a $n$ stage FIFO may not be accepted until there has been at least one output action) which can neither be described nor verified by this method. However, the goal was to verify data-path circuits. For such circuits, assertions about the sequences of values are a natural method for specifying behaviour.
Chapter 4

Implementation

This section describes how the results from chapter 2 can be applied using existing verification tools to verify self-timed designs. The overall scheme for the verification is:

- The structure and behaviour of the circuit and its environment are described using VHDL.

- The specification is written as an initial state predicate, a local formula, a global formula, and a period function. These formulas are verified for the circuit using symbolic trajectory evaluation.

- Speed-independence and stable-input assumptions are verified by model checking small clusters of components separately.

4.1 VHDL Circuit Descriptions

"Behavioural" VHDL is used to describe the functionality of the components of the circuit and the environment, and "structural" VHDL is used to describe the
interconnection of these components. The VHDL model provides $R^+$ as described in section 2.4.

Augmenting the models for environment components facilitates verification. In particular, added registers provide data corresponding to the environment’s output choice and record the outputs of the circuit. This matches the specifications which describe the input/output behaviour of the circuit. Since trajectory evaluation only allows assertions about data values at particular times, we add a counter which increments each time a data value is received and use it to direct data values into an array. This bridges the gap between asynchronous behaviour and synchronous specifications. Thus the occurrence number of an action will correspond to its position in the array of recorded values. Counters can also be added to record the number of actions taken by the components, for the purposes of establishing that the period function $p$ is correct and to disable the environment components after the period has completed. Since only components which are in the domain $C$ of $p$ may be enabled in states satisfying the global formula $G$, a correctly functioning circuit will eventually halt in a state satisfying $G$ if all of the components in $C$ are disabled after the number of actions permitted by $p$.

These registers and counters are auxiliary circuitry in the environment components. In order to fully verify the circuit, this auxiliary circuitry must not inappropriately restrain the environment components. It is important that the choice variables stored in the source components’ arrays do not reduce the output choice of these components. However, these variables are symbolic and simply serve as reference names for the values for the purposes of verifying a general formula with only one simulation run. It is also important that the restriction of the components to a fixed number of actions by the counters does not prevent them from correctly
representing general source components. If we were not taking advantage of the periodic behaviour of the circuit, we would require an infinite number of registers initialised to arbitrary boolean variables in order to emulate the output choice of the environment. However, because we know that verifying a finite number of steps is sufficient to show general correctness, only a small number of registers and boolean variables are required.

Three examples of VHDL code excerpts are provided: figure 4.1 shows code for a source which supplies a data value and then stops; figure 4.2 is a sink unit which records the last data value to arrive; and figure 4.3 shows the code for a single FIFO cell. This code is for an FIFO which passes data values in inverted form. Note that some of the internal variables (such as \texttt{index}, and \texttt{done}) must be set by the STE antecedent.

4.2 Symbolic Trajectory Evaluation

Symbolic trajectory evaluation (STE) is a verification technique for circuits with deterministic next state functions which verifies specifications written in a restricted logic called “trajectory formulas”. Although restrictive, trajectory formulas readily express requirements of the form “the value of a node \( n \) at time \( t \) is the function \( f \) of the values of the nodes \( [n_1, \ldots, n_k] \) at times \( [t_1, \ldots, t_k] \)”. Using choice variables to control the output choice of the source components and counter indexed arrays to store inputs to the sink components allows us to represent local formulas using this structure.

The specification that a global formula \( G \) has period \( p \) can also be expressed in this form. After augmenting the models for the environment components with “period counters” as described in section 4.1, we verify that all trajectories that
ENTITY source IS
PORT (sr : IN BIT; -- request signal from first FIFO cell
outT : OUT BIT; -- dual rail TRUE
outF : OUT BIT; -- dual rail FALSE
data : IN BIT -- auxiliary value to fix output choice
);
END source;

ARCHITECTURE data_flow OF source IS
SIGNAL cTout : REG_BIT REGISTER;
SIGNAL cFout : REG_BIT REGISTER;
SIGNAL dindex : REG_BIT REGISTER; -- One data passing action
SIGNAL eindex : REG_BIT REGISTER; -- One empty passing action
BEGIN

-- Transition to supply empties when next cell has data
clr : BLOCK ( NOT eindex AND NOT sr AND (cTout OR cFout) )
BEGIN
cTout <= GUARDED "0";
cFout <= GUARDED "0";
eindex <= GUARDED "1";
END BLOCK clr;

-- Stuff for dealing with supplying data and toggling index
dref0 : BLOCK ( (NOT (cTout OR cFout )) AND sr AND NOT dindex )
BEGIN
cTout <= GUARDED data;
cFout <= GUARDED NOT data;
dindex <= GUARDED "1";
END BLOCK;

outT <= cTout;
outF <= cFout;
END data_flow;

Figure 4.1: VHDL Code for Source Component
ENTITY sink IS
PORT (
  r    : OUT BIT;  -- request signal to last FIFO cell
  inT  : IN BIT;  -- dual rail input for inverted TRUE
  inF  : IN BIT   -- dual rail input for inverted FALSE
);
END sink;

ARCHITECTURE data_flow OF sink IS
  SIGNAL regT    : REG_BIT REGISTER;
  SIGNAL regF    : REG_BIT REGISTER;
  SIGNAL gr      : REG_BIT REGISTER;
  SIGNAL done    : REG_BIT REGISTER;
BEGIN
  -- Transition to acknowledge an empty
  clr : BLOCK ( NOT ( inT OR inF ) )
  BEGIN
    gr   <= GUARDED "0";
  END BLOCK clr;

  -- Stuff for dealing with valid data
  val : BLOCK (( inT OR inF ) AND NOT done)
  BEGIN
    regT   <= GUARDED inT;
    regF   <= GUARDED inF;
    gr     <= GUARDED "1";
    done   <= GUARDED "1";
  END BLOCK;

  r   <= gr;
END data_flow;

Figure 4.2: VHDL Code for Sink Component
ENTITY fifocell IS
PORT (
inT : IN BIT; -- dual rail TRUE input
inF : IN BIT; -- dual rail FALSE input
outT : INOUT BIT; -- dual rail TRUE output
outF : INOUT BIT; -- dual rail FALSE output
r  : OUT BIT; -- request signal from previous FIFO cell
sr : IN BIT -- request signal from next FIFO cell);
END fifocell;

ARCHITECTURE data_flow OF fifocell IS
  SIGNAL gTout : REG_BIT REGISTER;
  SIGNAL gFout : REG_BIT REGISTER;
BEGIN
  ct : BLOCK (( inT = sr ) AND ( gTout XOR inT ))
  BEGIN
    gTout <= GUARDED inT;
  END BLOCK;

  cf : BLOCK (( inF = sr ) AND ( gFout XOR inF ))
  BEGIN
    gFout <= GUARDED inF;
  END BLOCK;

  -- NOR gate
  r <= NOT ( gTout OR gFout );

  -- required because guarded variables can't be outputs
  outT <= gTout;
  outF <= gFout;
END data_flow;

Figure 4.3: VHDL Code for FIFO Cell
start in a state that satisfies $G$ and in a state which also satisfies $G$ and that for each environment variable $e$, the corresponding counter ends with value $p(e)$. The form of trajectory formulas makes this very easy when most components return to an empty state at the end of the period. More care (and therefore more time) is required if data values can remain in the circuit at the end of the period.

4.3 Model Checking for Speed-Independence

Verification using local formulas is based upon the assumption that the circuit is eligible. This means that speed-independence must be verified for the circuit as well as the stable-inputs property required for components appearing in local formulas. In this approach, the designer identifies small sub-circuits (typically single components) whose interfaces observe common self-timed handshaking protocols, and where the speed-independence of the sub-circuit depends only on its environment observing the handshaking protocol, not upon the particular values applied. Therefore, each sub-circuit is model checked with generic sources and sinks attached to the interfaces. These generic environment components have non-deterministic next-state relations that ensure that they observe the handshaking protocol, but allow them to apply arbitrary data values. In this context two properties are verified using a model checker [LGS94a]. First, the sub-circuit is shown to be speed-independent (and to have stable inputs, if required) in this generic environment. Second, the sub-circuit is shown to observe the handshaking protocol as well, by showing that its behaviour on the lines to each neighbour is a refinement of the behaviour between a sink and a source (the simplest circuit which implements the correct handshaking protocol). That this demonstrates the speed-independence of the complete circuit can be seen either from the discussion in [AL90] on composing specifications, or
from the protocol verification described in [SM95]. Because the sub-circuits are small, model checking is fast and completely automatic. The property of stable inputs can also be expressed as a predicate and verified using model checking.

As an example, consider the FIFO shown in figure 2.1. The design is partitioned into six sub-circuits: the source, the sink, and the four FIFO stages where stage 1 consists of components $c1t$, $c1f$, and $nl$, and likewise for stages 2, 3, and 4. The descriptions of the $src$ and $snk$ components in equation 2.5 provide generic sources and sinks for the four-phase handshake protocol of the FIFO. The FIFO stage to be verified is composed with a generic source connected to its data input and a generic sink connected to its output. Model checking shows that this composition is a speed-independent circuit and that at its input interface the FIFO stage only performs actions that could be made by a generic sink and that at its output interface the stage only performs actions that could be made by a generic source.
Chapter 5

Vector Multiplier

For an example with a significant amount of data path complexity, we chose the vector multiplier described in [SS93]. The multiplier uses an iterative serial-parallel algorithm implementing the “paper and pencil” method for multiplication, with the accumulating sum stored in carry-save form until it is to be output. The key functional unit is the MAS (Multiply-Accumulate-Shift) block. With each bit of the serial operand, the parallel operand and the carry word are added to the accumulating sum (the parallel operand is first ANDed with the serial bit), and then the parallel operand and the carry word are shifted to the left, while the sum bits circulate in the MAS bit-slices. When the result is to be produced, the sum is converted from carry-save form by extending the serial operand with enough zeroes to ensure that the carry word is completely cleared. A detailed diagram of the MAS bit-slice is shown in figure 5.1. Unlabelled devices in the diagram represent asynchronous latches (fifo cells) or sinks. There is a source (labelled with a 0) which produces a stream of dual-rail false values. The switches are asymmetric: when the control line is a dual-rail false, data is passed directly through; when the control line is true, the...
right input is passed to the left output and no operation is performed on the other input and output. This circuit was modelled at the level of functional blocks. That is, behavioural VHDL was used to describe the net behaviour of each component, rather than building them up from gates or transistors.

The two control signals organise the different time scales in the multiplier, which functions in nested cycles at three levels. The lowest level is the cycle described above which occurs once per serial bit. At each iteration of this cycle, both C1 and C2 receive a single T value, followed by an empty. The middle level consists of actions which take place once per complete serial operand, such as loading a new parallel operand into the MAS block, and here C1 receives an F, while C2 still receives a T. The highest cycle hinges on an operand Last, which is supplied with each operand pair and tells the multiplier whether or not to convert the accumulated sum from carry save into binary representation and produce it as a result. This results in extending the serial operand with a number of zeroes and then producing F's on both C1 and C2.

These multiple cycles complicate the formalism developed in the preceding chapters. Theorem 3 requires a period function p in order to show that we can verify unbounded traces. However, in this example there is no clear p. If we consider the mid-level cycle which occurs once per operand pair, we see that an output is only produced by the vector multiplier when the Last input is true. This means that the number of actions taken by the output components is data dependent. To avoid this problem we use the outermost cycle (Last true to Last true) with a fixed number of operand pairs to define the period. However, this also fails to produce a well defined function p, as we do not know the number of operand pairs which will be supplied to the multiplier between Last's. Thus, we can show that
Figure 5.1: Bit Slice of the MAS Block (from [SS93])
for a particular $n$, the multiplier will function if $n$ input pairs are provided before the output is produced. Since the initial and final states satisfy the same $G$, we can also assert that the system works for sequences of a particular $n$. Table 5.1 below shows verification times for small $n$'s. The rapid increase verification times with the number of pairs is a result of the OBDD representations of the accumulated sum. A OBDD's size increases considerably when its value depends on the interaction of many variables, and the most significant bits of the accumulated sum depend in a complicated fashion on all of the bits of each operand pair.

<table>
<thead>
<tr>
<th>Device</th>
<th>STE Time (in s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Fifo (7 stages)</td>
<td>3</td>
</tr>
<tr>
<td>Adder (4 bit)</td>
<td>4</td>
</tr>
<tr>
<td>Vector Multiplier (1 pair)</td>
<td>14</td>
</tr>
<tr>
<td>Vector Multiplier (2 pairs)</td>
<td>22</td>
</tr>
<tr>
<td>Vector Multiplier (3 pairs)</td>
<td>602</td>
</tr>
<tr>
<td>Vector Multiplier (4 pairs) stopped after 340 min</td>
<td></td>
</tr>
</tbody>
</table>

Table 5.1: STE Verification Times

Initially, we had hoped to use the middle level (once per serial operand) cycles to define $p$, and then to describe the possible behaviours of holding the sum in carry save form or exporting it after conversion in a complicated local formula $L$. The difficulty arises due to the fact that the multiplier may or may not produce the result, which means that we need to include the component which outputs the results, but it may undergo 2 or 0 transitions depending on the value of the input Last.

This example had the most complicated periodic global formula $G$. Because of rings contained in the circuit, not all components return to an empty state. Correctly determining $G$ was the most time consuming aspect of verifying the vector
multiplier, although this included finding a suitable initial state for the system. Part of the difficulty was caused by the existence of periodic global formulas which do not correctly initialise the system (and result in incorrect function).

Part of the difficulty in describing the global formula comes from the restrictions for the trajectory formulas used in STE. Disjunction is not easily expressed in STE, and as a result the global formulas have to be described exactly as a function of the inputs and initial state. This requires a precise understanding of the circuit and its cycles. For example, for the vector multiplier the data values wait in different points of the circuit depending on the value of the \textit{Last} variables associated with the last two operand pairs. Including disjunction would allow the global formula to be a general predicate which includes all of the possible final states, although it still would not handle the fact that the output behaviour depends on the input data.
Chapter 6

Conclusions

This thesis presents an approach to verifying speed-independent circuits that exploits the observation that the sequence of operations performed by each component of the circuit is independent of the order in which the operations are performed. The specifications have three components: a local formula, a period function, and a periodic global formula. The local formula and period function capture the desired input/output behaviour of the circuit and are usually simple and intuitive to write. The periodic global formula describes a set of states that are equivalent to the initial state with respect to the handshaking protocol. Typically, the global formula is not an "interesting" part of the specification, but it is needed for efficient verification. While it may be complicated and depend on user knowledge, the knowledge which is required is a detailed understanding of the circuit's function. So, while the system is not automatic, it requires information which the system designer is likely to have (as opposed to requiring a deep understanding of formal logic).

We have implemented the verification techniques described in this thesis using a combination of model checking and symbolic trajectory evaluation. Model
checking is used to show that the design is speed-independent. These tests are performed separately for small sub-circuits in the design to avoid the state-space explosion problem. Symbolic trajectory evaluation is used to verify the local formulas and the periodicity of the global formula. This are done using a deterministic circuit model that allows large designs to be handled efficiently. Although symbolic trajectory evaluation was used for the given examples, the framework of local and global formulas provides a general basis for applying verification methods designed for deterministic (i.e. synchronous) models to speed-independent circuits. Using the Voss Prover (STE combined with a specialised theorem prover) would significantly increase the size of circuit which can be verified by this method.

As examples of this technique, several speed-independent data-path designs have been verified, including a simple FIFO, two designs for an adder, and a vector multiplier chip.

6.1 Future Work

The inability of this theoretical framework to completely verify the vector multiplier (that is, verifying correct behaviour with an arbitrary number of input pairs before producing an output) points out where shortcomings could be addressed. We believe that the theory can be extended to allow multiple global formulas \( \{G_1, G_2, \ldots, G_n\} \) with local formulas describing the input and output sequences which move the system from one to the other. This creates an abstract finite state machine where the arcs are verified local formulas describing transitions from \( G_i \) to \( G_j \). With this added structure the vector multiplier can be thought of as having 2 recurring states: the empty state after producing an output; and the state where it has a partial sum stored and is waiting for a new input. Figure 6.1 represents the corresponding
abstract state machine.

Figure 6.1: Vector Multiplier: Abstract State Machine

A simple theorem prover could be used to manipulate results and record proof obligations. If the check for speed-independence and trajectory evaluation were included as tools for showing base-level facts, results could be combined using specialised rules in a manner analogous to the Voss Prover [HS95]. A further extension would be a formal framework in which subsystems of the circuit could be verified independently. For the vector multiplier this would allow us to verify the multiplication in the MAS block one bit at a time (as in [AS95]), permitting the verification of systems with much larger data-paths.

This research could also be extended into other models for asynchronous circuits. Delay-insensitive circuits, which allow both components and wires to have arbitrary delays, are a subcategory of speed-independent circuits, and can already be verified using this method. More general classes of asynchronous circuits would require more structure, but any hazard-free circuit should be possible to verify using a synchronous model. For example, hazard-free timed circuits are still deterministic. One possibility would be to show that they are a refinement of a speed-
independent specification using model checking [LGS94b], or the separation of events in time [HB+94], before verifying synchronously.

Finally, some circuits are not speed-independent at their lowest level, but have a deterministic model to which they should correspond. An example is the Counter-Flow Pipeline Processor [SSM94], which contains arbiters and is therefore not speed-independent. There are two approaches which may allow us to work with these circuits. Arbiters could be modelled by speed-independent components with a choice variable supplied by the environment that determines the direction of the grant in the event of two requests. Correct behaviour can then be shown regardless of the choice variables supplied, which shows correct function given any permitted behaviour by the arbiter. The other approach would be to observe invariants in system behaviour which allow a formal equivalence to be shown between the circuit and a deterministic model which could be verified using STE. The Counter-Flow Pipeline Processor might be accessible using this method. Invariants could be shown based on the pipeline rules [SSM94] which show that each instruction executes in an environment where it appears that all previous instructions have executes. However, this approach would require sophisticated users to develop specialised invariants for each circuit.
Bibliography


Appendix A

Proofs

The actions of simultaneously enabled components commute without changing the values of local formulas. This fact is the keystone of many of the theorems in this paper, and is formalised by the following lemma.

Lemma 1 (Commuting Actions) Given an eligible circuit \((V, R, C, E, Q_0)\), a weakly fair trace \(\alpha\), and a natural number \(i\) such that \(\alpha^{i+2}\) exists, let \(c\) and \(d\) be the components such that \((\alpha^i, \alpha^{i+1}) \in R_c\), \((\alpha^{i+1}, \alpha^{i+2}) \in R_d\), and \(d\) is enabled in state \(\alpha^i\). If \(\beta\) is defined to be the sequence of states such that

\[
\beta^k = \alpha^k \quad \text{if } k \neq i + 1
\]

\[
v(\beta^{i+1}) = v(\alpha^{i+2}) \quad \text{if } v \in O_d
\]

\[
v(\beta^{i+1}) = v(\alpha^i) \quad \text{if } v \notin O_d
\]

then \(\beta\) is a weakly fair trace, and \(\alpha \equiv \beta\).

Proof: To provide structure for this proof it has been divided into sections. Here is an outline of the proof obligations and their structure.

1. First, we must show that \(\beta\) is a trace. This requires two steps.
• We need to show that $(\beta^i, \beta^{i+1}) \in R_d$. There are two cases.
  
  - $d$ has no output choice
  
  - $d$ has output choice

• Next we show that $(\beta^{i+1}, \beta^{i+2}) \in R_c$.

2. We have to show that $\alpha \cong \beta$ by showing that $v^k_e(\alpha) = v^k_e(\beta)$ for any component $e$ with stable inputs. This is broken up into two cases.

  • $e \in C \cup E - \{c, d\}$
  
  • $e \notin C \cup E - \{c, d\}$

3. Finally, we show that $\beta$ is weakly fair

The proof will follow this outline.

The following is a review of some facts which will be used repeatedly and without further justification throughout this proof. First observe that if $d = c$ then $\alpha = \beta$, and therefore $\alpha \cong \beta$ and $\beta$ is a weakly fair trace. Thus, we will assume that $d \neq c$ for the rest of the proof. Since $\alpha$ is a trace, we know that $\text{reachable}(\alpha^i)$ is true. By the definition of $\beta$, for all $k \neq i+1$ we know that $\beta^k = \alpha^k$, and from now on we will write $\alpha^i$ and $\alpha^{i+2}$ for $\beta^i$ and $\beta^{i+2}$. Finally, since $d$ is enabled in state $\alpha^i$, we know that there exists at least one transition $(\alpha^i, s) \in R_d$. We will use $s$ to denote the final state of this transition.

Showing that $\beta$ is a trace

 Showing that $(\alpha^i, \beta^{i+1}) \in R_d$
Case 1: $d$ has no output choice

By speed-independence and the fact that $(\alpha^i, \alpha^{i+1}) \in R_c$, we can see that there exists an element of $R_d$ which starts at $\alpha^{i+1}$ and produces the same outputs as the transition $(\alpha^i, s)$. That is,

$$\exists s' \in S. (\alpha^{i+1}, s') \in R_d \land (\forall v \in O_d. v(s) = v(s'))$$

The absence of output choice implies

$$\forall (s, s^1), (s, s^2) \in R_d. (s^1 = s^2)$$

In particular, $\alpha^{i+2} = s'$, implying that

$$\forall v \in O_d. v(s) = v(\alpha^{i+2})$$

which, by the definition of $\beta$ shows that $\beta^{i+1} = s$, and therefore $(\alpha^i, \beta^{i+1}) \in R_d$.

Case 2: $d$ has output choice

Here we know that $d$ has stable inputs. Since $(\alpha^i, \alpha^{i+1}) \in R_c$, this yields

$$\forall v \in I_d \cup O_d. v(\alpha^i) = v(\alpha^{i+1})$$

Because $d$'s inputs and outputs are the same in both $\alpha^i$ and $\alpha^{i+1}$, $d$ must be able to perform the same actions.

$$\exists s' \in S. ((\alpha^i, s') \in R_d \land \forall v \in O_d. v(s') = v(\alpha^{i+2}))$$

Because $(\alpha^i, s') \in R_d$, we know that

$$\forall v \not\in O_d. v(s') = v(\alpha^i)$$

and can see that $s' = \beta^{i+1}$, and therefore $(\beta^i, \beta^{i+1}) \in R_d$. 

64
Showing that $(\beta^{i+1}, \alpha^{i+2}) \in R_c$

By the definition of speed-independence and the fact that $c$ was enabled in $\alpha^i$, we know that

$$\exists s' \in S. ( (\beta^{i+1}, s') \in R_c) \wedge (\forall v \in O_c. v(s') = v(\alpha^{i+2}))$$

Since only outputs of component $c$ can change in a transition in $R_c$, we can see that the $s'$ described above is equal to $\alpha^{i+2}$ and therefore $(\beta^{i+1}, \alpha^{i+2}) \in R_c$.

Showing that $\alpha \equiv \beta$

We will show that $v^k_e(\alpha) = v^k_e(\beta)$ for any component $e$ with stable inputs.

Case 1: $e \notin \{c, d\}$

Note that for all numbers $k$ where $k \neq i + 1$ we have $\alpha^k = \beta^k$, and that the index of $e$ is never $i + 1$. This implies

$$\forall e \in C \cup E - \{c, d\}, \forall v \in I_e \cup O_e, \forall k \in N. \text{occur}(e, \alpha, k) = \text{occur}(e, \beta, k)$$

and therefore

$$\forall e \in C \cup E - \{c, d\}, \forall k \in N.$$ valid($v^k_e, \alpha$) = valid($v^k_e, \beta$) $\wedge$ valid($v^k_e, \alpha$) $\Rightarrow$ ($v^k_e(\alpha) = v^k_e(\beta)$)

which is the definition of $\alpha \equiv \beta$.

Case 2: $e \in \{c, d\}$

Now suppose that $e = c$ and that $c$ has stable inputs. If $c$ does not have stable inputs, then $v^k_c$ cannot appear in a local formula and we are done.
Let
\[ n_c = \text{occur}(c, \alpha, i) = \text{occur}(c, \beta, i) \]

Considering the definition of \( \beta \), we can see that
\[
\forall k \in N. k \neq n_c + 1 \Rightarrow \text{index}(c, \alpha, k) = \text{index}(c, \beta, k) \\
\wedge \text{index}(c, \alpha, n_c + 1) = i + 1 \tag{A.1} \\
\wedge \text{index}(c, \beta, n_c + 1) = i + 2
\]
and therefore
\[
\forall k \in N. \forall v \in O_c. \ (k \neq n_c + 1) \wedge \text{valid}(v^k_c, \alpha) \Rightarrow (v^{\text{index}(c, \alpha, k)} = v^{\text{index}(c, \beta, k)})
\]

Now suppose that \( k = n_c + 1 \). Recall from the discussion in section 2.2 that assertions about the outputs of a component after its \( k^{th} \) action are written as \( v(t^{k+1}_c) \) (or equivalently \( v(t^k_c.post) \)), while assertions about the inputs to a component at the time of its \( k^{th} \) action must be written as \( v(t^k_c) \). This means that
\[
\forall v \in O_c. v^{k+1}_c(\alpha) = v^{\text{index}(c, \alpha, n_c+2)-1} = v^{\text{index}(c, \alpha, n_c+1)} = v^{\alpha^{i+1}} \\
\forall v \in O_c. v^{k+1}_c(\beta) = v^{\text{index}(c, \beta, n_c+2)-1} = v^{\text{index}(c, \beta, n_c+1)} = v^{\beta^{i+2}}
\]
but by the definition of \( \beta \) and the fact that \( d \neq c \) we know that
\[
\forall v \in O_c. (v^{\alpha^{i+1}} = v^{\beta^{i+2}})
\]
which shows that for \( v \in O_c \) we have \( v^k_c(\alpha) = v^k_c(\beta) \). The final case is \( v \in I_c \), where we can see that
\[
\forall v \in I_c. v^k_c(\alpha) = v^{\text{index}(c, \alpha, n_c+1)-1} = v^{\alpha^i} \\
\forall v \in I_c. v^k_c(\beta) = v^{\text{index}(c, \beta, n_c+1)-1} = v^{\beta^i+1}
\]
Since we assumed that \( c \) has stable inputs and we know that \( c \) was enabled in state \( \alpha^i \), we know that

\[
\forall v \in I_c. v(\beta^i) = v(\beta^{i+1})
\]

and that therefore for \( v \in I_c \) we have \( v_c^k(\alpha) = v_c^k(\beta) \). The argument for \( e = d \) is similar, and can be created from the above argument by changing equation A.1 to read

\[
\forall k \in N. k \neq n_d + 1 \Rightarrow index_d(\alpha, k) = index_d(\beta, k)
\]

\[
\land \quad \text{index}_d(\alpha, n_d + 1) = i + 2
\]

\[
\land \quad \text{index}_d(\beta, n_d + 1) = i + 1
\]

This shows that \( \alpha \equiv \beta \).

**Arguing that \( \beta \) is weakly fair**

The argument proceeds by contradiction. Consider a component \( e \). Suppose that \( \beta \) is not weakly fair, and that there is an \( k \) such that \( e \) is enabled in \( \beta^k \) and that \( e \) remains enabled for all \( j \geq k \). Pick a number \( n > k \) such that \( n > i + 2 \). We know that \( e \) is enabled in \( \beta \) for all \( j > n \). However, for all \( j > i + 2 \) we know that \( \alpha^j = \beta^j \), and so \( e \) is enabled in \( \alpha \) for all \( j \geq n \) as well.

This is a contradiction because we know that \( \alpha \) is weakly fair.

This concludes the proof.

\( \square \)

The next lemma uses repeated commutation operations to move an action an arbitrary number of steps.

**Lemma 2 (Splicing Step)** Given an eligible circuit \((V, R, C, E, Q_0)\), If \( t \) and \( u \) are weakly fair traces and \( m \) is a natural number such that \( t^m \) exists, splice\((t, u, m - \)
1) \( \cong u \), and \( \text{splice}(t, u, m - 1) \cong_E t \), then \( \text{splice}(t, u, m) \cong \text{splice}(t, u, m - 1) \) and \( \text{splice}(t, u, m) \) is a weakly fair trace.

**Proof:** By applying definition 9 (splicing) and the fact that \( t \) and \( u \) are weakly fair traces, we can see that \( \text{splice}(t, u, m) \) is the sequence of states

\[
\begin{align*}
\text{splice}(t, u, m)^j &= t^j & \text{if } j < m \\
v(\text{splice}(t, u, m)^j) &= v(t^k) & \text{if } (m \leq j < k) \land v \in O_c \\
v(\text{splice}(t, u, m)^j) &= v(u^j) & \text{if } (m \leq j < k) \land v \not\in O_c \\
\text{splice}(t, u, m)^j &= u^j & \text{if } k \leq j
\end{align*}
\]

where \( c \) is the component such that \( (t_{m-1}, t^m) \in R_c, n_c = \text{occur}(c, u, m - 1) \), and \( k = \text{index}(c, u, n_c + 1) \).

First, we assert that \( k \) exists. This is a consequence of the fact that \( u \) is weakly fair and \( c \) is enabled in state \( u^{m-1} \). The proof proceeds by induction on \( k \).

The base case is \( k = m \). Because the circuit is eligible, only components in \( E \) have output choice, so if \( c \not\in E \) then

\[
\forall s \in S. (t^{m-1}, s) \in R_c \Rightarrow s = t^m
\]

which combined with \( t^{m-1} = u^{m-1} \) gives us \( u^m = t^m \). If \( c \in E \), we know that \( t \cong_u u \).

By \( t^{m-1} = u^{m-1} \) and the definition of choice consistency, we can see that

\[
\begin{align*}
\forall v \in O_c. v(t^m) &= v(u^m) \\
\land \quad \forall v \not\in O_c. v(t^m) &= v(t^{m-1}) = v(u^{m-1}) = v(u^m)
\end{align*}
\]

which is enough to see that \( u^m = t^m \).

For the induction step, we know that \( \text{index}(c, u, n_c + 1) = k \), and that if we can create a weakly fair trace \( \alpha \) such that \( \alpha \cong \text{splice}(t, u, m) \) and \( \text{index}(c, \alpha, 1) = k - 1 \) then by the induction hypothesis we would have a proof of our claim. However,
by speed-independence we know that that component $c$ is enabled in all states $t^{m-1}, \ldots, t^{k-1}$, which allows us to apply lemma 1 and exchange the actions of $c$ and the previous component. This creates the required trace $\alpha$.

\[ \square \]

**Theorem 1 (Splicing)** Given an eligible circuit $(V, R, C, E, Q_0)$, let $t$ and $u$ be weakly fair traces with $m$ a natural number such that $t^m$ exists. If $t^0 = u^0$ and $t \sqsubseteq_E u$, then $\text{splice}(t, u, m)$ is a weakly fair trace and $\text{splice}(t, u, m) \equiv u$.

**Proof:** The proof proceeds by induction. The base case is $m = 0$ which is discharged by observing that $u = \text{splice}(t, u, 0)$. For the induction step we have $\text{splice}(t, u, m - 1) \sqsubseteq_E t$, $\text{splice}(t, u, m - 1)$ is a weakly fair trace, and $u \equiv \text{splice}(t, u, m - 1)$. Since $t$ and $\text{splice}(t, u, m - 1)$ satisfy the conditions of lemma 2 we can see that $\text{splice}(t, u, m)$ is a weakly fair trace and that $\text{splice}(t, u, m) \equiv \text{splice}(t, u, m - 1)$. Since $\text{splice}(t, u, m - 1) \sqsubseteq_E t$ and $\text{splice}(t, u, m - 1) \equiv u$, this gives $\text{splice}(t, u, m) \sqsubseteq_E t$ and $\text{splice}(t, u, m) \equiv u$.

\[ \square \]

When discussing traces from $\text{traces}(R, G)$ and $\text{traces}(R^+, G)$ it is convenient to describe two traces $t \in \text{traces}(R, G)$ and $u \in \text{traces}(R^+, G)$ as **corresponding** if $t$ is the trace formed by concatenating the execution in some order the components active in all of the changes $(u^i, u^{i+1})$, and where the choices made by elements in $t$ are the same as those made by elements of $u$. To formally define corresponding traces, we define notation for describing **active components**, i.e. those components which jointly perform a transition in $u \in \text{traces}(R^+, G)$. Note that we use $|u|$ to denote the number of states in a trace $u$, and $|D|$ to denote the number of elements...
of a set $D$.

**Definition 15 (Active Components)** Let $(V, R, C, E, Q_0)$ be an eligible circuit, $u$ be trace in traces$(R^+, G)$, and $t$ be a trace in traces$(R, G)$. Let $D^i$ denote the set of components active in the transition from $u^{i-1}$ to $u^i$

$$D^i = \{ c \in C \mid \exists v \in O_c, v(u^{i-1}) \neq v(u^i) \}$$

and $m_i$ denote the number of state changes required in $t$ to allow all of the components in all of the previous sets $D^j$ to act (with $D^0$ defined to be the empty set)

$$m_i = \sum_{j=0}^{i-1} |D^j|$$

**Definition 16 (Corresponding Traces)** Let $(V, R, C, E, Q_0)$ be an eligible circuit. Two traces $t \in$ traces$(R, G)$ and $u \in$ traces$(R^+, G)$ correspond if every component which performs an action in the trace $t$ performs the same action as part of a combined transition in the trace $u$, and these actions in $t$ are clustered in subsequences corresponding to each single action in $u$. That is,

$$\forall i, j \in N. m_i \leq j < m_{i+1} \Rightarrow (\exists d \in D^{i+1}. (t^j, t^{j+1}) \in R_d)$$

where $D^i$ and $m_i$ are as described in definition 15 above, and the choices made by each environment component are the same

$$\forall e \in E. \forall i \in N. (\forall v \in I_e. valid(v^i_e, t) \land v^i_e(t) = v^i_e(u)) \Rightarrow (\forall v \in O_e. v^i_e(t) = v^i_e(u))$$

The proof of theorem 4 (combined actions) requires the following lemma.

**Lemma 3 (Corresponding Traces)** Given an eligible circuit $(V, R, C, E, Q_0)$ and two weakly fair sequences $t \in$ traces$(R, G)$ and $u \in$ traces$(R^+, G)$, if $u$ corresponds to $t$ then $t \preceq u$. 

70
Proof: Because every local formula $L$ can be represented by terms of the form $v_c^k$ and the operators $\rightarrow$ and $\land$, if every term of the form $v_c^k$ has the same validity and value in $t$ and $u$ then, by induction on the structure of $L$, the lemma holds.

Let $v_c^k$ be a term of $L$, where $c$ is a component with stable inputs. We will show by induction on the index $i$ of the $k^{th}$ action of $c$ in $u$ that $v_c^k$ has the same validity and value in $u$ as in $t$. The induction hypothesis will be

$$u^i = t^m_i$$

$$\land \forall c \in C \cup E.$$  

$$(\text{occur}(c, u, i) = \text{occur}(c, t, m_i)) \land (v_c^{\text{index}(c, u, i)}(u) = v_c^{\text{index}(c, u, i)}(t))$$

Base: $i = 0$

This can be discharged by observing that $m_0 = 0$, $t^0 = u^0$, for every component $c$, $\text{occur}(c, u, 0) = \text{occur}(c, t, m_0) = 0$, and that every local formula $v_c^0$ is invalid in both $t$ and $u$.

Step: from $i$ to $i + 1$

From the induction hypothesis we know that $u^i = t^m_i$, and for any component $c$ we have $\text{occur}(c, u, i) = \text{occur}(c, t, m_i)$. Consider a component $d$ in $D^{i+1}$. If $d$ has no output choice, then because it is speed-independent and enabled in $t^m_i$ it must produce the same values on its output no matter what other components in $D^{i+1}$ have acted before it. If $d$ has output choice, it must be in the environment and thus have stable inputs. Because it is enabled in $t^m_i$ none of its inputs may be changed by the action of any other component, and because environment choices in $t$ and $u$ are defined to be the same it must produce the same output given the same inputs. The combination of the above facts shows that

$$\forall d \in D^i. \text{stable}(d) \Rightarrow (\forall v \in I_d \cup O_d. v_d^{\text{occur}(c, t, m_i) + 1}(t) = v_d^{\text{occur}(c, t, m_i) + 1}(u))$$

71
and \( u^{i+1} = t^{m_{i+1}} \). Since every element in \( D^i \) has acted once in both the trace \( u^{i-1}, \ldots, u^i \) and the trace \( t^{m_i}, \ldots, t^{m_{i+1}} \) and no other components have acted, we can see that for any component \( c \) we have \( \text{occur}(c, u, i + 1) = \text{occur}(c, t, m_{i+1}) \).

This proves the claim.

\( \Box \)

**Theorem 4 (Combined Actions)** Given an eligible circuit \( (V, R, C, E, Q_0) \), for all local formulas \( L \)

\[
(R, Q_0) \models L \iff (R^+, Q_0) \models L
\]

**Proof:** The proof will be broken down into two parts, one for each direction of the implication. First, we will show that for any \( L \)

\[
(R, Q_0) \models L \implies (R^+, Q_0) \models L
\]

Let \( u \) be any trace in \( \text{traces}(R^+, Q_0) \). We will create a weakly fair trace \( t \) in \( \text{traces}(R, Q_0) \) such that \( t \) and \( u \) correspond. Because we know that \( (R, Q_0) \models L \) we know that \( L \) is valid and holds in \( t \), which by correspondence gives us that \( L \) is valid and holds in \( u \). Since this is true for any \( u \) in \( \text{traces}(R^+, Q_0) \), this gives \( (R^+, Q_0) \models L \). Using \( m_i \) as described in definition 16, the trace \( t \) is created inductively using the induction hypothesis

\[
(u^i = t^{m_i}) \land \forall c \in C \cup E. \text{occur}(c, u, i) = \text{occur}(c, t, m_i)
\]

We start by setting \( t^0 = u^0 \), which then satisfies the base case of \( i = 0 \) because for all \( c \) we know that \( \text{occur}(c, u, 0) = \text{occur}(c, t, 0) = 0 \). The induction step proceeds by extending \( t \) by the actions of the components of \( D^i \) (again, as described in...
definition 16) in any order, and by having any components with output choice make the same choices in $t$ as they did in $u$. By its construction, $t$ corresponds with $u$.

To show that

$$(R^+, Q_0) \models L \Rightarrow (R, Q_0) \models L$$

let $t$ be any trace in $\text{traces}(R, Q_0)$. We will create traces $u$ in $\text{traces}(R^+, Q_0)$ and $\alpha$ in $\text{traces}(R, Q_0)$ such that $\alpha$ corresponds to $u$, and $t \equiv \alpha$. Because $(R, Q_0) \models L$ we know that $L$ is valid and holds in all traces $u$ in $\text{traces}(R^+, Q_0)$, and from the fact that $u$ corresponds to $\alpha$ and $\alpha \equiv t$, we can conclude that $L$ is valid and holds in $t$. Since $t$ and $L$ were arbitrary, we will have proven our claim.

We construct $u$ and $\alpha$ inductively in parallel, with the induction hypothesis that the subtraces $u^0, \ldots, u^i$ and $\alpha^0, \ldots, \alpha^{m_i}$ correspond and that

$$\forall c \in C \cup E. \forall v \in I_c \cup O_c. \forall k \in N. \text{valid}(v^k_c, \alpha) \Rightarrow (v^k_c(t) = v^k_c(\alpha)) \quad (A.3)$$

Start by setting $u^0 = t^0 = \alpha^0$. This trivially gives us the base case. For the induction step, let $D^i$ be the set of components enabled in $t^{m_i-1}$. Since $t$ is weakly fair, these actions are all performed at some point in $t$. Extend $\alpha$ by appending after $\alpha^{m_i-1}$ the actions of the elements of $D^i$, acting in any order, and requiring that if they have output choice they produce the same outputs in $\alpha$ as in $t$. This is possible because they are all performed in $t$ and enabled in $t^{m_i-1}$. Now let $u^i = \alpha^{m_i}$. Because $\alpha$ was formed from $t$ using commuting operations equation A.3 holds, and by construction $\alpha^0, \ldots, \alpha^{m_i}$ corresponds to $u^0, u^1$. In this manner we can create $\alpha$ and $u$ such that $u$ corresponds to $\alpha$ and $\alpha \equiv t$, which proves our claim.
Appendix B

Source Code

This appendix contains the source code for the files used in the verification of the FIFO, the two adders, and the vector multiplier which is described in the body of the thesis. Also included is the Synchronised Transitions (ST) code used to describe the components for the model checking to show that they are speed-independent. Except for the three files describing the FIFO cell, source, and sink, all files are contained in this appendix. An inventory of the files involved is as follows:

Behavioural VHDL building blocks for the FIFO and the two adders

These are all speed-independent components using dual-rail codes for their data.

source.vbe This describes a data source. The VHDL code is contained in Chapter 4.
sink.vbe This describes a data sink. The VHDL code is contained in Chapter 4.
fifocell.vbe This describes a stage in a FIFO chain. The VHDL code is contained in Chapter 4.
maj3.vbe This describes a component for computing the majority of three values.
vjoin.vbe This describes a C-element.
xor3.vbe This describes a component for computing the exclusive or of two values.
Structural VHDL describing complete FIFO chain and adders

- **fifochain.vst** This describes the structure of the fifo chain.
- **add4.vst** This describes the structure of the buffered 4 bit adder.
- **fadd4.vst** This describes the structure of the unbuffered 4 bit adder.

FL code describing verification of FIFO and the two adders

- **aspec.fifochain.fl** This code verifies the behaviour of the fifo chain.
- **aspec.add4.fl** This code verifies the behaviour of the buffered 4 bit adder.
- **aspec.fadd4.fl** This code verifies the behaviour of the unbuffered 4 bit adder.

Synchronised Transitions code describing FIFO and adder components

- **mc.sti** This contains ST code describing the components which are implemented in behavioural VHDL.

Behavioural VHDL building blocks for the vector multiplier

These are all speed-independent components using dual-rail codes for their data. They use a slightly different handshaking protocol, so there is some repetition from the components developed for the FIFO and adders.

- **src.vbe** This describes a data source which provides a single bit of data and then stops.
- **srcn.vbe** This describes a data source which can produce multiple bits of independent data before stopping.
- **sink.vbe** This describes a data sink.
- **alatch.vbe** This describes an asynchronous latch.
- **aswitch.vbe** This describes an asymmetric switch.
- **fadder.vbe** This describes a full adder.
- **fand.vbe** This describes an AND gate.
- **fork.vbe** This describes a fork element.
- **mux.vbe** This describes a multiplexing component.
- **fl.vbe** This describes a component for computing a special function for the control portion of the vector multiplier.
f2.vbe This describes a component for computing a special function for the control portion of the vector multiplier.

**Structural VHDL describing the building blocks of the vector multiplier**

pisobit.vst This describes a bit slice of the PISO (parallel in, serial out) component used in the control block of the vector multiplier. The VHDL code is contained in this appendix.

piso4.vst This describes a 4 bit PISO component used in the control block of the vector multiplier.

control.vst This describes the complete control block for the vector multiplier.

masbit.vst This describes a bit-slice of the MAS block in the vector multiplier.

masblock.vst This describes an 8 bit MAS block for the vector multiplier.

vm.vst This describes the complete vector multiplier (without data sinks and sources).

testvm.vst This describes the addition of sinks and sources to the vector multiplier to allow verification.

**FL code used in the verification of the vector multiplier**

complib.fl This contains FL code for making assertions about the basic multiplier circuit components.

blocklib.fl This contains FL code for making assertions about the major blocks of the vector multiplier.

aspec.vm.fl This contains FL code to verify the behaviour of the vector multiplier.

**Synchronised Transitions code describing vector multiplier components**

mc.sti This contains ST code describing the components which are implemented in behavioural VHDL.
ENTITY maj3 IS
PORT (
    aTB : IN BIT;
    aFB : IN BIT;
    DTB : IN BIT;
    DFB : IN BIT;
    cTB : IN BIT;
    cFB : IN BIT;
    outTB : INOUT BIT;
    outFB : INOUT BIT;
    v : OUT BIT;
    sv : IN BIT
); END maj3;

ARCHITECTURE data_flow OF maj3 IS
SIGNAL gout : REG_BIT REGISTER;
SIGNAL gnot : REG_BIT REGISTER;
BEGIN
    -- Transition for passing emtpies
    passE : BLOCK ( aTB AND aFB AND DTB AND DFB AND cTB AND cFB AND sv )
    BEGIN
        gout <= GUARDED '0';
        gnot <= GUARDED '0';
    END BLOCK;
    -- Transition for passing values
    -- Can execute whenever two valid values agree
    passV : BLOCK ( ( NOT ( aTB OR DTB ) OR NOT ( aFB OR DFB )
                       OR NOT ( aTB OR cTB ) OR NOT ( aFB OR cFB )
                       OR NOT ( DTB OR cTB ) OR NOT ( DFB OR cFB )
                       AND NOT sv )
    BEGIN
        gout <= GUARDED ( aTB OR DFB ) AND ( aFB OR cFB ) AND ( bFB OR cFB )
        gnot <= GUARDED ( aTB OR DTB ) AND ( aFB OR cTB ) AND ( bTB OR cTB )
    END BLOCK;
    outTB <= NOT gout;
    outFB <= NOT gnot;
    v <= gout OR gnot;
END data_flow;

ARCHITECTURE data_flow OF vjoin IS
SIGNAL cont : REG_BIT REGISTER;
BEGIN
    -- C element for joining valid codes
    cel : BLOCK ( NOT ( v0 XOR v1 ) )
    BEGIN
        cont <= GUARDED v0;
        vout <= cont;
    END BLOCK;
END data_flow;
-- This is a behavioural description for an asynchronous 3 bit xor

ENTITY xor3 IS
PORT ( aTB : IN BIT; eFB : IN BIT; bTB : IN BIT; bFB : IN BIT; cTB : IN BIT; cFB : IN BIT; outTB : INPUT BIT; outFB : INPUT BIT; v : OUT BIT; sv : IN BIT );
END xor3;

ARCHITECTURE data_flow OF xor3 IS
  SIGNAL gOut : REG_BIT REGISTER;
  SIGNAL gOut : REG_BIT REGISTER;
BEGIN
  -- Transition for passing empty
  passE : BLOCK ( aTB AND aFB AND bTB AND bFB AND cTB AND cFB AND sv )
BEGIN
    gOut <= GUARDED '0';
    gOut <= GUARDED '0';
END BLOCK;
  -- Transition for passing values
  passV : BLOCK ( NOT ( aTB AND aFB ) OR ( bTB AND bFB ) OR ( cTB AND cFB ) ) AND NOT sv
BEGIN
    gOut <= GUARDED ( NOT ( aTB XOR bTB XOR cTB ));
    gOut <= GUARDED ( NOT ( aFB XOR bFB XOR cFB ));
END BLOCK;
  outTB <= NOT gOut;
  outFB <= NOT gOut;
  v <= gOut OR gOut;
END data_flow;

-- VHDL structural description of a complete chain of fifo cells

ENTITY fifo IS
PORT ( data : IN BIT );
END fifo;

ARCHITECTURE structural_view OF fifo IS
COMPONENT fifoCell
PORT ( intb : IN BIT; intb : IN BIT; outb : OUT BIT; outb : OUT BIT; v : OUT BIT; sv : IN BIT );
END COMPONENT;

COMPONENT source
PORT ( sv : IN BIT; outb : OUT BIT; outb : OUT BIT; data : IN BIT );
END COMPONENT;

COMPONENT sink
PORT ( intb : IN BIT; intb : IN BIT; sv : IN BIT );
END COMPONENT;

SIGNAL TB0 : BIT;
SIGNAL FB0 : BIT;
SIGNAL TB1 : BIT;
SIGNAL FB1 : BIT;
SIGNAL v1 : BIT;
SIGNAL TB2 : BIT;
SIGNAL FB2 : BIT;
SIGNAL v2 : BIT;
SIGNAL TB3 : BIT;
SIGNAL FB3 : BIT;
SIGNAL v3 : BIT;
SIGNAL TB4 : BIT;
SIGNAL FB4 : BIT;
SIGNAL v4 : BIT;
SIGNAL TB5 : BIT;
SIGNAL FB5 : BIT;
SIGNAL v5 : BIT;
SIGNAL TB6 : BIT;
SIGNAL FB6 : BIT;
SIGNAL v6 : BIT;
SIGNAL TB7 : BIT;
SIGNAL FB7 : BIT;
SIGNAL v7 : BIT;
SIGNAL vout : BIT;
-- VHDL structural description of a complete 4 bit async adder

-- ENTITY add4 IS  
PORT(  
cinStr : IN BIT;  
a0Str  : IN BIT;  
b0Str  : IN BIT;  
a1Str  : IN BIT;  
b1Str  : IN BIT;  
a2Str  : IN BIT;  
b2Str  : IN BIT;  
a3Str  : IN BIT;  
b3Str  : IN BIT;  
);  
END add4;  
ARCHITECTURE structural_view OF add4 IS  
COMPONENT fifo4call  
PORT (  
inTB    : IN BIT;  
inFB    : IN BIT;  
outTB   : OUT BIT;  
outFB   : OUT BIT;  
v      : OUT BIT;  
sv      : IN BIT;  
);  
END COMPONENT;  
COMPONENT xor3  
PORT (  
sb     : IN BIT;  
sFB    : IN BIT;  
bf      : IN BIT;  
bFB     : IN BIT;  
cFB     : IN BIT;  
outTB   : INOUT BIT;  
outFB   : INOUT BIT;  
v      : OUT BIT;  
sv      : IN BIT;  
);  
END COMPONENT;  
COMPONENT maj4  
PORT (  
sb     : IN BIT;  
sFB    : IN BIT;  
bF      : IN BIT;  
bFB     : IN BIT;  
cF      : IN BIT;  
outTB   : INOUT BIT;  
outFB   : INOUT BIT;  
v      : OUT BIT;  
sv      : IN BIT;  
);  
END COMPONENT;  
COMPONENT vjoin  
PORT (  
v0     : IN BIT;  
v1     : IN BIT;  
vout   : INOUT BIT;  
);  
END COMPONENT;  
COMPONENT source  
PORT (  
v      : IN BIT;  
outTB  : OUT BIT;  
outFB  : OUT BIT;  
data   : IN BIT;  
);  
END COMPONENT;  
COMPONENT sink  
PORT (  
inTB    : IN BIT;  
inFB    : IN BIT;  
v      : INBIT;  
);  
END COMPONENT;  
-- variables are labelled xxx<layer><bit><signal>,  
-- as in v21b = valid layer 2 bit 3 signal b (sum is blank)  
SIGNAL v00, v01a, v01b, v02a, v02b, v03a, v03b, v000, v001 : BIT;  
SIGNAL v20, v21, v22a, v22b, v23a, v23b, v222, v223 : BIT;  
SIGNAL v30, v31, v32a, v32b, v311, v311, v312, v313 : BIT;  
SIGNAL v40, v41, v42, v43, v44 : BIT;  
SIGNAL CT800, CF800 : BIT;  
SIGNAL CT81, CF81 : BIT;  
SIGNAL CT82, CF82 : BIT;  
SIGNAL CT83, CF83 : BIT;  
SIGNAL CT84, CF84 : BIT;  
SIGNAL AT800, AF800, DT800, DF800, AT801, AF801, DT801, DF801 : BIT;  
SIGNAL AT802, AF802, DT802, DF802, AT803, AF803, DT803, DF803 : BIT;  
SIGNAL AT810, AF810, AT811, AF811, DT812, DF812, AT813, AF813, DT813, DF813 : BIT;  
SIGNAL AT820, AF820, AT821, AF821 : BIT;  
SIGNAL AT822, AF822, DT822, DF822, AT823, AF823, DT823, DF823 : BIT;  
SIGNAL AT830, AF830, AT831, AF831 : BIT;  
SIGNAL AT832, AT832, AT833, AF833, DT833, DF833 : BIT;  
SIGNAL AT840, AF840, AT841, AF841 : BIT;  
SIGNAL AT842, AT842, AT843, AF843 : BIT;  
BEGIN  
-- input nodes  
srccq : source  
PORT MAP (  
v  => v00,  
outTB => CT800,  
outFB => CF800,  
);
bfb  => rfb13,
cfb  => rfb13,
cfb  => rfb33,
outb  => rfb41,
outfb  => rfb43,
v  => v233,
sv  => v43
);

major : major
PORT MAP (
sb  => sb13,
sfb  => sb13,
dfb  => sb13,
bfb  => sb13,
cfb  => sb13,
outb  => sb41,
outfb  => sb43,
v  => v233,
sv  => v43
);

-- sink elements

sink0 : sink
PORT MAP ( 
inb  => sb40,
infb  => sb40,
v  => v40
);

sink1 : sink
PORT MAP ( 
inb  => sb41,
infb  => sb41,
v  => v41
);

sink2 : sink
PORT MAP ( 
inb  => sb42,
infb  => sb42,
v  => v42
);

sink3 : sink
PORT MAP ( 
inb  => sb43,
infb  => sb43,
v  => v43
);

sink4 : sink
PORT MAP ( 
inb  => sb44,
infb  => sb44,
v  => v44
);

END structural_view;

-- sid: fadd4.vat,v 1.1 1995/12/08 21:01:35 with Exp with $
-- VHDL structural description of a 'flat' 4 bit asyn adder

ENTITY fadd4 IS
PORT(
inStr  : IN BIT;
s0Str  : IN BIT;
s1Str  : IN BIT;
s2Str  : IN BIT;
s3Str  : IN BIT;
outStr : OUT BIT;
outFB  : OUT BIT;
v  : OUT BIT;
sv  : IN BIT
);

END fadd4;

ARCHITECTURE structural_view OF fadd4 IS

COMPONENT major
PORT ( 
sb  : IN BIT;
sfb  : IN BIT;
dfb  : IN BIT;
bfb  : IN BIT;
cfb  : IN BIT;
outb  : INOUT BIT;
outfb  : INOUT BIT;
v  : OUT BIT;
sv  : IN BIT
);

END COMPONENT;

COMPONENT sink
PORT ( 
inb  : IN BIT;
infb  : IN BIT;
v  : OUT BIT;
sv  : IN BIT
);

END COMPONENT;

COMPONENT vjjoin
PORT ( 
v0  : IN BIT;
v1  : IN BIT;
vo  : INOUT BIT
);

END COMPONENT;

COMPONENT source
PORT ( 
sv  : IN BIT;
outb  : OUT BIT;
outfb  : OUT BIT;

END structural_view;
// $Id: aspec.fifochain.fl,v 1.4 1995/09/20 23:26:01 weih Exp weih $  
// Calculating next state f' for the fifo chain  
// Load verification library and arithmetic library  
load "verifil.fl";  
load "arithmetic.fl";  
load "util.fl";  
// Load fl file for description  
load "fifochain.fl";  
// Handy constants  
let CELLS = 7;  
let END = 100;  
let INT = 1;  
let SNK_IND_LEN = 4;  
letrec listof 0 b = []  
  | listof n b = ( b , ( listof ( n - 1) b) );  
// Input variable  
let inp = variable "inp";  
// Functions to initialise sinks and sources  
let init_sink name n =  
  let ind = enumerate ( name "index" ) n  
  in  
  ( ind , inp ( listof n F ) ) from 0 to 2)  
  ( ( name "gate" ) is F ) from 0 to 2);  
let init_src datan calls inv =  
  ( ( datan is F ) from 0 to END )  
  ( ( calls "guard" ) is F ) from 0 to 2)  
  ( ( calls "out" ) is F ) from 0 to 2);  
// This function takes a list of strings naming the fifo calls and  
// initializes them to variables by the same name  
letrec init_fcalls [] s e = []  
  | init_fcalls ( h : r ) s e =  
    ( ( h "out" ) is T from s to e )  
    ( ( h "gfout" ) is T from s to e )  
  ( init_fcalls r s e );  
let fifo_list = [ enumerate "fifo" 1 ];  
let ant = [ init_src "data" "src" inp ]  
  ( init_sink "sink" SNK_IND_LEN )  
  ( init_fcalls fifo_list 0 2 );  
let cons = let mm = END - 1 in let n = END  
  in  
  ( mm ) is inp from mm to n  
  ( init_fcalls fifo_list mm n );  
let tr_list = flat ( nodes FIFCHAIN );  
let verify = STE "FIFCHAIN () ant cons []";  
let plt = STE "-T plot ps" FIFCHAIN [ ant cons  
  ( map ( node , node , 0.500 ) tr_list );
let ex = excitation FIFOCHAIN;
let fo = fanout FIFOCHAIN;
let fl = fanin FIFOCHAIN;
verify;

// S16: spec/add4.fl.w 1.5 1993/09/20 23:30:11 weih Exp weih $  
// Verification script for the async 4 bit adder  
// Load verification library and arithmetic library  
load 'verification.fl';  
load 'arithm.fl';  
load 'HighLowBit.fl';  
load 'util.fl';  
// Load fl file for description  
load 'add4.fl';  

// Handy constants  
let END = 100;  
let SRC_LEN = 4;  
let SNK_IND_LEN = 4;  

letrec listof 0 b = []  
    / listof n b = [b | listof (n-1) b];  

let cin = variable "cin";  
let a0in = variable "a0in";  
let b0in = variable "b0in";  
let alin = variable "alin";  
let blin = variable "blin";  
let a2in = variable "a2in";  
let b2in = variable "b2in";  
let a3in = variable "a3in";  
let b3in = variable "b3in";  

//let cin = P;  
//let a0in = P;  
//let b0in = P;  
//let alin = P;  
//let blin = P;  
//let a2in = P;  
//let b2in = P;  
//let a3in = P;  
//let blin = P;  

// Functions to initialize sinks and sources  
let init_sink name n =  
    let ind = enumerate (name"index["]) n  
        in  
    ((ind isv (listof n P)) from 0 to 2)  
    ((name"gev") is P ) from 0 to 2 );  

let init_src datan inv =  
    ( datan is inv ) from 0 to END )  
    (( cellin"index" ) is F ) from 0 to 2 )  
    (( cellin"padded" ) is F ) from 0 to 2 )  
    (( cellin"coutput" ) is F ) from 0 to 2 );  

let init_sinks =  
    (init_sink 'sink0' SNK_IND_LEN)  
    (init_sink 'sink1' SNK_IND_LEN)  
    (init_sink 'sink2' SNK_IND_LEN)  
    (init_sink 'sink3' SNK_IND_LEN)  
    (init_src 'sink4' SNK_IND_LEN),  

let init_sources =  
    (init_src 'clinst' recsrc cin)  
    (init_src 'distr' recsrc a0in)  
    (init_src 'blastr' recsrc b0in)  
    (init_src 'alistr' recsrc alin)  
    (init_src 'cinst' recsrc recsrc)  

(init_arc "blx" "srcbl" blin) #
(init_arc "a1a" "srcbl" alin) #
(init_arc "bls" "srcbl" blin) #
(init_arc "bls" "srcbl" alin) #
(init_arc "brs" "srcbl" blin) #
(init_arc "brs" "srcbl" alin) #

// This function takes a list of strings naming the fifo cells and
// asserts that they are all empty from time s to time e
letrec int_fcells [] s e \#
"(h'gout') is P from s to e \#
"(h'gout') is P from s to e \#
"(int_fcells r s e)\#
let fifo_list = ['fifo01a', 'fifo02b', 'fifo02b', 'fifo02a',
'fifo03b', 'fifo03a', 'fifo03b', 'fifo03b',
'fifo03a', 'fifo03a', 'fifo02b', 'fifo02a',
'fifo01a', 'fifo01a', 'fifo02b', 'fifo02a']

// Functions to initialize the different units in the adder
// Fifo cells, sinks and sources are taken care of above, so I need
// to deal with: maj3, vjoin, and xor3 units
letrec int_maj3 [] s e [\#
"(int_maj3) is P from s to e \#
"(int_maj3 r s e)\#
letrec int_vjoin [\#
"(int_vjoin r s e)\#
letrec int_xor3 [] s e \#
"(int_xor3 r s e)\#
let maj_list = ['maj0', 'maj1', 'maj2', 'maj3']
let vj_list = ['vjoin0v', 'vjoin0v', 'vjoin0v', 'vjoin0v']
let xor_list = ['xor0', 'xor1', 'xor2', 'xor3']
let ant = init_sinks @ init_sources @
(int_fcells fifo_list 0 2) @ (int_maj3 maj_list 0 2) @
(int_vjoin vj_list 0 2) @ (int_xor3 xor_list 0 2)

// Function for describing the consequent
let conf n =
let cap = [F, alin, alin, alin]
let bob = [F, blin, blin, blin, blin]
let opc = [F, F, F, F, cin]
in
("snk", "int3ex", "data0") is (al (n+1) sum) from (END-1) to END ;
// Consequent says that the sum was correct and that the add has returned to
// an empty state
let cons =
let s = END-1 in let e = END in
(conf0) @ (conf1) @ (conf2) @ (conf3) @ (conf4) @
(int_fcells fifo_list s e) @ (int_maj3 maj_list s e) @
(int_vjoin vj_list s e) @ (int_xor3 xor_list s e)
let tr_list =
['snkdata0', 'snkdata0', 'snkdata0', 'snkdata0', 'snkdata0', 'snkdata0', 'snkdata0', 'snkdata0', 'snkdata0']

let pity = STE "F plot.ps" ADD4 [] ant cons
(map (node (nodes (nodes ADD4))) tr_list);
let vi = [('cin', cin), ('alin', alin), ('bolin', blin),
('alin', alin), ('blin', blin), ('alin', alin), ('bolin', blin),
('alin', alin), ('blin', blin)];
let ex = excitations ADD4;
let fo = fanouts ADD4;
let fl = fanin ADD4;
let bag = mem 'cinstr[]' (flat (nodes ADD4));
let works = mem 'cinstr[]' (flat (nodes ADD4));
let verify = STE "ADD4 [] ant cons [];"
(init_src `blatr" `srcbl` blin) 0
(init_src `alatr" `srcal` alin) 0
(init_src `blatr` `srcbl` blin) 0
(init_src `alatr` `srcal` alin) 0
(init_src `blatr" `srcbl" blin) 0

// Functions to initialise the different units in the adder
// If the cells, sinks and sources are taken care of above, so I need
// to deal with: msaj, vjoin, and xor units

letrec init_maj [] s e = []
  | init_maj (hr:s) s e = (h'gtout) is F from s to e) 0
  | init_maj r s e = 0

letrec init_vjoin [] s e = []
  | init_vjoin (hr:s) s e = (h'gtout) is F from s to e) 0
  | init_vjoin r s e = 0

letrec init_xor3 [] s e = []
  | init_xor3 (hr:s) s e = (h'gtout) is F from s to e) 0
  | init_xor3 r s e = 0

let maj_list = ["msaj", "maj1", "maj2", "maj3"];
lst vj_list = [vjoin0", vjoin1", vjoin2", vjoin3"];
lst xor_list = [xor0", xor1", xor2", xor3"];

let ant = init_sinks @ init_sources @
  (init_maj maj_list 0 2) @
  (init_vjoin vj_list 0 2) @
  (init_xor xor_list 0 2);
let bug = mem 'cinstr3'; (flat (nodes FAD04));
let works = mem 'cinstr2'; (flat (nodes FAD04));
let verify = STF ** FAD04 [] ant cons [];

verify;

(* $Id: mc.ati,v 1.6 1996/04/22 19:07:44 weih Exp weih $ *)
(* these are the basic VH descriptions of the fifo and add cells *)

IMPLEMENTATION MODULE mc;
(* FROM fipspecial IMPORT FInput, FRefinement; *)

STATIC
xor3: BIT0 = BEGIN ( a XOR b XOR c ) END;

xor3: BIT0 = BEGIN ( ( a OR b ) AND ( b OR c ) AND ( a OR c ) ) END;

(* Implementation cells *)

fifoCell : c110Cell =
BEGIN
  < NOT ( isValid( data1 ) = valid ) >=
  gdata0.t, gdata0.f := data1.t, data1.f >=
END;

src : cSrcCell =
STATE
srcDone : BOOLEAN;
BEGIN
  < valid AND NOT sdone ->
  gdata0.t, gdata0.f, sdone := TRUE, TRUE, TRUE >=
END;

sink : cSinkCell =
STATE
sinkRegister : DEC;
sinkRegisterDone : BOOLEAN := FALSE;
BEGIN
  < NOT isValid( data1 ) -> gvalid := FALSE >=
  < NOT sdone AND isValid( data1 ) ->
  register.t, register.f, sdone, gvalid := data1.t, data1.f, TRUE, TRUE >=
END;

xor3: c110Cell =
BEGIN
  < isValid( data1 ) AND isValid( data2 ) AND isValid( data3 ) AND NOT svalid >=
  gdata0.t, gdata0.f :=
  xor3: ( data1.t, data1.f, data2.t, data2.f ) .
  xor3: ( data1.f, data2.f, data3.t, data3.f ) >=
END;

fsm : f3110Cell =
BEGIN
  < NOT ( data10.t OR data11.t ) OR NOT ( data10.f OR data11.f ) >=
END;
gdate0.t, gdate0.f :=
maxj8( data10.t, data11.t, data12.t ),
maxj8( data10.f, data11.f, data12.f ) =>
*)
<< (isValid(data10) AND isValid(data11) AND isValid(data12))
AND NOT gvalid >>
gdate0.t, gdate0.f :=
maxj8( data10.t, data11.t, data12.t ),
maxj8( data10.f, data11.f, data12.f ) =>
<< (isValid(data10) AND isValid(data11) AND isValid(data12))
(* AND NOT gvalid *) AND isValid(gdate0) =>
gvalid := TRUE >>
<< NOT ( isValid(data10) OR isValid(data11) OR isValid(data12) )
AND gvalid => gdate0.t, gdate0.f, gvalid := TRUE, TRUE, FALSE >>
(*
<< data0.t, data0.f, valid := gdate0.t, gdate0.f, gvalid >>
<< data0.t, data0.f := gdate0.t, gdate0.f >>
<< valid := gvalid >>
END;

field := fieldCell1 =
BEGIN
(* C-element *)
<< NOT ( vx XOR vm ) => vx0th := vx >>
<< xor3( data10, data11, data12, vx, data00, valid0, gdate00 )
masj3( data10, data11, data12, vx, data01, valid1, gdate01, gvalid )
END;

END.
ENTITY src IS
  PORT (
    ack : IN BIT;
    out : OUT BIT;
    data : IN BIT;
    init : IN BIT;
    rep : IN BIT;
  );
END src;

ARCHITECTURE data_flow OF src IS
  SIGNAL gData : REG_VECTOR (7 DOWNTO 0) REGISTER;
  SIGNAL index : REG_VECTOR (3 DOWNTO 0) REGISTER;
BEGIN
  init : BLOCK ( init )
  BEGIN
    gData <= GUARDED gData;
    END BLOCK;
  -- Transition to supply empties when next cell has data
  clr : BLOCK ( ack )
  BEGIN
    gOut <= GUARDED "0";
    gOut <= GUARDED "0";
    END BLOCK clr;
  -- Stuff for dealing with supplying data
  dref0 : BLOCK ( | rep OR NOT done | AND NOT ack AND NOT init )
  BEGIN
    gOut <= GUARDED data;
    gOut <= GUARDED NOT data;
    done <= GUARDED '1';
  END BLOCK;
  -- Counter goes to 0 and then stops supplying data
  dref1 : BLOCK ( NOT (ack OR init OR gOut OR gOut ) ) AND NOT index(3)
  AND NOT index(2) AND NOT index(1) AND index(0) 
  BEGIN
    gOut <= GUARDED gData(0);
    gOut <= GUARDED NOT gData(0);
    index ( 3 DOWNTO 0 ) <= GUARDED '0000';
    END BLOCK;
    -- index = '0010'
  dref2 : BLOCK ( NOT (ack OR init OR gOut OR gOut ) ) AND NOT index(3)
  AND NOT index(2) AND NOT index(1) AND NOT index(0) 
  BEGIN
    gOut <= GUARDED gData(1);
    gOut <= GUARDED NOT gData(1);
    index ( 3 DOWNTO 0 ) <= GUARDED '0001';
    END BLOCK;
    -- index = '0011'
  dref3 : BLOCK ( NOT (ack OR init OR gOut OR gOut ) ) AND NOT index(3)
  AND NOT index(2) AND index(1) AND index(0) 
  BEGIN

entity latch is
port (int: in bit; inf: in bit; outT: in/out bit; outf: in/out bit; ackI: out bit; init: in bit)
end latch;
architecture data_flow of latch is
signal gout: reg bit register;
signal gTout: reg bit register;
begi
-- transition for passing empties
empty: block (not (int or inf) and acki and not init)
begin
  gTout <= guarded '0';
gout <= '0';
end block;

-- transition for passing data
tran: block (not acki and (int or inf) and not init)
begin
  gTout <= guarded int;
gout <= guarded inf;
end block tran;
outT <= gTout;
outF <= gout;
ackI <= gTout or gout;
end data_flow;

-- gTout <= guarded gData(1);
gout <= guarded not gData(2);
index(3 downto 0) <= guarded '0110';
end block;

-- index = '0100'
dref4: block (not (ack or init or gTout or gout)) and not index(3)
and index(2) and not index(1) and not index(0)
begin
  gTout <= guarded gData(3);
gout <= guarded not gData(3);
index(3 downto 0) <= guarded '0011';
end block;

-- index = '0101'
dref5: block (not (ack or init or gTout or gout)) and not index(3)
and index(2) and not index(1) and not index(0)
begin
  gTout <= guarded gData(4);
gout <= guarded not gData(4);
index(3 downto 0) <= guarded '0100';
end block;

-- index = '0110'
dref6: block (not (ack or init or gTout or gout)) and not index(3)
and index(2) and not index(1) and not index(0)
begin
  gTout <= guarded gData(5);
gout <= guarded not gData(5);
index(3 downto 0) <= guarded '0110';
end block;

-- index = '0111'
dref7: block (not (ack or init or gTout or gout)) and not index(3)
and index(2) and index(1) and index(0)
begin
  gTout <= guarded gData(6);
gout <= guarded not gData(6);
index(3 downto 0) <= guarded '0111';
end block;

-- index = '1000'
dref8: block (not (ack or init or gTout or gout)) and index(3)
and not index(2) and not index(1) and not index(0)
begin
  gTout <= guarded gData(7);
gout <= guarded not gData(7);
index(3 downto 0) <= guarded '0111';
end block;

outT <= gTout;
outF <= gout;
end data_flow;
-- $Id: switch_vbe.v 1.6 1995/11/18 00:32:16 with Exp $
-- This is a behavioural description of the asymmetric asynchronous switch
-- for multi-ring circuits.
-- This passes A and B straight through when NOT C, and passes B out
-- A's output (ignoring A) when C

ENTITY switch IS
  PORT (
    st  : IN BIT;
    sf  : IN BIT;
    sB  : OUT BIT;
    bT  : IN BIT;
    BF  : IN BIT;
    ACKA : OUT BIT;
    hT  : IN BIT;
    CF  : IN BIT;
    Ackc : OUT BIT;
    soT : INOUT BIT;
    soF : INOUT BIT;
    AckAO : IN BIT;
    Bot : INOUT BIT;
    boF : INOUT BIT;
    AckBO : IN BIT;
  );
END switch;

ARCHITECTURE data_flow OF switch IS
  BEGIN
    SIGNAL gAt, gAf, gBoT, gBof, gackA, gackB, lastC : REG_BIT REGISTER;
    BEGIN
      -- pass empty (with lastC T)
      passT : BLOCK ( NOT ( bT OR BF ) AND NOT ( CT OR CF ) AND ACKAO AND lastC )
        BEGIN
          gAt  <= GUARDED '0';
          gAf  <= GUARDED '0';
          gackB <= GUARDED '0';
        END BLOCK;
      -- pass empty (with lastC F)
      passF : BLOCK ( NOT ( AT OR AF ) AND NOT ( bt OR BF ) AND NOT ( CT OR CF )
        AND ACKAO AND ackAO AND NOT lastC )
        BEGIN
          gAt  <= GUARDED '0';
          gAf  <= GUARDED '0';
          gackA <= GUARDED '0';
          gBoT <= GUARDED '0';
          gBof <= GUARDED '0';
          gackB <= GUARDED '0';
        END BLOCK;
      -- C True
      CT : BLOCK ( ( bT OR BF ) AND CT AND NOT AckAO )
        BEGIN
          gAt  <= GUARDED bT;
          gAf  <= GUARDED BF;
          gackA <= GUARDED '1';
          lastC <= GUARDED '1';
        END BLOCK;
      -- C False
ENTITY fork IS
PORT ( inf : IN BIT; ackin : OUT BIT; of : OUT BIT; ack01 : IN BIT; of : OUT BIT; o2f : OUT BIT; ack02 : IN BIT; )
END fork;
ARCHITECTURE data_flow OF fork IS
SIGNAL gackt : REG_BIT REGISTER;
BEGIN
oif <- inf;
o2f <- inf;
go <- BLOCK ( NOT ( ack01 XOR ack02 ) ) BEGIN
gackt <= GUARDED ack01;
END BLOCK go;
ackin <= gackt;
END data_flow;

ENTITY mux IS
PORT ( at : IN BIT; af : IN BIT; ackA : OUT BIT; of : IN BIT; acKB : OUT BIT; of : IN BIT; ackC : OUT BIT; outF : INOUT BIT; ackOut : IN BIT;
)
END mux;
ARCHITECTURE data_flow OF mux IS
SIGNAL got, goF, gackA, gackB, lastC : REG_BIT REGISTER;
BEGIN
-- pass empties (lastC T)
past : BLOCK ( NOT ( at OR af ) AND NOT ( ct OR CF ) AND ackOut AND lastC ) BEGIN
got <= GUARDED '0';
goF <= GUARDED '0';
gackA <= GUARDED '0';
END BLOCK;
-- pass empties (lastC F)
pastF : BLOCK ( NOT ( df OR of ) AND NOT ( ct OR CF ) AND ackOut AND NOT lastC ) BEGIN
got <= GUARDED '0';
goF <= GUARDED '0';
gackB <= GUARDED '0';
END BLOCK;
-- C True ct : BLOCK ( at OR af ) AND ct AND NOT ackOut ) BEGIN
got <= GUARDED at;
goF <= GUARDED af;
gackA <= GUARDED '1';
lastC <= GUARDED '1';
END BLOCK;
-- C False cf : BLOCK ( df OR of ) AND cf AND NOT ackOut ) BEGIN
got <= GUARDED df;
goF <= GUARDED of;
gackB <= GUARDED '0';
lastC <= GUARDED '0';
END BLOCK;
entity f1 is
port (  
in0 : in bit;
in1 : in bit;
ack0 : out bit;
ack1 : out bit;
in2 : in bit;
in3 : in bit;
ack0 : out bit;
ack1 : out bit;
ackcl : in bit  
);
end f1;
archtiecture data_flow of f1 is
begin
-- Transition for passing empty
empty : block ( not ( in0 or in1 or in2 or in3 )  
and ackcl )
begin
qcout <= guarded "0";
qcout <= guarded "0";
end block;
-- Transition for calculating data
in : block ( ( in0 or in1 ) and ( in2 or in3 )  
and not ackcl )
begin
qcout <= guarded ( ( not in0 ) and in1 )
or ( ( not in1 ) and ( not in2 ) );
qcout <= guarded ( not ( ( not in0 ) and in1 )  
or ( ( not in1 ) and ( not in2 ) ) );
end block;

clt <= qcout;
cif <= qcout;
ack0 <= clt or cif;
ack1 <= clt or cif;
ack2 <= clt or cif;
end data_flow;
ENTITY fadder IS
PORT (  
sT : IN BIT;  
sF : IN BIT;  
ackA : OUT BIT;  
bF : IN BIT;  
ackB : OUT BIT;  
t : IN BIT;  
cF : IN BIT;  
ackC : OUT BIT;  
ct : IN BIT;  
s : IN BIT;  
ckT : OUT BIT;  
coF : OUT BIT;  
ackCo : IN BIT  
);  
END fadder;
ARCHITECTURE data_flow OF fadder IS
SIGNAL gTout, gcTout, gStout, gSFout : REG_BIT REGISTER;
BEGIN
-- Transition for passing emties
empty : BLOCK ( NOT ( sT OR sF OR bt OR BF OR CT OR CF ) AND ackS AND ackCo) BEGIN
  gStout <= GUARED "0";
  gSFout <= GUARED "0";
  gcTout <= GUARED "0";
  gSFout <= GUARED "0";
END BLOCK;
-- Transition for calculating data
fn : BLOCK ( ( sT OR sF OR BT OR BF OR CT OR CF ) AND NOT ackS 
and NOT ackCo ) BEGIN
  gStout <= GUARED ( sT XOR BT XOR CT );
  gSFout <= GUARED ( sF XOR BF XOR CF );
  gcTout <= GUARED ( ( aT AND BT ) OR ( aT AND CT ) OR ( BT AND CT ) );
  gSFout <= GUARED ( ( aF AND BF ) OR ( aF AND CF ) OR ( BF AND CF ) );
END BLOCK;
  sT <= gStout;
  sF <= gSFout;
  ct <= gcTout;
  coF <= gSFout;
  ackA <= sT OR sF;
  ackB <= sT OR sF;
  ackC <= sT OR sF;
END_data_flow;
END fadder;

-- This is the behavioural description of an 'adder' functional unit for 
-- multi-ring circuits

ENTITY fadder IS
PORT (  
inT : IN BIT;  
inF : IN BIT;  
ackIn : OUT BIT;  
init : IN BIT;  
inF : IN BIT;  
ackIn : OUT BIT;  
outT : INOUT BIT;  
outF : INOUT BIT;  
ackOut : IN BIT  
);
END fadder;
ARCHITECTURE data_flow OF fadder IS
SIGNAL gTout : REG_BIT REGISTER;
SIGNAL gFout : REG_BIT REGISTER;
BEGIN
-- Transition for passing emties
empty : BLOCK ( NOT ( inT OR inF OR init OR inInit ) ) BEGIN
  gTout <= GUARED "0";
  gFout <= GUARED "0";
END BLOCK;
-- Transition for calculating data
fn : BLOCK ( ( inT OR inF ) AND ( init OR inInit ) ) BEGIN
  gTout <= GUARED ( inT AND init );
  gFout <= GUARED ( inF AND inInit ) OR ( inF AND init ) OR ( inF AND inInit )
END BLOCK;
  outT <= gTout;
  outF <= gFout;
  ackIn <= ackOut;
  ackIn <= ackOut;
END_data_flow;
-- This is the behavioural description of the functional unit for
-- the function f2 for the vector multiplier control block
-- (for multi-ring circuits)

ENTITY f2 IS
PORT :
in0 : IN BIT;
in1 : IN BIT;
ackin0 : OUT BIT;
in2 : IN BIT;
in3 : IN BIT;
ackin1 : OUT BIT;
in4 : IN BIT;
ackin2 : OUT BIT;
cit : INOUT BIT;
cif : INOUT BIT;
ackc4 : IN BIT;

END f2;

ARCHITECTURE data_flow OF f2 IS
BEGIN
-- Transition for passing emites
empty : BLOCK ( NOT ( in0 OR in1 OR in2 OR in3 OR in4 )
AND ackin2 )
BEGIN
gcTout <= GUARDED "0";
gcFout <= GUARDED "0";
END BLOCK;

-- Transition for calculating data
fn : BLOCK ( ( in0 OR in2 ) AND ( in1 OR in3 )
AND NOT ackin2 )
BEGIN
gcTout <= GUARDED NOT ( in0 AND in1 );
gcFout <= GUARDED ( in0 AND in1 );
END BLOCK;
cit <= gcTout;
cif <= gcFout;
ackin0 <= cit OR cif;
ackin1 <= cit OR cif;
ackin2 <= cit OR cif;
END data_flow;

-- $Id: picebit_vst.v 1.1 1996/06/07 19:54:49 weih Exp $
-- Description of a bit of the PISO structure
-- from paper on Wring circuits

ENTITY picebit IS
PORT (;
'1' data in
p1 : IN BIT;
p2 : IN BIT;
ackpl : OUT BIT;
-- Serial data in
sit : IN BIT;
si1 : IN BIT;
acksl : OUT BIT;
-- Serial data out
so1 : OUT BIT;
sof : OUT BIT;
ackso : IN BIT;
-- Control in
cit : IN BIT;
cif : IN BIT;
ackc4 : IN BIT;
-- Control out
cout : OUT BIT;
cof : OUT BIT;
ackoo : IN BIT;
-- Verification signals
init : IN BIT
);
END picebit;

ARCHITECTURE structural_view OF picebit IS
BEGIN
GENERIC :

END COMPONENT;

COMPONENT switch
PORT (;
st : IN BIT;
sf : IN BIT;
acka : OUT BIT;
bt : IN BIT;
bf : IN BIT;
ackb : OUT BIT;
cit : IN BIT;
cif : IN BIT;
ackc : OUT BIT;
sot : OUT BIT;
sof : OUT BIT;
ackao : IN BIT;
bof : OUT BIT;
ackbo : IN BIT
);
END switch;
END COMPONENT:

COMPONENT fork
PORT {
  int  : IN BIT;
  inf  : IN BIT;
  ackln : OUT BIT;
  cif  : OUT BIT;
  c1f  : OUT BIT;
  c1o1 : IN BIT;
  c1f : OUT BIT;
  c1f : OUT BIT;
  ack2 : IN BIT;
}

END COMPONENT;

COMPONENT rank
PORT {
  ackl  : OUT BIT;
  int  : IN BIT;
  inf  : IN BIT;
  init : IN BIT;
}

END COMPONENT;

SIGNAL S1T, S1F, acksl, S2T, S2F, acks2, Snk1, SnkF, ackSnk : BIT;
SIGNAL C1T, C1F, ackCI, CAF, ackCA : BIT;

BEGIN

swl : switch
PORT MAP {
  ST  => S1T,
  SF  => S1F,
  ackA => acksl,
  CT  => C1T,
  CF  => CAF,
  ackC => ackCA,
  soT  => S2T,
  soF  => S2F,
  ackAO => acks2,
  boT  => Snk1,
  boF  => SnkF,
  ackBO => ackSnk
};

cflk : fork
PORT MAP {
  int  => C1T,
  inf  => C1F,
  ackln => ackCI,
  c1T  => CAT,
  c1f  => CAF,
  ack01 => ackCA,
  o1T  => COT,
  o1F  => COF,
  ack02 => ackCo
};
snk0 : rank
PORT MAP {
  int  => Snk1,
  inf  => SnkF,
  ackl  => ackSnk,
  init => init
};

11 : slatch
PORT MAP {
  int  => S1T,
  inf  => S1F,
  ackl  => ackS1,
  c1T  => S1T,
  c1f  => S1F,
  ack0 => AckS2,
  init => init
};

12 : slatch
PORT MAP {
  int  => S2T,
  inf  => S2F,
  ackl  => ackS2,
  c1T  => S2T,
  c1f  => S2F,
  ack0 => AckS0,
  init => init
};

13 : slatch
PORT MAP {
  int  => C1T,
  inf  => C1F,
  ackl  => ackCI,
  c1T  => C1T,
  c1f  => C1F,
  ack0 => AckCI,
  init => init
};

END structural view;
package des description
d -- 1996/08/07 19:55:35 weih Exp $ 
-- Description of the control block of the vector multiplier
-- from paper on Mising circuits

ENTITY control IS
  PORT ( 
    last : IN BIT;
    lastf : IN BIT;
    acklast : OUT BIT;
    epf : IN BIT;
    ackep : OUT BIT;
    clot : OUT BIT;
    clof : OUT BIT;
    ackclo : IN BIT;
    initi : IN BIT;
    vds : IN BIT;
  );
END control;

ARCHITECTURE structural_view OF control IS
  COMPONENT alatch
    PORT ( 
      int : IN BIT;
      inp : IN BIT;
      outp : OUT BIT;
      outp : OUT BIT;
      ack : OUT BIT;
      init : IN BIT;
    );
  END COMPONENT;

  COMPONENT async
    PORT ( 
      at : IN BIT;
      af : IN BIT;
      acka : OUT BIT;
      b : IN BIT;
      b : IN BIT;
      ackb : OUT BIT;
      ct : IN BIT;
      c : IN BIT;
      ackc : OUT BIT;
      sot : OUT BIT;
      sof : OUT BIT;
      ackd : IN BIT;
      bot : OUT BIT;
      bto : OUT BIT;
      j : IN BIT;
    );
  END COMPONENT;

  COMPONENT src
    PORT ( 
      ack : IN BIT;
      out : OUT BIT;
      out : OUT BIT;
    );
  END COMPONENT;
END structural_view;

data : IN BIT;
init : IN BIT;
rep : IN BIT;

END COMPONENT;

COMPONENT reink
PORT
ack : OUT BIT;
inf : IN BIT;
init : IN BIT;

END COMPONENT;

COMPONENT fork
PORT
int : IN BIT;
in1 : IN BIT;
ack : OUT BIT;
o1 : OUT BIT;
ack1 : IN BIT;
o2 : OUT BIT;
ack2 : IN BIT;

END COMPONENT;

COMPONENT mux
PORT
st : IN BIT;
a : IN BIT;
ack : OUT BIT;
b : IN BIT;
out : OUT BIT;
ackout : IN BIT;

END COMPONENT;

COMPONENT f1
PORT
int : IN BIT;
in1 : IN BIT;
ack1 : OUT BIT;
in1 : IN BIT;
in1 : IN BIT;
ack1 : OUT BIT;
in2 : IN BIT;
in2 : IN BIT;
ack1 : IN BIT;

END COMPONENT;

COMPONENT f2
PORT
int : IN BIT;
in1 : IN BIT;
ack1 : OUT BIT;
in1 : IN BIT;
in1 : IN BIT;
in1 : OUT BIT;
ack1 : OUT BIT;
ack1 : OUT BIT;
ack1 : IN BIT;

END COMPONENT;

COMPONENT swi
PORT MAP (
st => int;
a => int1;
ack => ack1;
ob => out;
acko => out1;

END COMPONENT;

COMPONENT mux
PORT MAP
st => int;
a => int1;
ack => ack1;
ob => out;
acko => out1;

END COMPONENT;
```c
ackIn => ackIn0,
outT => outT0,
outF => outF0,
ack01 => ack01a,
outT => outT1,
outF => outF1,
ack02 => ack02b
}

inlfk : fork
PORT MAP
int => in1T,
inf => in1F,
ackin => ack1In,
outT => out1T,
outF => out1F,
ack101 => ack1In01,
outT => out1T0,
outF => out1F0,
ack102 => ack1In02
}

inlfk : fork
PORT MAP
int => In2T,
inf => In2F,
ackin => ack2In,
outT => out2T,
outF => out2F,
ack201 => ack2In01,
outT => out2T0,
outF => out2F0,
ack202 => ack2In02
}

srcb : src
PORT MAP
outT => src0T,
outF => src0F,
ack => acksrc0,
data => vas,
init => init,
rep => vds
}

snkb : sink
PORT MAP
int => snkT0,
inf => snkF0,
ack1 => acksnk1,
init => init
}

ll : alert
PORT MAP
int => ln0T,
inf => ln0F,
ack1 => ackln01,
outT => out1T,
outF => out1F,
ack0 => ackln01,
init => init
```
12: slatch
PORT MAP(
  inT => InitT,
  inf => InitF,
  ackI => ackIni1a,
  outF => InitF,
  ack0 => ackIni1,
  init => init)
)

13: slatch
PORT MAP(
  inT => InitT,
  inf => InitF,
  ackI => ackIni2,
  outF => InitF,
  ack0 => ackIni3,
  init => init)
)

14: slatch
PORT MAP(
  inT => InitT,
  inf => InitF,
  ackI => ackIni4,
  outF => InitF,
  ack0 => ackIni2,
  init => init)
)

15: slatch
PORT MAP(
  inT => ZpsT,
  inf => ZpsF,
  ackI => ackZp,
  outF => InitF,
  ack0 => ackIni2,
  init => init)
)

16: slatch
PORT MAP(
  inT => FpsT,
  inf => FpsF,
  ackI => ackFp1,
  outF => InitF,
  ack0 => ackFp2,
  init => init)
)

17: slatch
PORT MAP(
  inT => PpsT,
  inf => PpsF,
  ackI => ackPs2,
  outF => InitF,
  ack0 => ackPs2,
  init => init)
)
ACKI \rightarrow \text{ackStr}(0),
\text{outT} \rightarrow \text{In0T},
\text{outF} \rightarrow \text{In0F},
\text{ack0} \rightarrow \text{ackIn0},
\text{init} \rightarrow \text{init};

\text{ful : f1}
\text{PORT MAP (}
\text{in0T} \rightarrow \text{In0T},
\text{in0F} \rightarrow \text{In0F},
\text{ackIn0} \rightarrow \text{ackIn0},
\text{init} \rightarrow \text{init},
\text{in1F} \rightarrow \text{In1F},
\text{ackIn1} \rightarrow \text{ackIn1},
\text{init} \rightarrow \text{init},
\text{in2F} \rightarrow \text{In2F},
\text{ackIn2} \rightarrow \text{ackIn2},
\text{CIT} \rightarrow \text{Fit},
\text{CIF} \rightarrow \text{FIF},
\text{ackCl} \rightarrow \text{ackF1};

\text{ful : f3}
\text{PORT MAP (}
\text{in0T} \rightarrow \text{In0T},
\text{in0F} \rightarrow \text{In0F},
\text{ackIn0} \rightarrow \text{ackIn0},
\text{init} \rightarrow \text{init},
\text{in1F} \rightarrow \text{In1F},
\text{ackIn1} \rightarrow \text{ackIn1},
\text{init} \rightarrow \text{init},
\text{in2F} \rightarrow \text{In2F},
\text{ackIn2} \rightarrow \text{ackIn2},
\text{CIT} \rightarrow \text{Fit},
\text{CIF} \rightarrow \text{FIF},
\text{ackCl} \rightarrow \text{ackF2};
\)
\text{END structural_view;}

-- Description of a bit slice for the MAS block of the vector multiplier
-- from paper on WIRing circuits

ENTITY masbit IS
PORT (  
  -- F1 (out)
  FIT : OUT BIT;
  FIF : OUT BIT;
  ackF1 : IN BIT;
  -- F1-1 (in)
  PinT : IN BIT;
  PinF : IN BIT;
  ackFin : OUT BIT;
  -- Parallel op (in)
  PogT : IN BIT;
  PogF : IN BIT;
  ackPop : OUT BIT;
  -- Control 1 (in)
  ClIT : IN BIT;
  ClIF : IN BIT;
  ackCl : OUT BIT;
  -- Control 1 (out)
  ClOf : OUT BIT;
  ackClO : IN BIT;
  -- Control 2 (in)
  C2IT : IN BIT;
  C2IF : IN BIT;
  ackC2l : OUT BIT;
  -- Control 2 (out)
  ClOf : OUT BIT;
  C2Of : OUT BIT;
  ackC2O : IN BIT;
  -- Sum for iteration k (in)
  SkIT : IN BIT;
  SkIF : IN BIT;
  ackSki : OUT BIT;
  -- Sum for iteration k (out)
  S0kT : OUT BIT;
  S0kF : OUT BIT;
  ackSko : IN BIT;
  -- Carry 1 (out)
  CyIT : OUT BIT;
  CyIF : OUT BIT;
  ackCyI : IN BIT;
  -- Carry 1-1 (in)
  CyIT : IN BIT;
  CyITF : OUT BIT;
  ackCyIT : OUT BIT;
  -- Result 1 (out)
  ResT : OUT BIT;
  ResF : OUT BIT;
  ackRes : IN BIT;
  -- Verification signals
  init : IN BIT;
  -- Constants
  vdd : IN BIT;
  vss : IN BIT;
);
END masbit;
sof => PolF,
ackA0 => ackPolF,
boF => skIT,
boR => skITi.
ackB0 => ackSmk1
}

sw2 : switch
    PORT MAP (
        AT => srcOT,
        AF => srcOF,
        acRA => acksrc0,
        DF => srcT,
        BF => srcF,
        ackB => ackS2,
        cT => c2AT,
        CF => c2Af
    );
ackC => ackC2a,
afT => bit,
afF => bit,
ackA0 => ackB1,
boT => Rest,
boR => Resti.
ackB0 => ackBesi
)

clk : fork
    PORT MAP (
        int => C1IT,
        inf => C1IF,
        ackIN => ackClI,
        oIT => C1st,
        oIF => C1ft,
        ackO1 => ackC1s,
        o2T => C2ot,
        o2F => C2of,
        ackO2 => ackC2o
    );

clk2 : fork
    PORT MAP (
        int => C2IT,
        inf => C2IF,
        ackIN => ackC2I,
        oIT => C2st,
        oIF => C2ft,
        ackO1 => ackC2s,
        c2T => C2cT,
        c2F => C2cf,
        ackO2 => ackC2o
    );

sklk : fork
    PORT MAP (
        int => skIT,
        inf => skIF,
        ackIN => ackSki,
        oIT => skITi,
        oIF => skIF,
        ackO1 => ackSki,
        o2T => skOT,
        o2F => skOF,
        ackO2 => ackSk0
    );

poF : fork
    PORT MAP (
        int => PoIT,
        inf => PoIF,
        ackIN => ackPoA,
        oIT => PoAT,
        oIF => PoAf,
        ack01 => ackPoA1,
        o2T => Po2T,
        o2F => Po2F,
        ack02 => ackPo2b
    );

src : src
    PORT MAP (
        outT => srcOT,
        outF => srcOF,
        ack => acksrc0,
        data => vas,
        init => init,
        rep => vdd
    );

smk : sink
    PORT MAP (
        int => skIT,
        inf => skIF,
        ackI => ackSmk1,
        init => init
    );

and0 : fund
    PORT MAP (
        in0T => PoAT,
        in0F => PoAF,
        ackin0 => ackPoA0,
        init => sk2T,
        in1F => skOF,
        ackin1 => ackSk1,
        outT => a1F,
        outF => a1Fi,
        ackOut => ackA1
    );

dadder0 : fadder
    PORT MAP (
        at => a2F,
        af => a2F,
        ackA => ackA2,
        bt => bit,
        bf => bF,
        ackB => ackB1,
        ct => c2T,
        cf => c2F,
        ackC => ackC2Y,
        st => SI,
        sf => SIP,
        ackS => ackS1,
        cot => c2Yt
    );
END structural_view;

-- $Id: masblock.v 1 1996/04/07 19:56:20 with Exp $  
-- Module describing an 8 bit HAS block as per the M$ing paper

ENTITY masblock IS
  PORT ( 
    P0P : IN BIT_VECTOR (7 DOWNTO 0); 
    P0P : IN BIT_VECTOR (7 DOWNTO 0); 
    ackP0P : OUT BIT_VECTOR (7 DOWNTO 0); 
    C1IT : IN BIT; 
    C1IF : IN BIT; 
    ackC1I : OUT BIT; 
    C1IT : IN BIT; 
    C1IF : IN BIT; 
    ackC1I : OUT BIT; 
    SkIT : IN BIT; 
    SkIF : IN BIT; 
    ackSkI : OUT BIT; 
    Resp : OUT BIT_VECTOR (7 DOWNTO 0); 
    Resp : OUT BIT_VECTOR (7 DOWNTO 0); 
    ackResp : IN BIT_VECTOR (7 DOWNTO 0); 
    Init : IN BIT; 
    VDD : IN BIT; 
    VSS : IN BIT; 
  );
END masblock;

ARCHITECTURE structural_view OF masblock IS
  COMPONENT masbit
    PORT ( 
      P(i) (out) 
      P(i) : OUT BIT; 
      P(i) : IN BIT; 
      ackP(i) : IN BIT; 
      -- P(i) (in) 
      P(i) : IN BIT; 
      P(i) : IN BIT; 
      ackP(i) : OUT BIT; 
      -- Parallel op (in) 
      P0P : IN BIT; 
      P0P : IN BIT; 
      ackP0P : OUT BIT; 
      -- Control 1 (in) 
      C1IT : IN BIT; 
      C1IF : IN BIT; 
      ackC1I : OUT BIT; 
      -- Control 1 (out) 
      C1OT : OUT BIT; 
      C1OF : OUT BIT; 
      ackC1O : IN BIT; 
      -- Control 2 (in) 
      C2IT : IN BIT; 
      C2IF : IN BIT; 
      ackC2I : OUT BIT; 
      -- Control 2 (out) 
      C2OT : OUT BIT; 
      C2OF : OUT BIT; 
      ackC2O : IN BIT; 
      -- Sum for iteration k (in) 
      S1IT : IN BIT; 
      S1IF : IN BIT; 
      ackS1I : OUT BIT; 
      -- Sum for iteration k (out) 
    );
END masbit;
Sk0T : OUT BIT;
Sk0F : OUT BIT;
-- Carry i (out)
Cyi : OUT BIT;
CyiF : OUT BIT;
ackCyi : IN BIT;
-- Carry i-1 (in)
CyinF : IN BIT;
-- Result i (out)
ResT : OUT BIT;
ResF : OUT BIT;
ackRes : IN BIT;
-- Verification signals
init : IN BIT;
-- Constants
vdd : IN BIT;
vas : IN BIT;
END COMPONENT;

COMPONENT src
PORT {
ack : IN BIT;
outT : OUT BIT;
outF : OUT BIT;
data : IN BIT;
init : IN BIT;
}
END COMPONENT:

COMPONENT nsink
PORT {
acki : OUT BIT;
inT : IN BIT;
inF : IN BIT;
init : IN BIT;
}
END COMPONENT:

SIGNAL PiT, PiF, ackPi : BIT_VECTOR (8 DOWNTO 0);
SIGNAL fsrcT, fsrcF, ackFsrc : BIT_VECTOR (7 DOWNTO 4);
SIGNAL SiT, SiF, ackSi : BIT_VECTOR (7 DOWNTO 0);
SIGNAL S0T, S0F, ackS0 : BIT_VECTOR (7 DOWNTO 0);
SIGNAL PiSnkT, PiSnkF, ackPiSnk, CySnkT, CySnkF, ackCySnk : BIT;
BEGIN
-- Somewhere for the control signals and Sk to go after masbit0

snkC1 : sink
PORT MAP {
int => CiTi(0),
inF => C1Fi(0),
acki => ackC1i(0),
init => init
};

snkEk : sink
PORT MAP {
int => S0T(0),
inF => S0F(0),
acki => ackEk(0),
init => init
};

-- Supply of F's for the Pi and Cyi for masbit0

srcPi : src
PORT MAP {
outT => PiT(0),
outF => PiF(0),
ack => ackPi(0),
data => vas,
init => init,
rep => vdd
};

srcCyim : src
PORT MAP {
outT => Cyi(0),
outF => CyiF(0),
ack => ackCy(0),
data => vas,
init => init,
rep => vdd
};

-- Sinks for Pi and Cyi coming out of masbit7

snkPi : sink
PORT MAP {
int => PiSnkT,
inF => PiSnkF,
acki => ackPiSnk,
init => init
};

snkCyim : sink
PORT MAP {
int => Cyi(0),
inF => CyiF(0),
acki => ackCy(0),
init => init
};

-- Supply of F's for the Pops after 1/2 way point

srcPop4 : src
PORT MAP {
outT => fsrcT(4),
outF => fsrcF(4),
ENTITY vm IS
PORT (  
PcGT : IN BIT_VECTOR(3 DOWNTO 0);  
PcF : IN BIT_VECTOR(3 DOWNTO 0);  
ackPOP : OUT BIT_VECTOR(3 DOWNTO 0);  
ClGT : IN BIT;  
ClF : IN BIT;  
ackClGT : OUT BIT;  
ClIT : IN BIT;  
ClIF : IN BIT;  
ackClIF : OUT BIT;  
SkGT : IN BIT;  
SkF : IN BIT;  
ackSkF : OUT BIT;  
Rest : OUT BIT_VECTOR(7 DOWNTO 0);  
ResF : OUT BIT_VECTOR(7 DOWNTO 0);  
init : IN BIT;  
vdd : IN BIT;  
vss : IN BIT);  
END vm;
ARCHITECTURE structural_view OF vm IS
COMPONENT masblock
PORT (  
PcGT : IN BIT_VECTOR(3 DOWNTO 0);  
PcF : IN BIT_VECTOR(3 DOWNTO 0);  
ackPOP : OUT BIT_VECTOR(3 DOWNTO 0);  
ClGT : IN BIT;  
ClF : IN BIT;  
ackClGT : OUT BIT;  
ClIT : IN BIT;  
ClIF : IN BIT;  
ackClIF : OUT BIT;  
SkGT : IN BIT;  
SkF : IN BIT;  
ackSkF : OUT BIT;  
Rest : OUT BIT_VECTOR(7 DOWNTO 0);  
ResF : OUT BIT_VECTOR(7 DOWNTO 0);  
init : IN BIT;  
vdd : IN BIT;  
vss : IN BIT);  
END COMPONENT;
COMPONENT piso4
PORT (  
PIT : IN BIT_VECTOR(3 DOWNTO 0);  
PIF : IN BIT_VECTOR(3 DOWNTO 0);  
ackPIF : OUT BIT_VECTOR(3 DOWNTO 0);  
SoGT : OUT BIT;  
SoF : OUT BIT;  
ackSo : IN BIT;  
ClIT : IN BIT;  
ClIF : IN BIT;  
ackClIF : OUT BIT;  
init : IN BIT;

-- Std: testvm.vat.v 1.2 1995/12/0 00:09:58 with Exp 8
-- Test module for the vector multiplier

ENTITY testvm IS
PORT ( -- Pop : IN BIT_VECTOR (3 DOWNTO 0);
PopF : IN BIT_VECTOR (3 DOWNTO 0);
ackPop : OUT BIT_VECTOR (3 DOWNTO 0);
SopF : IN BIT_VECTOR (3 DOWNTO 0);
ackSop : OUT BIT_VECTOR (3 DOWNTO 0);
ZpF : IN BIT_VECTOR (3 DOWNTO 0);
ackzp : OUT BIT_VECTOR (3 DOWNTO 0);
Last : IN BIT;
LastF : OUT BIT;
ackLast : OUT BIT;
Rest : OUT BIT_VECTOR (7 DOWNTO 0);
ResF : OUT BIT_VECTOR (7 DOWNTO 0);
ackRes : IN BIT_VECTOR (7 DOWNTO 0);
init : IN BIT;
VDD : IN BIT;
VSS : IN BIT); -- ENd Testvm;

ARCHITECTURE structural_view OF testvm IS
COMPONENT va
PORT ( -- Pop : IN BIT_VECTOR (3 DOWNTO 0);
PopF : IN BIT_VECTOR (3 DOWNTO 0);
ackPop : OUT BIT_VECTOR (3 DOWNTO 0);
SopF : IN BIT_VECTOR (3 DOWNTO 0);
ackSop : OUT BIT_VECTOR (3 DOWNTO 0);
ZpF : IN BIT_VECTOR (3 DOWNTO 0);
ackzp : OUT BIT_VECTOR (3 DOWNTO 0);
Last : IN BIT;
LastF : IN BIT;
ackLast : OUT BIT;
Rest : OUT BIT_VECTOR (7 DOWNTO 0);
ResF : OUT BIT_VECTOR (7 DOWNTO 0);
ackRes : IN BIT_VECTOR (7 DOWNTO 0);
init : IN BIT;
VDD : IN BIT;
VSS : IN BIT); -- End Component;

COMPONENT src
PORT ( -- ack : IN BIT;
out : OUT BIT;
outF : OUT BIT;
init : IN BIT); -- End Component;

COMPONENT sink
PORT ( -- ack : OUT BIT;
int : IN BIT;
inF : IN BIT;
it : IN BIT); -- End Component;

SIGNAL Pop, PopF, ackPop : BIT_VECTOR (3 DOWNTO 0);
SIGNAL Sop, SopF, ackSop : BIT_VECTOR (3 DOWNTO 0);
SIGNAL Rest, ResF, ackRes : BIT_VECTOR (7 DOWNTO 0);
SIGNAL ZpF, ZpF, ackzp : BIT_VECTOR (3 DOWNTO 0);
SIGNAL Last, LastF, ackLast : BIT;
BEGIN

srcLast : srcn
PORT MAP (  
outT => LastT, 
outF => LastF, 
ack => ackLast, 
init => init  
);

src0 : srcn
PORT MAP (  
outT => LpT[0], 
outF => LpT[0], 
ack => ackLp[0], 
init => init  
);

src1 : srcn
PORT MAP (  
outT => LpT[1], 
outF => LpT[1], 
ack => ackLp[1], 
init => init  
);

src2 : srcn
PORT MAP (  
outT => LpT[2], 
outF => LpT[2], 
ack => ackLp[2], 
init => init  
);

src3 : srcn
PORT MAP (  
outT => LpT[3], 
outF => LpT[3], 
ack => ackLp[3], 
init => init  
);

src0 : srcn
PORT MAP (  
outT => SopT(0), 
outF => SopF(0), 
ack => ackSop(0), 
init => init  
);

src1 : srcn
PORT MAP (  
outT => SopT(1), 
outF => SopF(1), 
ack => ackSop(1), 
init => init  
);

src2 : srcn
PORT MAP (  
outT => SopT(2), 
outF => SopF(2), 
ack => ackSop(2), 
init => init  
);

src3 : srcn
PORT MAP (  
outT => SopT(3), 
outF => SopF(3), 
ack => ackSop(3), 
init => init  
);

sink0 : sink
PORT MAP (  
int => Rest[0], 
inF => Rest[0], 
ackI => ackRes[0], 
init => init  
);

sink1 : sink
PORT MAP (  
int => Rest[1], 
inF => Rest[1], 
ackI => ackRes[1], 
init => init  
);
ackLast => ackLast.
ResT => ResT.
ResF => ResF.
ackRes => ackRes.
init => init.
odd => odd.
vss => vss.

END structural_view;

sinkReq2 : sink
PORT MAP (  
  int => ReqT(2),
  inF => ReqF(2),
  ackI => ackRes(2),
  init => init  
);

sinkRes3 : sink
PORT MAP (  
  int => ReqT(3),
  inF => ReqF(3),
  ackI => ackRes(3),
  init => init  
);

sinkReq4 : sink
PORT MAP (  
  int => ReqT(4),
  inF => ReqF(4),
  ackI => ackRes(4),
  init => init  
);

sinkRes5 : sink
PORT MAP (  
  int => ReqT(5),
  inF => ReqF(5),
  ackI => ackRes(5),
  init => init  
);

sinkReq6 : sink
PORT MAP (  
  int => ReqT(6),
  inF => ReqF(6),
  ackI => ackRes(6),
  init => init  
);

sinkRes7 : sink
PORT MAP (  
  int => ReqT(7),
  inF => ReqF(7),
  ackI => ackRes(7),
  init => init  
);

vra0 : vra
PORT MAP (  
  CogT => CogT,
  CogF => CogF,
  ackCog => ackCog,
  StepT => StepT,
  StepF => StepF,
  ackStep => ackStep,
  EpT => EpT,
  EpF => EpF,
  ackEp => ackEp,
  LastT => LastT,
  LastF => LastF.
/* File: complib.f 1.1 1996/04/07 21:17:09 with Exp $ 
/* This is code for making assertions for the basic multiring circuit 
/* Components 
let rec vlist n = [] 
/\ vlist n = (a->(int2str (n-1))++)*vlist n (n-1));

/* async latch 
let set_alatch tr fr tl t2 tname = 
{ (name="gtout") is tr from tl to t2 } 0 
{ (name="gout") is fr from tl to t2 } 0 

/* asymmetric and symmetric switches 
let set_switch at af bt bf lastc lastasc t1 t2 sname = 
{ (name="gact") is at from tl to t2 } 0 
{ (name="gact") is af from tl to t2 } 0 
{ (name="gbct") is bt from tl to t2 } 0 
{ (name="gact") is fr from tl to t2 } 0 
{ (name="gact") is (at OR af) AND (NOT lastc) ) from tl to t2 } 0 
{ (name="gact") is (at OR af) AND (NOT lastc) ) from tl to t2 } 0 

/* adder functional unit 
let set_adder at af at cf tl t2 tname = 
{ (name="gadd") is at from tl to t2 } 0 
{ (name="gadd") is af from tl to t2 } 0 
{ (name="gadd") is cf from tl to t2 } 0 

/* and functional unit 
let set_and ot of t1 t2 tname = 
{ (name="gand") is ot from tl to t2 } 0 
{ (name="gand") is of from tl to t2 } 0 

/* NOTE: the ret and regf don't need to be set to avoid X's, so 
/* for sinks which are not important. initialization can be reduced to 
/* (single bit data sinks 
let set_sinki rt rf acki done t1 t2 sname = 
{ (name="reg") is rt from tl to t2 } 0 
{ (name="reg") is rf from tl to t2 } 0 
{ (name="done") is done from tl to t2 } 0 
{ (name="acki") is acki from tl to t2 } 0 

/* Sink which repeats and doesn't store values 
let set_sinki acki t1 t2 sname = ( (name="acki") is acki from tl to t2 ) 0 

/* data sources 
let set_src gt gf data done t1 t2 sname = 
{ (name="gtout") is gt from tl to t2 } 0 
{ (name="gdata") is data from tl to t2 } 0 
{ (name="done") is done from tl to t2 } 0 

/* Automatically sets index to length of data and appends F's to data 
let set_src_data t1 t2 sname = 
let di = length data 
in 
{ (name="gtout") is F from tl to t2 } 0 
{ (name="gdata") is F from tl to t2 } 0 
{ (name="done") is F from tl to t2 } 0 

/* forks 
let set_fork ack tl t2 tname = 
{ (name="acki") is ack from tl to t2 } 0 

/* input i output functional units (F1 and F2 in vector multiplier) 
let set_f1l ot of t1 t2 tname = 
{ (name="gtout") is ot from tl to t2 } 0 
{ (name="gout") is of from tl to t2 } 0 

/* mux 
let set_mux of ot a b c d lastc lastasc t1 t2 sname = 
{ (name="gof") is ot from tl to t2 } 0 
{ (name="goct") is ot from tl to t2 } 0 
{ (name="gact") is acki from tl to t2 } 0 
{ (name="gact") is acki from tl to t2 } 0 
{ (name="gact") is acki from tl to t2 } 0 

/* list versions of above 
let set_alatch tr fr tl t2 l1 = 
let setl l = set_alatch tr fr tl t2 l 
in 
flat ( map setl l1 ) 
let set1l l = set_alatch at af bf lastc lastasc t1 t2 s1 = 
let set1l l = set_switch at af at bf bf lastc lastasc t1 t2 l1 
in 
flat ( map setl f1 ) 
let set_l1l = set_adder at af cf cf tl t2 f1 = 
let set1l l = set_adder at af cf cf tl t2 l1 
in 
flat ( map setl f1 ) 
let set1l f1 = set1l f1 
let set1l l = set_people at af cf cf tl t2 l1 
in 
flat ( map setl f1 ) 
let set1l l = set_people at af cf cf tl t2 f1 = 
let set1l l = set_people at af cf cf tl t2 l1 
in 
flat ( map setl f1 ) 
let set1l l = set_people at af cf cf tl t2 f1 = 
let set1l l = set_people at af cf cf tl t2 l1 
in 
flat ( map setl f1 ) 
let set1l l = set_people at af cf cf tl t2 f1 = 
let set1l l = set_people at af cf cf tl t2 l1 
in 
flat ( map setl f1 ) 
let set1l l = set_people at af cf cf tl t2 f1 = 
let set1l l = set_people at af cf cf tl t2 l1 
in 
flat ( map setl f1 )
let set1_fil ot of t1 t2 f1 =
let set1 = set_fil ot of t1 t2 |
flat ( map set1 f1 );
let set1_mux ot of acks ackb lastc setlastc t1 t2 f1 =
let set1 = set_mux ot of acks ackb lastc setlastc t1 t2 |
flat ( map set1 f1 );

// flatten

let rec enumerate name n e = (n => []) |
  (name|int2str (n-1)):(enumerate name (n-1) => []);

// The only place in the masbit unit which will hold state when the
// environment is only providing B's is L11 coming into sw1, so this
// is the only thing we need to set separately
// However, the P1 and Cyl will hold state in the masblock, so I'll
// include the ability to set these
// Unfortunately, the masbits in upper half of the masblock have
// different steady states from the masbits in the lower half, because
// the ones in the upper half are provided a constant supply of F's while
// the ones in the lower half have to wait for the next parallel operand.

let steady_full_masbit [Pt, Pf, CyT, CyF, Sun] hi mid top t1 t2 mname =
let sinks = [ "src0" ] in
let ends = [ "andn" ] in
let addrs = [ "addn" ] in
let forks = [ "clk", "clkh", "ack", "ackb", "pof", "pofb" ] in
let latches = [ "l12", "l13", "l14", "l16", "l10", "l11" ] in
let addn = map (\n  (n => "mname" ++ "n")
) in

(set_switch F F F F F t1 t2 (mname ++ "sw2" )) 0
(set_latch F F F t1 t2 (addn latches)) 0
(set_fadder F F F t1 t2 (addn addrs)) 0
(set_fadder F F F t1 t2 (addn addrs)) 0
(set_latch CyT CyF t1 t2 (mname ++ "l8" )) 0
(set_latch Sun (NOT Sun) t1 t2 (mname ++ "l7" )) 0
(mname ++ "clat") = t from t1 to t2 0
(mname ++ "clat") = F from t1 to t2 0
(set Switch F hi F hi (NOT hi) t1 t2 (mname ++ "sw1" )) 0
(set_fork F t1 t2 (addn forks)) 0
(mname ++ "srclosel") = hi from t1 to t2 0
(set_src F F F t1 t2 (mname ++ "src0" )) 0

(hl) =>

// State for the upper half
(set_latch F t1 t2 (mname ++ "l11" )) 0
(set_latch F t1 t2 (mname ++ "l15" )) 0
(set_latch F top t1 t2 (mname ++ "l19" ))

// State for the lower half
(set_latch F t1 t2 (mname ++ "l11" )) 0
(set_latch F t1 t2 (mname ++ "l15" )) 0
(set_latch F (NOT mid) t1 t2 (mname ++ "l19" ))

)

let steady_masbit Sun hi mid top sunt t1 t2 mname =
steady_full_masbit [Pt, Pf, F, F, Sun] hi mid top t1 t2 mname;

// Because of the effect of chaining the masbits together, the masblock
This page appears to contain a listing of source code or assembly code, possibly in assembly language, comparing to C or another programming language. The content is divided into sections, each detailing different parts of the code.

The code includes various functions and variables, such as `do_bit`, `do_byte`, `add`, and `sub`. It also references operations like shifts and comparisons, suggesting it is dealing with low-level computing tasks, possibly related to hardware or parallel processing.

Some sections of code are commented for clarity, such as explanations of the operations being performed or the purpose of certain variables.

The overall structure of the code is methodical, with clear demarcations between different parts of the program, indicating it has been written with an eye towards readability and maintainability.

Due to the nature of the code, it's challenging to convert this into a plain text document without losing the context and meaning of the code. The code seems to be a mix of assembly-like syntax and programming constructs, typical of low-level programming languages like C or汇编。
let arccnt (v1, v2) = 
(list (*'src'→'index'*) (length v1)) lev (replicate 0 (length v1))
from PENULT to FOREVER
in
([list (map arccnt (combine srcdist srcml))] @
 (do_arccnt accu sinks | @
 (steady_at final_state T PENULT FOREVER "sm0"));
let vfy = STE "TESTVM[] ant cons[];
//vfy;

let ax n = excitation TESTVM n;
let fo n = fanout TESTVM n;

(* $id: mc.atl.v 1.5 1996/03/30 20:18:50 with Exp with $ *)
(* These are the ST descriptions of the ring circuit cells *)
IMPLEMENTATION MODULE mc:
("FROM Verification IMPORT Initially;")

TYPE
BVEC(n : INTEGER) = ARRAY (i : INDEX(n)) : BOOLEAN;

STATIC
("Concrete cells ")

alatch : aLatchCell =
BEGIN
< NOT (iNT OR iNF ) AND ackO ->
gout, gfout := FALSE, FALSE >>
gout, gfout := int, inf >>
|< outf, outf, acl := gout, gfout, gfout OR gfout >>
END;

aswitch : aSwitchCell =
BEGIN
< NOT (bT OR bF ) AND NOT (cT OR cF ) AND ackAO AND lastC ->
got, gkof, gchab := FALSE, FALSE, FALSE >>
got, gkof, gchab, gko := FALSE, FALSE, FALSE, FALSE >>
|< (bT OR bF ) AND CT AND NOT ackAO ->
got, gkof, gchab, lastC := BT, BF, TRUE, TRUE >>
|< (at OR af ) AND (BT OR BF ) AND CF AND NOT ackAO AND NOT akbo ->
got, gkof, gchab, gkot := TRUE, TRUE, TRUE, FALSE >>
|< acf, acf, acl := gkof, gchab, bof, ack, acch :=
got, gkof, gchab, gkot, gchab, not OR aof >>
END;

f1 : cFCel1 =
BEGIN
< NOT (inOR inOF OR inIt OR inIF OR inIT OR inIF ) AND akC ->
gcout, gcfout := FALSE, FALSE >>
|< (inOR inOF ) AND (inIt OR inIF ) AND (inIT OR inIF ) AND NOT
ack -> gcout, gcfout := (NOT and inf) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and inf)) OR (NOT and in