High-throughput asynchronous dynamic pipelines

Electronic digital logic circuitry – Clocking or synchronizing of logic stages or gates

Reexamination Certificate

: 2001-07-12
: 2003-07-08
: Chang, Daniel (Department: 2819)
: Electronic digital logic circuitry
: Clocking or synchronizing of logic stages or gates

: C326S021000, C326S082000
: Reexamination Certificate
: active
: 06590424
: ABSTRACT:

CROSS-REFERENCE TO RELATED APPLICATION
BACKGROUND
1. Field of the Invention
This invention relates to asynchronous pipelines, and more particularly to latchless dynamic asynchronous digital pipelines providing high buffering and high throughput.
2. Background of the Related Art
There has been increasing demand for pipeline designs capable of multi-gigahertz throughputs. Several novel synchronous pipelines have been developed for these high-speed applications. For example, in wave pipelining, multiple waves of data are propagated between two latches. However, this approach requires significant design effort, from the architectural level down to the layout level, for accurate balancing of path delays (including data-dependent delays), yet such systems remain highly vulnerable to process, temperature and voltage variations. Other aggressive synchronous approaches include clock-delayed domino, skew-tolerant domino, and self-resetting circuits. These approaches require complex timing constraints and lack elasticity. Moreover, high-speed global clock distribution for these circuits remains a major challenge. (See, e.g., “Motorola and Theseus Logic to jointly develop clockless ICs,” http://motorola.com/SPS/MCORE/press_19oct99.html, October 1999, which is incorporated by reference in its entirety herein.)
Asynchronous design, which replaces global clocking with local handshaking, has the potential to make high-speed design more feasible. (See C. H. van Berkel et al., “Scanning the Technology: Applications of Asynchronous Circuits,” Proceedings of the IEEE, 87(2):223-233, February 1999, which is incorporated by reference in its entirety herein.) Asynchronous pipelines avoid the issues related to the distribution of a high-speed clock, e.g., wasteful clock power and management of clock skew. Moreover, the absence of a global clock imparts a natural elasticity to the pipeline since the number of data items in the pipeline is allowed to vary over time. Finally, the inherent flexibility of asynchronous components allows the pipeline to interface with varied environments operating at different rates; thus, asynchronous pipeline styles are useful for the design of system-on-a-chip.
Asynchronous design has also demonstrated a potential for lower power consumption and lower electromagnetic noise emission. Recent successes include a fully asynchronous 80C51 microcontroller developed by Philips for use in its commercial pagers and cell phones (as described in Hans van Gageldonk et al., “An Asynchronous Low-Power 80C51 Microcontroller,” Proc. Intl. Symp. Adv. Res. Async. Circ. Syst. (ASYNC), pp. 96-107, 1998, which is incorporated by reference in its entirety herein), and the AMULET3 asynchronous microprocessor developed at the University of Manchester for use in a commercial telecom product (as described in J. D. Garside et al., “AMULET3i—An Asynchronous System-On-Chip,” Proc. Intl. Symp. Adv. Res. Async. Circ. Syst. (ASYNC), pp. 162-175, April 2000, which is incorporated by reference in its entirety herein).
One prior art pipeline is Williams' PS0 dual-rail asynchronous pipeline (as described in T. Williams, Self-Timed Rings and Their Application to Division, Ph.D. Thesis, Stanford University, June 1991; T. Williams et al., “A Zero-Overhead Self Timed 160ns 54b CMOS Divider,” IEEE JSSC, 26(11):1651-1661, Nov. 1991; T. Williams, “Analyzing and Improving the Latency and Throughput Performance of Self-timed Pipelines and Rings,” Proc. International Symposium on Circuits and Systems, May 1992; and T. Williams, “Performance of Iterative Computation in Self-Timed Rings,” Journal of VLSI Signal Processing, 7(1/2):17-31, February 1994, each of which is incorporated by reference in its entirety herein).
FIG. 1 illustrates Williams' PS0 pipeline 10. Each pipeline stage 12a, 12b, 12c comprises a dual-rail function block 14a, 14b, 14c and a completion detector 16a, 16b, 16c. The completion detectors 16a, 16b, 16c indicate validity or absence of data at the outputs of the associated function block 14a, 14b, 14c, respectively.
“Dual-rail” is a commonly used scheme to implement an asynchronous datapath. (See, e.g., M. Josephs et al., “Modeling and Design of Asynchronous Circuits,” Proceedings of the IEEE, 87(2):234-242, February 1999; and C. Seitz, “System timing,” in Introduction to VLSI Systems, Chapter 7, (Carver A. Mead et al., eds., 1980), which are incorporated by reference in their entirety herein.) In dual-rail design, two wires (or rails) are used to implement each bit. The wires indicate both the value of the bit and its validity. The encodings 01 and 10 correspond to valid data values 0 and 1, respectively. The encoding 00 indicates the reset or spacer state with no valid data, and 11 is an unused (illegal) encoding. Encodings on the datapath typically alternate between valid values and the reset state. Since the datapath itself indicates the validity of each bit, dual-rail is effective in designing asynchronous datapaths which are highly robust in the presence of arbitrary delays. In the exemplary embodiment, stage 12a, 12b, 12c receives dual-rail input 13a, 13b, 13c and provides dual-rail output 15a, 15b, 15c, respectively. Dual-rail output 15a of stage 12a passes data to dual-rail input 13b of stage 12b.
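The four dual-rail codes described above can be captured in a few lines. The following is only an illustrative sketch: the function names are hypothetical, and it assumes that the first digit of each two-digit code is the rail that signals a 1 and the second digit is the rail that signals a 0, which the text does not state explicitly.

```python
# Illustrative sketch of the dual-rail code; names are hypothetical.
# Assumption: a pair is written (rail1, rail0), so "10" means the 1-rail is high.
SPACER = (0, 0)  # reset state: no valid data on this bit


def encode(bit):
    """Return the dual-rail pair for bit 0 or 1, or the spacer for None."""
    if bit is None:
        return SPACER
    if bit not in (0, 1):
        raise ValueError("bit must be 0, 1, or None")
    return (1, 0) if bit == 1 else (0, 1)


def is_valid(pair):
    """True only for the two codes that carry a data value."""
    return pair in ((0, 1), (1, 0))


def decode(pair):
    """Return 0 or 1 for valid data, None for the spacer; reject 11."""
    if pair == SPACER:
        return None
    if pair == (0, 1):
        return 0
    if pair == (1, 0):
        return 1
    raise ValueError("11 is an unused (illegal) dual-rail encoding")
```

Because validity is carried by the rails themselves, a receiver needs no separate timing reference to tell a spacer from data; it only has to check `is_valid` on each bit.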
Each function block 14a, 14b, 14c is implemented using dynamic logic. A precharge/evaluate control input (PC) of each stage is tied to the output of the next stage's completion detector. For example, the precharge/evaluate control input (PC) of stage 12a is tied to the completion detector 16b of stage 12b and is passed to function block 14a on line 18a. (Similarly, the precharge/evaluate control input (PC) of stage 12b is tied to the completion detector 16c of stage 12c and is passed to function block 14b on line 18b.) Because a precharge logic block can hold its data outputs even when its inputs are reset, it also provides the functionality of an implicit latch. Therefore, a stage 12a, 12b, 12c has no explicit latch.
FIG. 2 illustrates function block 14b. Although function blocks 14a and 14c are not illustrated, they are substantially identical to function block 14b, as is known in the art. FIG. 2 illustrates how a dual-rail AND gate, for example, would be implemented in dynamic logic; the dual-rail output 15b (f1 and f0) implements the AND of the dual-rail inputs 13b (a1, a0 and b1, b0).
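A behavioral sketch may help show how a dynamic dual-rail AND gate both computes and holds its result. It is not the transistor network of FIG. 2. It assumes the usual rail functions f1 = a1 AND b1 and f0 = a0 OR b0, assumes that an asserted `precharge` argument resets both output rails to the spacer, and uses a hypothetical class name.

```python
class DynamicDualRailAnd:
    """Behavioral model of a dual-rail AND gate in dynamic logic (hypothetical)."""

    def __init__(self):
        # Output rails start in the precharged (spacer) state.
        self.f1 = 0
        self.f0 = 0

    def step(self, precharge, a, b):
        """Apply one control/input combination and return (f1, f0).

        a and b are dual-rail pairs written (rail1, rail0).
        """
        a1, a0 = a
        b1, b0 = b
        if precharge:
            # Precharge phase: both output rails return to the spacer.
            self.f1, self.f0 = 0, 0
        else:
            # Evaluate phase: a rail can only fire (0 -> 1); it cannot
            # un-fire, so the result is held after the inputs reset.
            self.f1 |= a1 & b1  # assumed: output is 1 when both inputs are 1
            self.f0 |= a0 | b0  # assumed: output is 0 when either input is 0
        return (self.f1, self.f0)
```

For example, evaluating with both inputs at 10 yields 10, and the output stays at 10 when the inputs return to 00; only a precharge clears it. That hold behavior is what lets the stage act as its own latch.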
The completion detector 16a, 16b, 16c at each stage 12a, 12b, 12c, respectively, signals the completion of every computation and precharge. An exemplary completion detector 16b is illustrated in FIGS. 3(a)-3(b). As illustrated in FIG. 3(a), the completion detector uses a C-element 19b to combine all the results (see FIG. 3). (Further details of the C-element are described in I. E. Sutherland, “Micropipelines,” Communications of the ACM, 32(6):720-738, June 1989, which is incorporated by reference in its entirety herein.) A C-element is a basic asynchronous state-holding element. More particularly, the output of an n-input C-element is high when all inputs are high, and is low when all inputs are low. If the inputs are not all high or all low, the C-element holds its previous value. It is typically implemented by a CMOS gate with an n-input series stack in both pull-up and pull-down, and an inverter on the output (with a weak feedback inverter attached to maintain state). As illustrated in FIG. 3(b), the validity, or non-validity, of the data outputs 15b is checked by OR'ing the two rails for each individual bit using OR elements 17b, and then using the C-element 19b to combine all the results to create the done signal 18a.
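The C-element rule and the OR-then-combine structure of the completion detector can be sketched behaviorally as follows. The class and method names are hypothetical, and the model ignores gate delays; it is meant only to illustrate the logic described above.

```python
class CElement:
    """n-input C-element: follows the inputs when they agree, else holds."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, inputs):
        inputs = list(inputs)
        if inputs and all(inputs):
            self.out = 1  # all inputs high -> output high
        elif inputs and not any(inputs):
            self.out = 0  # all inputs low -> output low
        # Mixed inputs: keep the previous output (state-holding behavior).
        return self.out


class CompletionDetector:
    """OR the two rails of each bit, then merge the results in a C-element."""

    def __init__(self):
        self.c_element = CElement()

    def update(self, rails):
        """rails is a list of (rail1, rail0) pairs, one per output bit."""
        per_bit_valid = [r1 | r0 for (r1, r0) in rails]
        return self.c_element.update(per_bit_valid)
```

With a two-bit output, `done` rises only once both bits carry valid data and falls only once both bits have returned to the spacer. While the bits disagree, the detector keeps its previous value, which is how it signals completion of both evaluation and precharge.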
The sequencing of pipeline control for Williams' PS0 dual-rail pipeline is as follows: Stage N is precharged when stage N+1 finishes evaluation. Stage N evaluates when stage N+1 finishes precharge. Actual evaluation will commence only after valid data in
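The two sequencing rules stated above (stage N precharges when stage N+1 has finished evaluating, and may evaluate again once stage N+1 has finished precharging) can be exercised in a small simulation. This is a sketch under several assumptions that are not in the text: every stage is a simple buffer, all delays are one time step, `None` stands for the spacer, and the source and sink environments follow the same handshake. The function name is hypothetical.

```python
def simulate_ps0(items, n_stages=3, max_steps=1000):
    """Push items (none of which may be None) through a PS0-style pipeline."""
    pending = list(items)
    total = len(pending)
    src = None                # output of the (assumed) source environment
    out = [None] * n_stages   # stage outputs; None models the spacer
    ack = False               # acknowledge from the (assumed) sink
    received = []
    for _ in range(max_steps):
        if len(received) == total:
            break
        # Completion detectors: a stage is "done" when its output is valid.
        done = [value is not None for value in out]
        new_out = []
        for i in range(n_stages):
            # PC of stage i comes from stage i+1 (or from the sink).
            pc = done[i + 1] if i + 1 < n_stages else ack
            data_in = out[i - 1] if i > 0 else src
            if pc:
                new_out.append(None)      # next stage evaluated: precharge
            elif out[i] is None:
                new_out.append(data_in)   # enabled: evaluate once data arrives
            else:
                new_out.append(out[i])    # hold result (implicit latch)
        # Source: reset once stage 0 has captured the data, then send the next.
        if done[0]:
            new_src = None
        elif src is None and pending:
            new_src = pending.pop(0)
        else:
            new_src = src
        # Sink: record each item once, on the rising edge of validity.
        if done[-1] and not ack:
            received.append(out[-1])
        src, out, ack = new_src, new_out, done[-1]
    return received
```

Calling `simulate_ps0(["A", "B", "C"])` delivers the items to the sink in order, with a spacer separating consecutive items in every stage. The unit-delay model hides the real timing assumptions of the circuit, so it illustrates the ordering of events only, not cycle time.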

Affiliated with

Nowick Steven M.

Inventor

Singh Montek

Inventor

Also associated with

Baker & Botts LLP

Law Firm

Chang Daniel

Examiner

The Trustees of Columbia University in the City of New York

Corporate Assignee
