Electrical computers and digital data processing systems: input/ – Intrasystem connection – Bus access regulation
Reexamination Certificate
1999-03-05
2001-12-25
Myers, Paul R. (Department: 2181)
C710S060000
active
06334163
ABSTRACT:
TECHNICAL FIELD
The present invention relates in general to data processing systems, and in particular, to the interface between dynamic, or clocked, integrated circuit chips in a data processing system.
BACKGROUND INFORMATION
Modern data processing systems require the transfer of data between dynamic, or clocked, circuits embodied in multiple chips in the system. For example, data may need to be transferred between central processing units (CPUs) in a multi-CPU system, or between a CPU and the memory system which may include a memory controller and off-chip cache. Data transfers are synchronous, and data is expected to be delivered to the circuitry on the chip on a predetermined system cycle. As CPU speeds have increased, the speed of the interface between chips (bus cycle time) has become the limiting constraint as the latency across the interface exceeds the system clock period. In order to maintain system synchronization, the system designer must slow the speed of the bus in order that the cycle on which data arrives be unambiguous.
This may be further understood by referring to FIG. 1A, in which is depicted, in block diagram form, a prior art interface between two integrated circuit chips, chip 102 and chip 104, in a data processing system. Each of chips 102 and 104 receives a reference clock 106 coupled to a phase lock loop, PLL 108. PLL 108 generates a local clock, clock 110 in chip 102 and clock 111 in chip 104, locked to reference clock 106. Reference clock 106 provides a “time zero” reference, and may be asserted for multiple periods of local clocks 110 and 111, depending on the multiplication of PLL 108. The bus clock 113 is derived from reference clock 106 by dividing local clock 110 by a predetermined integer, N, in divider 112. Data to be sent from chip 102 to chip 104 is latched on a predetermined edge of the divided local clock 110 and driven onto data line 116 via driver 118. Data is received at receiver (RX) 120 and captured into destination latch 122 on a predetermined edge of the divided local clock 111 in chip 104. Due to the physical separation of chip 102 and chip 104, the data appears at input 124 of destination latch 122 delayed in time. (The contribution of RX 120 to the latency is typically small relative to the delay due to the data transfer.) The time delay is referred to as the latency, and will be discussed further in conjunction with FIG. 1B.
Similarly, chip 104 sends data to chip 102 via data line 126. Data to be sent from chip 104 is latched in latch 128 on a predetermined edge of the output signal from divider 130 which divides local clock 111 by N. The data is driven onto data line 126 via driver 132 and captured on destination latch 134 via receiver 136. The data input to chip 102 is captured into data latch 134 on a predetermined edge of an output of divider 130 which also divides local clock 110 by N.
In FIG. 1B, there is illustrated an exemplary timing diagram for interface 100 of FIG. 1A, in accordance with the prior art. Data 115 sent from chip 102 to chip 104 is latched, in latch 114, on a rising edge, t_1, of bus clock 113. Bus clock 113 is generated by dividing local clock 110 by N in dividers 112 and 130 in chip 102. Following a delay by the latency, T_1, data 117 appears at an input to destination latch 122, and is latched on rising edge t_2 of bus clock 123. Bus clock 123 is generated by dividing local clock 111 by N in dividers 112 and 130 in chip 104. Thus, in the prior art in accordance with FIG. 1B, data 125 appears in chip 104 one bus cycle following its launch from chip 102. In FIG. 1B, there is zero skew between bus clock 113 and bus clock 123.
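The relationship between latency and capture edge described for FIG. 1B can be sketched in a few lines of Python. This is only an illustrative model, not part of the patent: the function name `capture_cycle`, the picosecond units, and the sample values are assumptions, and the model idealizes the latch (zero skew, zero setup and hold time).

```python
def capture_cycle(latency_ps: int, bus_period_ps: int) -> int:
    """Return the index of the first rising bus-clock edge at or after data arrival.

    Edge 0 is the launch edge (t_1 in FIG. 1B), so a result of 1 corresponds
    to capture on t_2, one bus cycle after launch. Assumes zero skew between
    the two bus clocks and an ideal latch (hypothetical simplification).
    """
    if latency_ps <= 0 or bus_period_ps <= 0:
        raise ValueError("latency and bus period must be positive")
    # Ceiling division on integers avoids floating-point rounding at edges.
    return -(-latency_ps // bus_period_ps)


# Illustrative values only: a 2 ns bus clock and a 1.5 ns latency.
print(capture_cycle(1500, 2000))  # 1 -> captured one bus cycle after launch
```

With a latency shorter than the bus period, the data is captured one bus cycle after launch, which is the situation FIG. 1B depicts.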
If, in interface 100 in FIG. 1A, the bus clock speed is increased, the latency may exceed one bus clock cycle. Then the exemplary timing diagram illustrated in FIG. 1C may result. As before, data 115 has been latched on edge t_1 of bus clock 113. Data 117 appears at input 124 of destination latch 122 after latency time, T_1, which is longer than the period of bus clock 113 and bus clock 123. Data 117 is latched on edge t_3 of bus clock 123 in chip 104 to provide data 125 on chip 104. If interface 100 between chips 102 and 104 represents the interface having the longest latency from among a plurality of interfaces between chip 102 and the plurality of other chips within a data processing system, then the two-cycle latency illustrated in FIG. 1C represents the “target” cycle for the transmission and capture of data between chips, such as chip 102 and chip 104. The target cycle is the predetermined cycle at which data is expected by the chip. Interfaces having a shorter latency may need to be padded, in accordance with the prior art, in order to ensure synchronous operation. The padding ensures that faster paths in interface 100 have latencies greater than one bus clock cycle and less than two bus clock cycles, whereby data synchronization may be maintained.
This may be further understood by referring now to FIG. 1D, illustrating a plurality 101 of chips, chips 102, 103 and 104. Chip 102 and chip 104 are coupled on “slow” path 152 having a long latency, T_S. Chip 103 is coupled to chip 102 via “fast” path 154 having a short latency period, T_F. A “nominal” path coupling plurality 101 of chips 102-105 has latency T_M, such as the latency on path 156 between chip 102 and chip 105.
The timing diagram in FIG. 1E provides further detail. FIG. 1E illustrates a timing diagram similar to that in FIG. 1C in which the target cycle for the capture of data into a receiving chip is two bus cycles. In FIG. 1E, the nominal latency, T_M, is shown to be 1.5 bus cycles, the fast path latency, T_F, is illustrated to be just greater than one bus cycle, and the slow path latency, T_S, is shown to be slightly less than two bus cycles. In this case, each of the plurality of chips 101 in FIG. 1D captures data on the target cycle, two bus cycles after data launch.
If, however, the fast path is shorter, illustrated by fast path latency T′_F, data synchronization is lost. In this case, data arrives at chip 103 prior to transition t_2 of the chip 103 bus clock, as illustrated by the dotted portion of data 117 at chip 103, and is latched into chip 103 after one bus cycle. This is illustrated by the dotted portion of data 125 in chip 103. In order to restore synchronization, the fast path, path 154, between chips 102 and 103 would require padding to increase the fast path latency from T′_F to T_F. Consequently, the timing of such a prior art interface is tuned to a specific operating range and a particular interface length, and is valid only for the technology for which the design was timed and analyzed.
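The loss of synchronization described for FIG. 1E can be checked with a short Python sketch. Only the nominal latency of 1.5 bus cycles is stated in the text; the other numeric latencies below are assumed values chosen to match the qualitative description ("just greater than one", "slightly less than two", and a shorter fast path), and the latch is again idealized.

```python
import math

TARGET_CYCLE = 2  # target cycle of FIG. 1E, in bus cycles after launch

# Latencies in units of bus cycles. Only T_M = 1.5 is given in the text;
# the remaining values are illustrative assumptions.
paths = {
    "T_F": 1.05,   # fast path, just greater than one bus cycle
    "T_M": 1.5,    # nominal path
    "T_S": 1.95,   # slow path, slightly less than two bus cycles
    "T'_F": 0.9,   # shortened fast path
}


def capture_cycle(latency_cycles: float) -> int:
    """First bus-clock edge at or after arrival (ideal latch assumed)."""
    return math.ceil(latency_cycles)


# A path is out of sync when it is not captured on the target cycle.
out_of_sync = [name for name, lat in paths.items()
               if capture_cycle(lat) != TARGET_CYCLE]
print(out_of_sync)  # ["T'_F"]
```

Only the shortened fast path is captured a cycle early, which is why that one path would need padding while the others already meet the target cycle.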
Likewise, increasing the clock speed of the chips in FIG. 1D will result in a loss of synchronization. This may be understood by considering an explicit example. The local clock cycle time is first taken to have a 1 nanosecond (ns) period. The bus clock will have a period that is a fixed multiple, which will be taken to be two, of the local clock. Let the nominal latency of the interface, T_M, be 3 ns with ±1 ns of timing variation, i.e., the best case, or fast path, T_F, is 2 ns and the worst case, or slow path, T_S, is 4 ns. The data will arrive after two ns and before four ns. Hence the interface will operate under all conditions, i.e., data is guaranteed to arrive after the first bus cycle and before the second bus cycle. However, if the speed of the chips is increased to a 0.9 ns cycle time, the bus cycle time is changed to 1.8 ns. In order to ensure enough time for the data to propagate across the interface under worst case conditions, the data must not be captured before 2.5 bus cycles, or 4.5 ns, because two bus cycles is less than the slow path time, T_S, or 4 ns. Then, in order to operate a 1.8 ns bus cycle, the
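The arithmetic of this example (fast path 2 ns, slow path 4 ns, bus period twice the local clock period) can be reproduced with the following Python sketch. The function name `common_target_cycle` and the inclusive treatment of window boundaries are assumptions made to mirror the idealized reasoning in the text; a real design would also budget for setup and hold time.

```python
def common_target_cycle(fast_ps: int, slow_ps: int, bus_period_ps: int):
    """Return the whole bus cycle k on which every path can be captured, or None.

    A path fits cycle k when (k-1)*P <= latency <= k*P. Boundaries are
    treated as inclusive, following the idealized example in the text
    (assumption: no setup or hold margin).
    """
    k = -(-slow_ps // bus_period_ps)        # earliest cycle the slow path allows
    if (k - 1) * bus_period_ps <= fast_ps:  # fast path must not arrive a cycle early
        return k
    return None


FAST_PS, SLOW_PS = 2000, 4000      # T_F = 2 ns, T_S = 4 ns
for local_period_ps in (1000, 900):
    bus_period_ps = 2 * local_period_ps  # bus clock is twice the local period
    print(bus_period_ps, common_target_cycle(FAST_PS, SLOW_PS, bus_period_ps))
# 2000 2
# 1800 None
```

At a 2 ns bus period both paths fit the second bus cycle. At 1.8 ns the slow path needs the third cycle (two cycles is only 3.6 ns) while the fast path would already be captured on the second, so no single whole-cycle target serves both paths.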
Dreps Daniel Mark
Ferraiolo Frank David
Gower Kevin Charles
International Business Machines Corp.
McBurney Mark
Myers Paul R.
Newberger Barry S.
Winstead Sechrest & Minick P.C.
Elastic interface apparatus and method therefor