Electrical computers and digital processing systems: memory – Storage accessing and control – Access timing
Reexamination Certificate
1998-10-30
2003-01-21
Nguyen, Than (Department: 2187)
C711S168000, C711S169000, C713S400000, C713S401000, C713S501000, C713S503000
active
06510503
ABSTRACT:
The present invention relates to computer memory interfaces and more specifically to chip-to-chip interfaces for dynamic random access type memories capable of operating at high speed. This application incorporates herein by reference Canadian Patent Application Number 2,243,892 filed on Jul. 27, 1998.
BACKGROUND OF THE INVENTION
The evolution of the dynamic random access memories used in computer systems has been driven by ever-increasing speed requirements mainly dictated by the microprocessor industry. Dynamic random access memories (DRAMs) have generally been the predominant memories used for computers due to their optimized storage capabilities. This large storage capability comes at the price of slower access time and the requirement for more complicated interaction between memories and microprocessors/microcontrollers than in the case of, say, static random access memories (SRAMs) or non-volatile memories.
In an attempt to address this speed deficiency, various major improvements have been implemented in DRAM design, all of which are well documented. DRAM designs evolved from Fast Page Mode (FPM) DRAMs to Extended Data Out (EDO) DRAMs to synchronous DRAMs (SDRAMs). Further speed increases have been achieved with Double Data Rate (DDR) SDRAM, which synchronizes data transfers on both clock edges. However, as the speed requirements from the microprocessor industry continue to move ahead, new types of memory interfaces have had to be contemplated to address the still existing vast discrepancy in speed between the DRAMs and microprocessors.
Recently, a number of novel memory interface solutions aimed at addressing the speed discrepancy between memory and microprocessors have been presented.
Several generations of high bandwidth DRAM-type memory devices have been introduced. Of note is Rambus Inc., which first introduced a memory subsystem in which data and command/control information is multiplexed on a single bus, as described in U.S. Pat. No. 5,319,755, which issued Jun. 7, 1994. Subsequently, Concurrent Rambus™ was introduced, which altered the command/data timing but retained the same basic bus topology. Finally, Direct Rambus™, described in R. Crisp, “Direct Rambus Technology: The New Main Memory Standard”, IEEE Micro, November/December 1997, p. 18-28, was introduced, in which command and address information is separated from data information to improve bus utilization. Separate row and column command fields are provided to allow independent control of memory bank activation, deactivation, refresh, data read and data write (column) commands. All three Rambus variations, however, share the same bus topology as illustrated in FIG. 1(a).
In this topology a controller 10 is located at one end of a shared bus 12, while a clock driver circuit 14 and bus terminations 16 are located at an opposite end. The shared bus includes data and address/control busses, which run from the controller at one end to the various memory devices MEMORY 1 . . . MEMORY N and the terminations at the far end. The clock signal generated by the clock driver 14 begins at the far end and travels towards the controller 10 and then loops back to the termination at the far end. The clock bus is twice as long as the data and address/control busses. Each memory device has two clock inputs: ClkToController, for the clock traveling towards the controller (cTc), and ClkFromController, for the clock traveling away from the controller towards the termination (cFc). When the controller 10 reads from a memory device, the memory device synchronizes the data it drives onto the bus with the clock traveling towards the controller. When the controller is writing to a memory device, the memory device uses the clock traveling away from the controller to latch in data. In this way the data travels in the same direction as the clock, and clock-to-data skew is reduced. The memory devices employ on-chip phase locked loops (PLL) or delay locked loops (DLL) to generate the correct clock phases to drive data output buffers and to sample the data and command/address input buffers.
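The loop-back clock arrangement described above can be illustrated with a small timing model. The following Python sketch is not part of the patent; the function names, the choice of time origin, and the assumption of zero delay in the loop-back connection at the controller are all hypothetical simplifications.

```python
def clock_arrival_times(device_delay_ns, bus_delay_ns):
    """Arrival times (ns) of one clock edge at a memory device.

    device_delay_ns: one-way propagation delay between controller and device
                     (hypothetical parameter).
    bus_delay_ns:    one-way propagation delay over the full bus length
                     (hypothetical parameter).
    Time zero is when the edge leaves the clock driver at the far end.
    The loop-back at the controller is assumed to add no delay.
    """
    if not 0 <= device_delay_ns <= bus_delay_ns:
        raise ValueError("device must sit between the controller and the far end")
    # The clock traveling towards the controller (cTc) passes the device first.
    ctc = bus_delay_ns - device_delay_ns
    # The edge reaches the controller at bus_delay_ns, loops back, and passes
    # the device again as the clock traveling from the controller (cFc).
    cfc = bus_delay_ns + device_delay_ns
    return ctc, cfc


def ctc_to_cfc_skew(device_delay_ns, bus_delay_ns):
    """Time by which cFc lags cTc at the device (twice the one-way delay)."""
    ctc, cfc = clock_arrival_times(device_delay_ns, bus_delay_ns)
    return cfc - ctc
```

Under these assumptions the two clock phases coincide at the controller and drift apart by twice the one-way delay for devices further along the bus, which is why each device needs separate clock inputs for transmit and receive timing.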
There are a number of shortcomings with this topology, as will be described below.
For the bus topology of FIG. 1(a) the clock frequency is 400 MHz. FIG. 1(b) shows the timing of control and data bursts on the bus 12. Since data is transmitted or received on both edges of the clock, the effective data rate is 800 Mb/s. A row command ROW burst consists of eight (8) consecutive words, beginning on a falling edge of the clock from the controller cFc and applied on the three (3) bit row bus. A column command COL consists of eight (8) consecutive words transmitted on the five (5) bit column bus. Independent row and column commands can be issued to the same or different memory devices by specifying appropriate device identifiers within the respective commands. At the controller 10 the phases of the two clock inputs, cFc and cTc, are close together. There is a delay to the memory chip receiving the commands due to finite bus propagation time, shown in FIG. 1 as approximately 1.5 bit intervals or 1.875 ns. The clock signal cFc propagates with the ROW and COL commands to maintain phase at the memory inputs. Read data resulting from a previous COL command is output as a burst of eight (8) consecutive 16 or 18 bit words on the data bus, starting on a falling edge of cTc. The data packet takes roughly the same amount of time to propagate back to the controller, about 1.5 bit intervals. The controller spaces COL command packets to avoid collisions on the data bus. Memory devices are programmed to respond to commands with fixed latency. A WRITE burst is driven to the data bus two bit intervals after the end of the READ burst. Because of the finite bus propagation time, the spacing between READ and WRITE bursts is enlarged at the memory inputs. Likewise, the spacing between a WRITE and READ burst would be smaller at the memory device than at the controller.
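The figures quoted above follow from simple arithmetic, which the Python sketch below reproduces. It is an illustration only: the constant and function names are hypothetical, the device is assumed to sit 1.5 bit intervals from the controller as in the text, and the WRITE-to-READ spacing of four bit intervals used in the last line is an assumed value that the text does not state.

```python
CLOCK_HZ = 400e6          # bus clock frequency from the text
EDGES_PER_CYCLE = 2       # data moves on both clock edges
BURST_WORDS = 8           # words per ROW, COL, or data burst
PROP_BIT_INTERVALS = 1.5  # one-way bus propagation delay from the text

bit_rate_bps = CLOCK_HZ * EDGES_PER_CYCLE          # per-pin data rate
bit_interval_ns = 1e9 / bit_rate_bps               # duration of one bit
propagation_ns = PROP_BIT_INTERVALS * bit_interval_ns
burst_ns = BURST_WORDS * bit_interval_ns           # duration of one burst

row_packet_bits = BURST_WORDS * 3    # 3-bit row bus
col_packet_bits = BURST_WORDS * 5    # 5-bit column bus
data_packet_bits = BURST_WORDS * 16  # 16-bit data bus variant


def read_to_write_gap_at_memory(gap_at_controller, prop=PROP_BIT_INTERVALS):
    """Gap (bit intervals) seen at the device between a READ and a WRITE burst.

    The READ burst left the device one propagation delay before the controller
    saw it end, and the WRITE burst arrives one propagation delay after the
    controller sent it, so the gap grows by twice the propagation delay.
    """
    return gap_at_controller + 2 * prop


def write_to_read_gap_at_memory(gap_at_controller, prop=PROP_BIT_INTERVALS):
    """Gap (bit intervals) seen at the device between a WRITE and a READ burst.

    The same reasoning in reverse shrinks the gap by twice the propagation delay.
    """
    return gap_at_controller - 2 * prop


rw_gap = read_to_write_gap_at_memory(2)   # two-interval gap from the text
wr_gap = write_to_read_gap_at_memory(4)   # assumed four-interval gap
```

With the values in the text, the two bit interval gap between READ and WRITE at the controller becomes five bit intervals at a device 1.5 bit intervals away, while a WRITE to READ gap shrinks by three bit intervals.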
For example, there is a summation of clock-to-data timing errors in transferring data from one device to another. FIG. 2(a) is a schematic diagram of the loop-back clock, data lines and clock synchronization circuit configuration. In this configuration, the bus clock driver 14 at one end of the ClockToController line 22 of the clock bus propagates an early bus clock signal in one direction along the bus, for example from the clock 14 to the controller 10. The same clock signal is then passed through the direct connection shown to a second line 24 of the bus and loops back as a late ClockFromController signal along the bus, where it terminates with resistance Rterm. Thus, each memory device 26 receives the two bus clock signals at a different time. The memory device 26 includes a clock and data synchronization circuit for sampling the two bus clocks cFc and cTc and generating its own internal transmit and receive clocks, TX_clk and RX_clk respectively, for clocking transmit and receive data to and from the data bus respectively. The bus clock signals cFc and cTc are fed via respective input receiver comparators 11 and 20 into corresponding PLL/DLL circuits 40 and 50. For the input of data from the controller to a memory device, the role of the on-chip PLL/DLL circuit 40 is to derive from the cFc clock input internal clocks to sample control, address, and data to be written to the memory on (positive 90° and negative 270°) edges of the clock, at the optimum point in the data eye. These internal receive data clocks may also be used to drive the internal DRAM core 32. For the output of data from the memory device 26 to the controller 10, the role of the on-chip PLL/DLL circuit 50 is to derive from the cTc clock input internal transmit data clocks (0° and 180°) to align transmitted data (read data from the memory core) with the edges of the external clock.
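The relationship between the transmit phases (0° and 180°) and the receive sampling phases (90° and 270°) can be checked numerically. The Python sketch below is an idealized illustration, not the patent's circuit: it assumes the 400 MHz clock mentioned earlier, ignores skew and jitter, and uses hypothetical function and constant names.

```python
TX_PHASES_DEG = (0, 180)   # phases at which transmitted data changes
RX_PHASES_DEG = (90, 270)  # phases at which received data is sampled


def phase_to_time_ns(phase_deg, clock_hz):
    """Offset (ns) of a clock phase from the rising edge of the bus clock."""
    period_ns = 1e9 / clock_hz
    return (phase_deg % 360) / 360.0 * period_ns


def eye_centres_ns(clock_hz):
    """Midpoint of each bit launched on the transmit phases (ideal case)."""
    bit_ns = 1e9 / clock_hz / 2  # two bits per clock period
    return [phase_to_time_ns(p, clock_hz) + bit_ns / 2 for p in TX_PHASES_DEG]


def rx_sample_points_ns(clock_hz):
    """Sampling instants produced by the receive phases."""
    return [phase_to_time_ns(p, clock_hz) for p in RX_PHASES_DEG]
```

For a 400 MHz clock the period is 2.5 ns, so each bit lasts 1.25 ns and, in this ideal model, the 90° and 270° sampling instants fall exactly at the centre of each bit, which is the optimum point in the data eye that the text refers to.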
The data I/O pin has an output transistor 27 for driving the data bus. An actual memory device will have 16 or 18 such data pins. The other data pins are not shown in FIG. 2(a) for simplicity.
Gillingham Peter
Millar Bruce
Mosaid Technologies Incorporated
Nguyen Than
Pillay, Kevin Fasken Martineau DuMoulin LLP