Independent sequencers in a DRAM control structure

Electrical computers and digital processing systems: memory – Storage accessing and control – Access timing

Reexamination Certificate

Details

C711S005000

active

06836831

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to Dynamic Random Access Memory (DRAM) controllers. In particular, the present invention discloses an improved memory controller that provides for better memory data bus utilization for a random series of memory accesses.
DESCRIPTION OF RELATED ART
Digital data processing products comprise one or more processors. These processors are electrically coupled to input/output devices such as disk storage, tape storage, keyboards, and displays, for example. The processors are also coupled to a memory. The memory is often configured as a hierarchy to provide a tradeoff between the cost of each level in the hierarchy, the size of each level, the access time to receive data from each level, and the bandwidth available to transfer data to or from each level.
For example, a level-1 cache (L1 cache) is usually placed physically on the same chip as a processor. Typically the processor can access data from L1 cache in one or two processor clock cycles. L1 cache is normally optimized for latency, meaning that the primary design goal is to get data from the L1 cache to the processor as quickly as possible. L1 caches are usually designed in Static Random Access Memory (SRAM) and occupy a relatively large amount of space per bit of memory on the semiconductor chip. As such, the cost per bit is high. L1 caches are typically designed to hold 32,000 bytes (32 KB) to 512 KB of data.
A level-2 cache (L2 cache) is normally designed to hold much more information than an L1 cache. The L2 cache usually contains 512 KB to 16,000,000 bytes (16 MB) of data storage capacity. The L2 cache is typically also implemented with SRAM memory, but in some cases is implemented as DRAM. The L2 cache typically takes several cycles to access.
A level-3 cache (L3 cache) is normally designed to hold much more information than an L2 cache. The L3 cache typically contains from 16 MB to 256 MB, and is commonly implemented with DRAM memory. The L3 cache is frequently on separate semiconductor chips from the processor, with signals coupling the processor with the L3 cache. These signals are routed on modules and printed wiring boards (PWBs).
A main memory is almost always implemented in DRAM memory technology, and is optimized for low cost per bit, as well as size. Today's large computers have main memory storage capacities of many gigabytes.
FIG. 1 shows a high-level block diagram of a computer. The computer comprises one or more processors. Modern computers may have a single processor, two processors, four processors, eight processors, 16 processors, or more. Processors 2A-2N are coupled to a memory 6 by a memory controller 4. Memory 6 can be any level of cache or main memory; in particular, memory 6 is advantageously implemented in DRAM for the present invention. A processor data bus 3 couples processors 2A-2N to memory controller 4. A memory data bus 5 couples memory controller 4 to memory 6. Optimizing the use of the bandwidth available on memory data bus 5 is important to maximize the throughput of the computer system. Memory data bus 5 should not be idle when there are outstanding requests for data from processors 2A-2N.
A conventional memory controller comprises a number of command sequencers 8. Each command sequencer 8 manages one request at a time (a load request or a store request), and the command sequencer 8, when in control of memory data bus 5, is responsible for driving the Row Address Strobe (RAS), the Column Address Strobe (CAS), and any other associated control signals to memory 6 over memory data bus 5. Control typically passes from one command sequencer 8 to another command sequencer 8 in a round robin fashion. Memory controller 4 strives to make sure that each command sequencer 8 has a request to handle, to the degree possible in the current workload.
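The round robin hand-off among command sequencers described above can be sketched in a few lines of Python. This is only an illustrative model, not the controller's actual logic: the class `CommandSequencer`, the function `run_round_robin`, and the request strings are hypothetical names, and the model ignores all DRAM timing.

```python
from collections import deque


class CommandSequencer:
    """Hypothetical model: holds at most one request (load or store)."""

    def __init__(self, ident):
        self.ident = ident
        self.request = None


def run_round_robin(sequencers, pending):
    """Return the order in which (sequencer id, request) pairs drive the bus."""
    pending = deque(pending)
    served = []
    while pending or any(s.request is not None for s in sequencers):
        # The controller tries to give every idle sequencer a request.
        for s in sequencers:
            if s.request is None and pending:
                s.request = pending.popleft()
        # Control of the memory data bus passes from one sequencer to the next.
        for s in sequencers:
            if s.request is not None:
                served.append((s.ident, s.request))
                s.request = None
    return served


order = run_round_robin(
    [CommandSequencer(0), CommandSequencer(1)],
    ["load A", "store B", "load C"],
)
print(order)  # → [(0, 'load A'), (1, 'store B'), (0, 'load C')]
```

Because control moves in a fixed order regardless of which bank each request targets, a scheme like this cannot by itself avoid back-to-back accesses to the same bank.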
FIG. 2 is a more detailed view of memory 6, showing that memory 6 comprises bank 0, bank 1, bank 2, and bank 3. Four banks are shown for exemplary purposes, but more or fewer banks could be implemented in a particular design. Each bank has timing requirements that must be complied with. In some applications, e.g., numerically intensive applications, a particular type of DRAM, the Synchronous DRAM (SDRAM), can be operated in page mode, with many accesses to the same page, where a page is the same as a bank. Commercial workloads have a high percentage of random accesses, so page mode does not provide any performance benefit. In non-page mode, SDRAMs are designed for peak performance when consecutive accesses are performed to different banks. A read is performed by opening a bank with a RAS (Row Address Strobe), waiting the requisite number of cycles, applying a CAS (Column Address Strobe), and waiting the requisite number of cycles, after which the data is transmitted from the bank into the memory controller 4. Memory controller 4 must wait several cycles for the row in the bank to precharge (tRP) before reactivating that bank. A write is performed by opening a bank (RAS), issuing a write command along with a CAS, and transmitting data from memory controller 4 to the SDRAMs in the opened bank. That bank cannot be re-accessed until a write recovery (tWR) has elapsed, as well as the row precharge time (tRP).
Switching the SDRAM data bus from performing a read to a write is expensive in terms of time, requiring the amount of time to clear the data bus of the read data from the last read command. When switching from writes to reads, the write data must be sent to the SDRAMs and the write recovery time must complete before a read command can be sent. The penalty incurred when switching from reads to writes, or writes to reads, is called the bus turnaround penalty.
FIGS. 3A-3E provide an example, using reads, showing how bandwidth on memory data bus 5 can be wasted if data from a particular bank is repeatedly accessed.
FIG. 3A lists the timing rules in the example. RAS-CAS delay is 3 cycles. RAS-RAS delay, when the same bank is being addressed, is 11 cycles. CAS-RAS delay, when addressing a different bank, is one cycle. CAS-data delay is 3 cycles. A data transmittal, seen in FIGS. 3B-3E, requires four bus cycles.
FIG. 3B shows the sequential use of a single bank. Data A and data B are presumed to be in the same bank. That bank is opened with a RAS at cycle 1. The CAS is on cycle 4. Data is transmitted from that bank over memory data bus 5 to memory controller 4 during cycles 7, 8, 9, and 10. Because of the RAS-RAS 11-cycle requirement when the same bank is addressed, the bank cannot be opened again to read data B until cycle 12. The CAS for reading data B is sent on cycle 15, and data B is transmitted from that bank over memory data bus 5 to memory controller 4 on cycles 18, 19, 20, and 21. Note that, in this example, memory data bus 5 is not utilized on cycles 11, 12, 13, 14, 15, 16, and 17. As stated above, memory data bus 5 is used far more efficiently when consecutive accesses are to different banks.
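The cycle numbers in the FIG. 3B example follow directly from the FIG. 3A rules, and a short Python sketch can reproduce them. The constant and function names below (`read_timeline`, `same_bank_reads`, `idle_cycles`) are illustrative assumptions; only the cycle counts come from the text.

```python
# Timing rules from FIG. 3A (cycle counts as given in the text).
RAS_TO_CAS = 3        # RAS-CAS delay
RAS_TO_RAS_SAME = 11  # RAS-RAS delay when the same bank is addressed
CAS_TO_DATA = 3       # CAS-data delay
BURST = 4             # bus cycles per data transmittal


def read_timeline(ras_cycle):
    """Return (cas_cycle, data_cycles) for a read whose RAS is at ras_cycle."""
    cas_cycle = ras_cycle + RAS_TO_CAS
    first_data = cas_cycle + CAS_TO_DATA
    return cas_cycle, list(range(first_data, first_data + BURST))


def same_bank_reads(count, first_ras=1):
    """Schedule `count` back-to-back reads that all target one bank."""
    reads = []
    ras = first_ras
    for _ in range(count):
        cas, data = read_timeline(ras)
        reads.append({"ras": ras, "cas": cas, "data": data})
        ras += RAS_TO_RAS_SAME  # the bank cannot be reopened any earlier
    return reads


def idle_cycles(reads):
    """Bus cycles between the first and last data beat that carry no data."""
    busy = {c for r in reads for c in r["data"]}
    return [c for c in range(min(busy), max(busy) + 1) if c not in busy]


reads = same_bank_reads(2)
print(reads[1])            # → {'ras': 12, 'cas': 15, 'data': [18, 19, 20, 21]}
print(idle_cycles(reads))  # → [11, 12, 13, 14, 15, 16, 17]
```

Under these rules, each additional read to the same bank adds another seven idle cycles on the bus, since a four-cycle transmittal occupies only part of every 11-cycle RAS-RAS window.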
FIG. 3C shows optimal memory data bus 5 usage when consecutive reads are to different banks. Requests A, B, C, and D are for data in separate banks. The RAS for data A is sent at cycle 1; the CAS for data A is sent at cycle 4. The RAS for data B can be sent at cycle 5, per the rules given in FIG. 3A. The CAS for data B is sent at cycle 8. Similarly, the RAS and CAS for data C are sent on cycles 9 and 12. The RAS and CAS for data D are sent on cycles 13 and 16. Memory data bus 5 is kept 100% busy once data transmittal has started.
FIG. 3D shows a case where requests for A, B, C, and D are consecutive requests from processors 2A-2N, but where data A and data C are in the same bank. Using the timing requirements of FIG. 3A, the bank containing data C cannot be reopened until the 12th cycle. This causes a 3-cycle gap in memory data bus 5 utilization, as shown in FIG. 3D.
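The contrast between the all-different-banks case and the case where A and C share a bank can be checked with a small in-order scheduler. This is a sketch under stated assumptions: requests are issued strictly in arrival order, the bank numbers are arbitrary labels, and the names `schedule_reads` and `bus_gaps` are hypothetical. The timing constants are the FIG. 3A values.

```python
RAS_TO_CAS = 3        # RAS-CAS delay
RAS_TO_RAS_SAME = 11  # RAS-RAS delay, same bank
CAS_TO_RAS_OTHER = 1  # CAS-RAS delay, different bank
CAS_TO_DATA = 3       # CAS-data delay
BURST = 4             # bus cycles per data transmittal


def schedule_reads(banks, first_ras=1):
    """Issue reads strictly in order; banks[i] is the bank of request i.

    Returns a list of (bank, ras_cycle, cas_cycle, data_cycles) tuples.
    """
    last_ras = {}    # bank -> cycle of its most recent RAS
    prev_cas = None
    timeline = []
    for bank in banks:
        ras = first_ras if prev_cas is None else prev_cas + CAS_TO_RAS_OTHER
        if bank in last_ras:
            # Same bank as an earlier request: honor the RAS-RAS delay.
            ras = max(ras, last_ras[bank] + RAS_TO_RAS_SAME)
        cas = ras + RAS_TO_CAS
        start = cas + CAS_TO_DATA
        timeline.append((bank, ras, cas, list(range(start, start + BURST))))
        last_ras[bank] = ras
        prev_cas = cas
    return timeline


def bus_gaps(timeline):
    """Idle bus cycles between the first and last data beat."""
    busy = {c for _, _, _, data in timeline for c in data}
    return [c for c in range(min(busy), max(busy) + 1) if c not in busy]


all_different = schedule_reads([0, 1, 2, 3])  # A, B, C, D in separate banks
a_c_shared = schedule_reads([0, 1, 0, 2])     # A and C in the same bank
print(bus_gaps(all_different))  # → []
print(bus_gaps(a_c_shared))     # → [15, 16, 17]
```

In the shared-bank run, the RAS for C slips from cycle 9 to cycle 12, which is exactly the 3-cycle gap the text describes. The timing computed for D in that run follows from the in-order assumption of this sketch and is not stated in the text.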
FIG. 3E shows how memory acce
