Reexamination Certificate
1999-05-17
2002-09-24
Kim, Matthew (Department: 2186)
Electrical computers and digital data processing systems: input/
Input/output data processing
Data transfer specifying
C711S217000
Reexamination Certificate
active
06457075
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to computer systems and, more particularly, to memory controllers for computer systems. A major objective of the invention is to enhance overall performance of multi-master computer systems by avoiding some latencies incurred due to differences in optimal burst lengths among masters.
Much of modern progress is associated with advances in computer technology that have provided increasing speed and functionality. These advances have occurred both on the level of individual integrated circuits and on the systems integration level. Integrated circuits have become faster and have accommodated more functions per circuit. Systems have provided for increasing parallelism in the utilization of integrated circuits, as well as more efficient communication among integrated circuits.
A basic computer system includes a data processor for manipulating data in accordance with program instructions; both the data and the instructions can be stored in a memory system. There can be several levels of memory. Main memory is typically some form of random access memory (RAM) residing on a different integrated circuit than the processor. Typically, a computer has one or more bulk storage memories, usually disk-based serial access memories such as floppy disks, hard disks, CD-ROMs, etc. The capacity of the bulk storage devices typically exceeds that of main memory, but the access times are much slower. Thus, when a program is to be executed, the required instructions and data are loaded from bulk storage into main memory for faster execution.
While main memory is much faster than bulk memory, accessing main memory tends to be a bottleneck from the perspective of the processor. A typical read cycle, for example, involves the processor asserting an address, selecting memory or other device associated with that address, reception and decoding of the address by the memory, and, finally, access and transmission of the contents at the addressed location to the processor. Such a read operation can consume several processing cycles.
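As a rough illustration of why such a read can consume several cycles, the following C sketch simply sums per-phase cycle counts; the phase names and the counts themselves are assumptions for illustration, not figures from the patent.

```c
#include <stdio.h>

/* Assumed per-phase costs of one uncached read, in bus cycles;
 * purely illustrative values, not figures from the patent. */
enum { ASSERT_ADDRESS = 1, SELECT_AND_DECODE = 1, CELL_ACCESS = 2, TRANSMIT = 1 };

int main(void) {
    int total = ASSERT_ADDRESS + SELECT_AND_DECODE + CELL_ACCESS + TRANSMIT;
    printf("one main-memory read: ~%d bus cycles\n", total);
    return 0;
}
```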
Write operations, in which the processor writes data to memory, can be faster since the processor can transmit the data at the same time the address is transmitted. Thus, while both read and write operations between a processor and main memory can limit processor throughput, the emphasis herein is on the relatively more time-consuming read operations.
Caches reduce the delays involved in main memory accesses by storing data and instructions likely to be requested by a processor in a relatively small and fast memory. There can be multiple levels of cache, e.g., a smaller, faster, level-one (L1) cache and a larger, slower, level-two (L2) cache. A typical read operation involves transmitting a read request to the L1 cache, the L2 cache, and main memory concurrently. If the requested data is found in the L1 cache, the processor's request is satisfied from the L1 cache and the accesses of the L2 cache and main memory are aborted. If the data is not found in the L1 cache, but is found in the L2 cache, the data is provided by the L2 cache and the access of main memory is aborted. If the requested data is not in either cache, the request is fulfilled by main memory.
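To make the lookup order concrete, here is a minimal C sketch of the L1/L2/main-memory fallback; the stub functions and their behaviors are hypothetical, and the concurrent-request-then-abort behavior described above is modeled as a simple sequential fallback.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stubs standing in for real cache lookups; illustrative
 * assumptions, not structures from the patent. */
static bool l1_lookup(unsigned addr, unsigned *data) { (void)addr; (void)data; return false; }
static bool l2_lookup(unsigned addr, unsigned *data) { (void)addr; (void)data; return false; }
static unsigned main_memory_read(unsigned addr) { return addr * 2u; /* fake contents */ }

/* Fallback order mirrors the text: L1 first, then L2, then main
 * memory; a hit at any level aborts the slower accesses. */
static unsigned read_word(unsigned addr) {
    unsigned data;
    if (l1_lookup(addr, &data)) return data;   /* L1 hit */
    if (l2_lookup(addr, &data)) return data;   /* L2 hit */
    return main_memory_read(addr);             /* both caches miss */
}

int main(void) {
    printf("0x%x\n", read_word(0x100));
    return 0;
}
```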
An L2 cache typically controls requests by a processor targeted for main memory. The L2 cache typically converts a request for data at a single address location to a request for data at a series of, e.g., four, address locations. The cache stores the requested data along with neighboring data on the assumption that the processor is relatively likely to recall previously requested data or to request data stored near previously requested data.
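A short sketch of the address expansion an L2 cache might perform, assuming a four-word line as in the example above; the alignment scheme and function names are assumptions for illustration.

```c
#include <stdio.h>

/* Assumed line length of four words, as in the example above. */
#define LINE_WORDS 4

/* Expand a single requested word address into the aligned series of
 * addresses the L2 cache would fetch to fill the whole line. */
static void line_addresses(unsigned addr, unsigned out[LINE_WORDS]) {
    unsigned base = addr & ~(unsigned)(LINE_WORDS - 1);  /* align down */
    for (int i = 0; i < LINE_WORDS; i++)
        out[i] = base + i;
}

int main(void) {
    unsigned line[LINE_WORDS];
    line_addresses(0x1006, line);
    for (int i = 0; i < LINE_WORDS; i++)
        printf("fetch 0x%x\n", line[i]);   /* 0x1004 .. 0x1007 */
    return 0;
}
```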
While the presence of a cache improves the availability of data to the processor, the longer access times associated with fetching lines that include uncached data limit performance. If a cache controller has to send multiple addresses, e.g., four, for each line to be cached, the four associated access cycles can burden performance. In particular, there can be an access latency associated with each main memory access, so that each line access would involve multiples of such latencies.
Modern “synchronous dynamic random-access memories” (SDRAMs) typically employ two features designed to minimize the compounding of access latencies. The first feature is pipelined processing in which a read request can be received while a previous read request is being processed. With pipelining there is typically a latency of two or more system-bus cycles associated with the first access, but subsequent sequential accesses do not add to that latency beyond a typical baseline of one system-bus cycle per address.
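The latency arithmetic can be made explicit with a small sketch; the two-cycle initial latency and the one-cycle-per-address baseline follow the figures above, while the formulas themselves are a deliberate simplification.

```c
#include <stdio.h>

/* With pipelining, the initial latency is paid once and each further
 * sequential access adds one bus cycle; without it, the full latency
 * recurs on every access. Simplified model, illustrative only. */
static int pipelined_cycles(int n_accesses, int first_latency) {
    return first_latency + n_accesses;          /* latency paid once */
}

static int unpipelined_cycles(int n_accesses, int first_latency) {
    return n_accesses * (first_latency + 1);    /* latency paid per access */
}

int main(void) {
    printf("4 reads, pipelined:   %d cycles\n", pipelined_cycles(4, 2));
    printf("4 reads, unpipelined: %d cycles\n", unpipelined_cycles(4, 2));
    return 0;
}
```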
If the system bus is also pipelined, the master (e.g., the processor/cache system) can send four addresses in quick succession and receive the requested data without an inter-request delay. However, many system buses and many processors are not designed to take full advantage of memory pipelining. When the bus is not pipelined, and often even when it is, a master must wait until one request is fulfilled before issuing the next request.
To take advantage of a pipelined memory despite limitations in the system bus or processor, SDRAMs can provide for multi-address burst modes. In such a mode, an SDRAM provides the contents not only of the requested address but also of succeeding addresses. For example, in a burst-4 mode, an SDRAM provides the data at the requested address and the data at the next three consecutive addresses.
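A burst-4 read can be sketched as one address in, four consecutive words out; the toy memory array and function names below are illustrative assumptions, not anything from the patent.

```c
#include <stdio.h>

#define BURST_LEN 4   /* burst-4 mode; real SDRAMs make this programmable */

/* Toy array standing in for SDRAM contents. */
static unsigned mem[16] = {10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25};

/* One requested address yields BURST_LEN consecutive words. */
static void burst_read(unsigned addr, unsigned out[BURST_LEN]) {
    for (int i = 0; i < BURST_LEN; i++)
        out[i] = mem[addr + i];
}

int main(void) {
    unsigned data[BURST_LEN];
    burst_read(4, data);
    for (int i = 0; i < BURST_LEN; i++)
        printf("mem[%d] = %u\n", 4 + i, data[i]);
    return 0;
}
```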
In principle, by setting the burst length equal to the cache line length, a cache could receive a complete line in response to a single address request. However, many systems provide for exceptional circumstances (e.g., a “non-cacheable read” instruction) in which only one address is to be read. If the system cannot tolerate unrequested data on the system bus, then burst-4 mode is problematic. The burst-1 mode avoids this problem, but introduces multi-cycle latencies in single-cache-line fetches.
U.S. Pat. No. 5,802,597 to Nelsen, “Nelsen” herein, discloses a system that provides for single address accesses while a memory is in burst-4 mode. The memory controller forwards the first address to the memory—which then begins the burst. When the data from the first address is received by the master (the processor/cache combination), the second address can be asserted. If the second address is asserted (confirming the corresponding address as generated in the burst), the burst is allowed to continue. If the second address is not asserted (disconfirming the second address as generated in the burst), the burst is aborted.
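The confirm-or-abort decision can be sketched as follows; everything here (names, state encoding, the example addresses) is an illustrative reading of the Nelsen scheme as summarized above, not code from that patent.

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { BURST_RUNNING, BURST_CONFIRMED, BURST_ABORTED } burst_state;

/* The controller has already started a burst on the first address.
 * When the master's second address arrives (or fails to), decide
 * whether the speculative burst continues or must be aborted. */
static burst_state on_second_address(unsigned burst_next,
                                     bool second_asserted,
                                     unsigned second_addr) {
    if (second_asserted && second_addr == burst_next)
        return BURST_CONFIRMED;   /* burst runs to completion */
    return BURST_ABORTED;         /* tri-state bus, flush memory pipeline */
}

int main(void) {
    /* Master asserts 0x101 after 0x100: matches the burst, so it runs. */
    printf("%d\n", on_second_address(0x101, true, 0x101));
    /* Single-address (non-cacheable) read: no second address, abort. */
    printf("%d\n", on_second_address(0x101, false, 0));
    return 0;
}
```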
To effect such an abort, the connection between the memory and the system bus can be broken and the system bus tri-stated. The memory pipeline can be cleared and the memory outputs can be cleared. This abort procedure can consume a cycle or two. Depending on the situation, this abort delay might or might not affect performance. In the worst case, if a write operation is asserted right after the read operation, the write operation could suffer a latency corresponding to that imposed by the abort. However, this cost can be more than offset where the single-address accesses are infrequent relative to the four-address accesses.
The optimal burst length depends on the master. For example, the optimal burst length can be four for a master with a four-word-wide cache, while the optimal burst length can be eight for a master with an eight-word-wide cache. Systems with multiple masters having different optimal burst lengths can provide for changing the burst mode to match the current master.
Typically, changing the burst mode involves executing a write instruction, e.g., part of a driver program or subroutine, to write a burst value in a burst-mode register of the SDRAM memory. Thus, changing the burst mode can involve calling a subroutine as well as executing the included burst-value write instruction. A burst mode switch can consume several bus cycles. If masters with different optimal burst lengths are changed frequently, the cycles consumed by these mode switches can significantly impair overall system performance.
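To illustrate the trade-off, the following sketch charges an assumed fixed cycle count whenever the bus is granted to a master whose optimal burst length differs from the currently programmed one; the cost figure and the names are assumptions for illustration.

```c
#include <stdio.h>

#define MODE_SWITCH_CYCLES 4   /* assumed cost of a mode-register write */

static int current_burst_len = 4;   /* burst length now programmed in SDRAM */

/* Grant the bus to a master; return the bus cycles spent switching
 * the burst mode, zero if the mode already matches. */
static int grant_bus(int master_burst_len) {
    int cost = 0;
    if (master_burst_len != current_burst_len) {
        current_burst_len = master_burst_len;   /* write burst-mode register */
        cost = MODE_SWITCH_CYCLES;
    }
    return cost;
}

int main(void) {
    printf("grant to burst-8 master: %d switch cycles\n", grant_bus(8));
    printf("grant again, same mode:  %d switch cycles\n", grant_bus(8));
    return 0;
}
```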
Chace Christian P.
Kim Matthew
Koninklijke Philips Electronics N.V.
Zawilski Peter