Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories (C711S130000)
Type: Reexamination Certificate (active)
Filed: 1998-02-17
Issued: 2001-01-09
Examiner: Robertson, David L. (Department: 2759)
Patent number: 06173369
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a data processor having a hierarchical memory arrangement including a cache memory to speed data retrieval. More specifically, the present invention relates to an improved system and method for cache control and management which processes out-of-order return data and instructions, and/or multiple fetch requests. According to another aspect of the present invention, sets of cache memory can be checked for multiple requests. Multiple instructions and/or data can then be returned to fulfill more than one request at a time.
2. Related Art
Increases in processor speed have led to the development of memory devices having very fast access times. The cost of such memory devices, however, is often proportionally related to the speed at which the devices can be accessed. Thus, storing all of a processor's data and program instructions in very fast memory can lead to very high memory costs.
To minimize the cost associated with high-speed memory while still reaping the benefits of fast access times, system designers have implemented cache memories. In a cache memory system, the majority of instructions and program data are stored in standard memory such as a disk, hard drive, or low-speed random access memory (RAM). A relatively small amount of high-speed memory, called a cache, is provided to store a subset of the program data and/or instructions. In this patent document, the term data, when used in reference to storage in the cache, refers generally to either instruction execution data or program instructions.
Typically, those data that are most frequently accessed by the processor are stored in the cache. As a result, these data can be accessed by the processor at a much faster rate. Additionally, some systems implement an instruction prefetch, wherein instructions are fetched from low-speed storage in advance and stored in the cache. As a result, the instructions are already in cache when needed by the processor and can be accessed quickly.
System designers frequently implement cache storage to speed access times for data and instructions. In such systems, a cache control unit is often implemented to manage the storage of data in the cache and provide the data to the processor. In these systems, the instruction fetch unit and instruction execution unit go to the cache control unit to request the required data. Upon receipt of a request, the cache control unit first searches the cache for the data requested. If the requested data exist in the cache, the data are provided to the requesting unit from the cache. This condition is known as a cache hit. If the data are not present in the cache, the cache control unit retrieves the data from storage and stores the data in a known location in the cache.
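The hit/miss flow just described can be sketched in C. This is a hedged illustration rather than the patent's design: it assumes a direct-mapped cache with one-word lines, and the names cache, ccu_request, and lower_level_fetch are invented for the example.

#include <stdint.h>
#include <stdbool.h>

#define CACHE_LINES 256

typedef struct {
    bool     valid;   /* line holds live data */
    uint32_t tag;     /* identifies which address occupies the line */
    uint32_t data;
} cache_line_t;

static cache_line_t cache[CACHE_LINES];

/* Assumed stand-in for a slow fetch from DRAM, disk, or an I/O device. */
extern uint32_t lower_level_fetch(uint32_t address);

uint32_t ccu_request(uint32_t address)
{
    uint32_t index = (address >> 2) % CACHE_LINES;  /* the "known location" */
    uint32_t tag   = address >> 10;                 /* remaining address bits */

    if (cache[index].valid && cache[index].tag == tag)
        return cache[index].data;                   /* cache hit */

    /* Cache miss: retrieve from lower-level storage, install in the cache. */
    uint32_t data = lower_level_fetch(address);
    cache[index].valid = true;
    cache[index].tag   = tag;
    cache[index].data  = data;
    return data;
}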
Instruction requests are often handled in sequence according to their order in a particular application program. Bottlenecks and delays occur as outstanding requests, which might otherwise be executed more quickly, wait for preceding slower instructions to be processed.
For example, in multi-processor systems sharing two or more buses, a central processing unit (CPU) stall condition arises when one of the buses is occupied in servicing a particular processor. When data requests or instructions are executed in sequence, the other processors depending on the occupied bus must wait for it to become available before proceeding to process other data requests and instructions, regardless of the availability of other buses.
Delays can further arise when sequential data requests and instructions must wait on storage devices that return data at different rates. For instance, data returning from a faster lower-level storage device such as dynamic random access memory (DRAM) must still wait for preceding data requests made to slower lower-level storage devices such as an input/output (I/O) device.
To improve efficiency and speed in processing program instructions, many contemporary processors are capable of dynamic scheduling of instructions. Dynamic scheduling means that instructions can be retrieved from memory, scheduled, and executed, all in an order that is different from the program order. At a general level, dynamic scheduling allows a pipelined processor to maximize the utilization of its resources by prioritizing the use of the resources among the multiple processors. Such dynamic scheduling, however, does not consider the potential disparity between the storage devices or the resources themselves in executing instructions or data requests. The CPU stalls encountered when a bus is servicing a particular processor are also not overcome.
The inventors have discovered that there is a need for optimizing instruction execution and data requests after a cache miss. In particular, there is a need for accommodating out-of-order data returns and servicing multiple requests to increase the speed and efficiency of data retrieval. In multiple-processor systems sharing multiple buses it is especially desirable to avoid CPU stall. Further, it is desirable to avoid bottlenecks and to quickly and efficiently accommodate data requests made to a variety of lower-level storage devices with different response times.
SUMMARY OF THE INVENTION
The present invention optimizes the processing of data requests and instructions after cache misses. By converting CPU requests into cache control requests, a sequence of multiple requests for data can be sent to various lower-level storage devices or shared buses. These cache control requests include cache control unit identification tags (CCU-ID tags) for identifying individual requests.
Returning data is then received and forwarded to the requesting CPU in the order of return provided by the lower-level storage devices or shared buses.
In this manner, data is retrieved after cache misses more quickly and efficiently than when data requests are processed in sequence. Because the cache control ID tags allow individual requests to be distinguished, multiple requests for data can be pending at the same time. The requested data can also be received “out of order” from lower-level memory, that is, in the order in which the storage devices return the data being sought.
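As a rough illustration of the return path, consider the following minimal C sketch. It assumes a small table of pending requests indexed by CCU-ID; the names pending, ccu_on_return, and cpu_forward are illustrative assumptions, not the patent's terms.

#include <stdint.h>
#include <stdbool.h>

#define MAX_PENDING 8

typedef struct {
    bool     in_use;   /* slot holds an outstanding request */
    uint8_t  cpu_id;   /* CPU-ID tag of the original request */
    uint32_t address;  /* lower-level memory address requested */
} pending_req_t;

/* Table of outstanding misses, filled in when requests are issued. */
extern pending_req_t pending[MAX_PENDING];

/* Assumed hook that hands data back to the requesting CPU. */
extern void cpu_forward(uint8_t cpu_id, uint32_t data);

/* Called whenever any storage device returns data, in any order.
 * The CCU-ID carried with the reply recovers the original requester. */
void ccu_on_return(uint8_t ccu_id, uint32_t data)
{
    cpu_forward(pending[ccu_id].cpu_id, data);
    pending[ccu_id].in_use = false;   /* free the CCU-ID for a new miss */
}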
Out-of-order data return is especially advantageous in a system having numerous memory or storage devices. For example, such a system may include DRAM, read-only memory (ROM), electrically programmable read-only memory (EPROM), disk storage, and I/O devices. Typically, data would be expected to return from DRAM more quickly than from disk or from I/O. Thus, a request from the CPU for data from I/O would not create a bottleneck as in prior systems, since subsequent requests to devices such as a DRAM with a faster rate of return can still be filled.
Conventional systems were limited in their ability to return data out of order because they lacked a technique for tracking the data returned. Retaining the identifying CPU-ID tag with each data request would have greatly increased the number of bits transmitted per request, and such overhead costs were prohibitive. Without the ability to track the data being returned, a conventional cache control unit would not know which data correspond to which request.
According to one embodiment of the invention, a sequence of CPU requests for data not previously found in cache memory, is stored in a request queue. Each CPU request includes a CPU-ID tag identifying the CPU issuing the request for data and an address identifying a location in lower-level memory where the data is stored. Cache-control ID tags are then assigned to identify the locations in the request queue of the respective CPU-ID tags associated with each CPU request.
Cache-control requests consisting of the cache-control ID tags and the respective address information are then sent from the request queue to the lower-level memory or storage devices. For example, the cache-control requests can be sent by a cache control processor to a memory control unit for transmittal to various storage devices.
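A minimal sketch of this issue path in C follows, under the assumption suggested by the embodiment that the CCU-ID is simply the request's slot index in the request queue. The names queue, ccu_issue, and memory_send are illustrative; this is the issue-side counterpart of the return-path sketch above.

#include <stdint.h>
#include <stdbool.h>

#define QUEUE_SLOTS 8

typedef struct {
    bool     in_use;   /* slot holds an outstanding request */
    uint8_t  cpu_id;   /* CPU-ID tag: which CPU issued the request */
    uint32_t address;  /* where the data resides in lower-level memory */
} queue_entry_t;

static queue_entry_t queue[QUEUE_SLOTS];

/* Assumed hook to the memory control unit; only the short CCU-ID and
 * the address travel toward the storage devices. */
extern void memory_send(uint8_t ccu_id, uint32_t address);

/* Queue a CPU request that missed the cache and issue it downstream. */
bool ccu_issue(uint8_t cpu_id, uint32_t address)
{
    for (uint8_t slot = 0; slot < QUEUE_SLOTS; slot++) {
        if (!queue[slot].in_use) {
            queue[slot] = (queue_entry_t){ true, cpu_id, address };
            memory_send(slot, address);   /* CCU-ID = queue location */
            return true;
        }
    }
    return false;   /* queue full: every CCU-ID is outstanding */
}

Note the design consequence: a CCU-ID needs only enough bits to index the queue (three bits for eight slots in this sketch), so far fewer bits travel with each request than if the full CPU-ID tag were carried along, addressing the overhead problem described above.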
Inventors: Hagiwara, Yasuaki; Nguyen, Le Trong
Assignee: Seiko Epson Corporation
Law firm: Sterne Kessler Goldstein & Fox PLLC