Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1999-05-12
2001-04-24
Ellis, Kevin L. (Department: 2185)
Electrical computers and digital processing systems: memory
Storage accessing and control
Hierarchical memories
C711S213000
Reexamination Certificate
active
06223257
ABSTRACT:
BACKGROUND OF THE INVENTION
This invention relates generally to the use of cache memories as part of data processors, and, more specifically, to techniques of generating addresses to fetch instruction data from a cache memory.
Cache memories are used in data processors of various designs to improve the speed with which frequently used data is accessed. A single cache is often utilized for both instruction and user data but separate instruction and data caches are more commonly used in high performance processors. Cache memory is typically integrated with a microprocessor on a single chip. The limited capacity cache memory of a processor is loaded from a main system memory as necessary to make the frequently used data available for fast access by the processor. If data at a particular memory address specified by the processor is not in the cache, a significant number of processing cycles is required to obtain the data from the main memory and either write it into the cache or provide it directly to the processor, or both.
Addresses of instruction data are typically generated in a pipeline having at least two stages, one to calculate an address in one operating cycle and the next to apply that calculated address to the cache in the next operating cycle. Also during the second operating cycle, any data in the cache at that address is typically read out and written to an instruction buffer, and a status signal is returned to indicate whether data is present at that address or not, in terms of a “hit” or “miss.” If a miss, the cache accesses main memory to obtain the data at that address, typically resulting in a delay of many operating cycles before the data becomes available for writing into the instruction buffer. If a hit, it is desired to generate the next address as quickly as possible from the hit address plus the amount of data being returned from the cache at the hit address, preferably in the second operating cycle, in order to minimize the number of operating cycles required generate each address. However, it is difficult to resolve in the second cycle whether the current address has resulted in a hit or miss, in time to be used to generate the next address in the second cycle, resulting in either lengthening the duration of the cycles or waiting until the next cycle after the hit signal is returned in order to generate the next address. The performance of pipelined and other types of processors is adversely affected by such delays.
Therefore, it is a general object of the present invention to provide improved instruction fetch techniques that minimize the number of operating cycles required to address the cache and read instruction data from it.
It is a more specific object of the present invention to improve the speed at which instruction data at an address for which a miss signal is returned is accessed for use by the processor.
SUMMARY OF THE INVENTION
These and other objects are accomplished by the present invention, wherein, according to one aspect of the present invention, the individual addresses of instruction data are generated with the assumption that the full amount of data requested by prior address(es), but not yet returned, will be returned. If this prediction is correct, instruction data is fetched at a much faster rate than when it is first determined whether a particular address hits or misses before the next address is calculated. If incorrect, subsequent addresses calculated before the miss signal is returned from the cache are discarded and later recalculated but this penalty is no worse than in a system that always waits until a hit or miss signal is returned from one address before calculating the next address. The improved technique does not need to know whether a current address has resulted in a hit or not before the next address is calculated. So long as hits are being obtained, the new address is incremented by the amount of cache data that is read at one time, usually a full line. After a miss, however, in an architecture where the width of the bus to the main memory is less than the width of a line of cache data that is read at one time, each new address is preferably incremented for a time by the width of the main memory bus so that the instruction data missing from the cache is made available as soon as it is read from the main memory instead of waiting for a full line of missing cache data to be received.
According to another aspect of the present invention, in an architecture where the internal processor clock has a frequency that is higher than the frequency of the external clock, which is typical in high performance microprocessors, a missed cache address is subsequently regenerated in synchronism with the data first being made available from the main memory. It has been recognized that there are periodically recurring internal clock cycles where data from the main memory is first made available, either through the cache or directly from the main memory bypassing the cache, for writing into the instruction buffer. These internal clock cycles, referred to as “windows of opportunity,” occur once during each external clock cycle, immediately after data is latched onto the system memory bus. By synchronizing the retrieval of instruction data in this way, delays of one or more internal clock cycles to obtain instruction data from the main memory, typical of existing data fetch techniques without such synchronization, are avoided. The result is improved processor performance.
Additional objects, aspects, features and advantages of the present invention are included in the following description of its preferred embodiments, which description should be taken in conjunction with the accompanying drawings.
REFERENCES:
patent: 4943908 (1990-07-01), Emma et al.
patent: 5287487 (1994-02-01), Priem et al.
patent: 5379393 (1995-01-01), Yang
patent: 5499355 (1996-03-01), Krishnamohan et al.
patent: 5991848 (1999-11-01), Koh
patent: 6079002 (2000-06-01), Thatcher et al.
patent: 6085291 (2000-07-01), Hichs et al.
Tabak, D., “Chapter 4—Memory Hierarchy,”Advanced Microprocessors, Second Edition, pp. 43-65, (1995).
Cummins Sean P.
Munson Kenneth K.
Norrie Christopher I. W.
Ornes Matthew D.
Ellis Kevin L.
Oblon & Spivak, McClelland, Maier & Neustadt P.C.
Rise Technology Company
LandOfFree
Instruction cache address generation technique having... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Instruction cache address generation technique having..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Instruction cache address generation technique having... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2455074