Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1998-12-31
2002-07-30
Kim, Matthew (Department: 2186)
C711S154000, C711S168000, C711S169000
active
06427191
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to the field of electronic data processing devices. More particularly, the present invention relates to microprocessor on-chip cache memories.
BACKGROUND OF THE INVENTION
Many computer systems today use cache memories to improve the speed of access to more frequently used data and instructions. A small cache memory may be integrated on the microprocessor chip itself, greatly improving the speed of access by eliminating the need to go outside the microprocessor chip to access data or instructions from an external memory.
During a normal data access, the microprocessor first looks to an on-chip cache memory to see whether the desired data or instructions are resident there. If they are not, the microprocessor then looks to one or more off-chip memories. On-chip memory, or cache memory, is smaller than main memory, so multiple main memory locations may be mapped into the cache memory. The main memory locations, or addresses, which represent the most frequently used data and instructions get mapped into the cache memory. Cache memory entries must contain not only data, but also enough information (“tag address and status” bits) about the address associated with the data to identify which external, or main memory, addresses have been mapped into the cache memory. To improve the likelihood of finding a memory address in the cache (the cache “hit ratio”), it is desirable for cache memories to be set associative, i.e., a particular location in memory may be stored in any of multiple ways in the cache memory.
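The lookup described above can be illustrated with a minimal sketch of a set-associative cache. Everything specific here is an assumption made for illustration and is not taken from the patent: the line size, the number of sets and ways, the names `Entry`, `split_address` and `SetAssociativeCache`, and the naive replacement choice.

```python
from dataclasses import dataclass

# Assumed geometry for illustration only; the text does not specify these.
LINE_BYTES = 64
NUM_SETS = 4
NUM_WAYS = 2


@dataclass
class Entry:
    valid: bool = False  # status bit
    tag: int = 0         # tag address
    data: str = ""       # placeholder for the cached line


def split_address(addr):
    """Split a main-memory address into (tag, set index)."""
    line = addr // LINE_BYTES
    return line // NUM_SETS, line % NUM_SETS


class SetAssociativeCache:
    def __init__(self):
        self.sets = [[Entry() for _ in range(NUM_WAYS)] for _ in range(NUM_SETS)]

    def lookup(self, addr):
        """Return the cached data on a hit, or None on a miss."""
        tag, index = split_address(addr)
        for entry in self.sets[index]:
            if entry.valid and entry.tag == tag:
                return entry.data
        return None

    def fill(self, addr, data):
        """Install a line in an invalid way if one exists, else overwrite way 0."""
        tag, index = split_address(addr)
        ways = self.sets[index]
        victim = next((e for e in ways if not e.valid), ways[0])
        victim.valid, victim.tag, victim.data = True, tag, data


cache = SetAssociativeCache()
cache.fill(0x0000, "A")  # maps to set 0 with tag 0
cache.fill(0x0100, "B")  # maps to set 0 with tag 1, so it takes the other way
```

Both addresses fall into the same set, yet both remain resident because the set has two ways; a direct-mapped cache of the same size would have evicted the first line.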
Most previous cache designs, because of their low operating frequency, could afford a relatively large cache, e.g., a cache which contains both integer data and larger floating point data. In lower frequency microprocessors, a relatively large cache could still have an access latency of a single clock cycle. However, as microprocessor frequencies and instruction issue widths increase, the cache access latency can become greater than two clock cycles.
One approach to improving the performance of an on-chip cache includes dual porting and pipelining the cache. Previous cache designs which are dual-ported and pipelined have complex and costly self-timed circuits to correctly align memory and tag array access. The addition of self-timed circuits expends valuable processor space which could otherwise be used for a larger cache capacity. Moreover, complex control schemes are used in these designs since distinct clock cycles are not allocated to the separate cache functions of “cache lookup” and “data manipulation.”
For the reasons stated above, and for other reasons stated below which will become apparent to those skilled in the art upon reading and understanding the present specification, it is desirable to improve the performance of cache memory.
SUMMARY OF THE INVENTION
The present invention includes a novel cache design that allows two cache requests to be processed simultaneously (dual-ported) and concurrent cache requests to be in-flight (pipelined). The cache design includes a first cache memory stage adapted for cache data access. At least two address ports are coupled to the first cache memory stage. Each address port is adapted to provide an input for a cache address on a first clock cycle of a processor clock signal. The cache design includes a second cache memory stage adapted for cache data manipulation. The second cache memory stage is adapted to receive, in a second clock cycle of the processor clock signal, cache data corresponding to the cache address found in the first cache memory stage. Thus, the design of the cache allocates the first clock cycle to cache tag and data access and the second clock cycle to data manipulation.
In an alternative embodiment, a method for accessing a cache memory is provided. The method includes receiving a first cache address into a first cache memory stage at a first address port in a first clock cycle. A second cache address is received into the first cache memory stage at a second address port in the first clock cycle. A first data set corresponding to the first cache address is provided to a second cache memory stage in a second clock cycle. The method further includes providing a second data set corresponding to the second cache address to the second cache memory stage in the second clock cycle.
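The two-cycle, two-port flow summarized above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the patented circuit: the class name `DualPortedPipelinedCache`, the dictionary standing in for the tag and data arrays, and the low-byte mask used as a stand-in for "data manipulation" are all hypothetical.

```python
class DualPortedPipelinedCache:
    """Behavioral sketch: stage 1 accesses tag and data, stage 2 manipulates data."""

    def __init__(self, contents):
        # address -> data; a stand-in for the tag and data arrays (assumption)
        self.contents = dict(contents)
        # values latched between stage 1 and stage 2, one slot per address port
        self.stage1_latch = [None, None]

    def clock(self, addr_port0=None, addr_port1=None):
        """Advance one clock cycle and return stage 2's output for each port."""
        # Stage 2 operates on whatever stage 1 latched in the previous cycle.
        outputs = [self._manipulate(d) for d in self.stage1_latch]
        # Stage 1 accepts an address on each port in the same cycle.
        self.stage1_latch = [self._access(a) for a in (addr_port0, addr_port1)]
        return outputs

    def _access(self, addr):
        # None models either an idle port or a miss in this simplified sketch.
        return None if addr is None else self.contents.get(addr)

    @staticmethod
    def _manipulate(data):
        # Placeholder manipulation (assumption): keep only the low byte.
        return None if data is None else data & 0xFF


cache = DualPortedPipelinedCache({0x10: 0x1234, 0x20: 0xABCD, 0x30: 0x00FF})
cycle1 = cache.clock(0x10, 0x20)  # two addresses enter stage 1; stage 2 is empty
cycle2 = cache.clock(0x30, None)  # first pair reaches stage 2; a new request enters
cycle3 = cache.clock()            # the pipeline drains
```

In this model both data sets from the first cycle emerge together in the second cycle, while a further request is already occupying the first stage, which is the combination of dual porting and pipelining the summary describes.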
REFERENCES:
patent: 5091845 (1992-02-01), Rubinfeld
patent: 5434989 (1995-07-01), Yamaguchi
patent: 5561781 (1996-10-01), Braceras et al.
patent: 5640534 (1997-06-01), Liu et al.
patent: 5675765 (1997-10-01), Malamy et al.
patent: 5752269 (1998-05-01), Divivier et al.
patent: 5878245 (1999-03-01), Johnson et al.
Cheong Fu John Wai
Mathews Gregory S.
Mulla Dean A.
Intel Corporation
Peugh B. R.
Schwegman Lundberg Woessner & Kluth P.A.