Method and system for early tag accesses for lower-level...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate


Status: active
Patent number: 06427188

ABSTRACT:

BACKGROUND
Prior art cache designs for processors typically implement one or two level caches. More recently, multi-level caches having three or more levels have been designed in the prior art. Of course, it is desirable to have the cache implemented in a manner that allows the processor to access the cache in an efficient manner. That is, it is desirable to have the cache implemented in a manner such that the processor is capable of accessing the cache (i.e., reading from or writing to the cache) quickly so that the processor may be capable of executing instructions quickly and so that dependent instructions can receive data from cache as soon as possible.
An example of a prior art, multi-level cache design is shown in FIG. 1. The exemplary cache design of FIG. 1 has a three-level cache hierarchy, with the first level referred to as L0, the second level referred to as L1, and the third level referred to as L2. Accordingly, as used herein L0 refers to the first-level cache, L1 refers to the second-level cache, L2 refers to the third-level cache, and so on. It should be understood that prior art implementations of multi-level cache design may include more than three levels of cache, and prior art implementations having any number of cache levels are typically implemented in a serial manner as illustrated in FIG. 1. As discussed more fully hereafter, multi-level caches of the prior art are generally designed such that a processor accesses each level of cache in series until the desired address is found. For example, when an instruction requires access to an address, the processor typically accesses the first-level cache L0 to try to satisfy the address request (i.e., to try to locate the desired address). If the address is not found in L0, the processor then accesses the second-level cache L1 to try to satisfy the address request. If the address is not found in L1, the processor proceeds to access each successive level of cache in a serial manner until the requested address is found, and if the requested address is not found in any of the cache levels, the processor then sends a request to the system's main memory to try to satisfy the request.
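The serial probe order described above can be sketched as follows. This is a minimal illustrative model, not the patented mechanism: the function name and the dict-based cache levels are invented for the example.

```python
# Hypothetical sketch: serial lookup through a multi-level cache
# hierarchy. Each level is modeled as a dict mapping addresses to data.

def serial_lookup(address, cache_levels, main_memory):
    """Probe each cache level in series (L0, L1, L2, ...) until a hit."""
    for level in cache_levels:
        if address in level:          # a "hit" at this level
            return level[address]
    # Missed in every level: fall back to the system's main memory.
    return main_memory[address]
```

The key property this models is that each level is consulted only after the previous level has missed, which is exactly the serialization the passage describes.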
Typically, when an instruction requires access to a particular address, a virtual address is provided from the processor to the cache system. As is well-known in the art, such virtual address typically contains an index field and a virtual page number field. The virtual address is input into a translation look-aside buffer (“TLB”) 10 for the L0 cache. The TLB 10 provides a translation from a virtual address to a physical address. The virtual address index field is input into the L0 tag memory array(s) 12. As shown in FIG. 1, the L0 tag memory array 12 may be duplicated N times within the L0 cache for N “ways” of associativity. As used herein, the term “way” refers to a partition of the lower-level cache. For example, the lower-level cache of a system may be partitioned into any number of ways. Lower-level caches are commonly partitioned into four ways. As shown in FIG. 1, the virtual address index is also input into the L0 data array structure(s) (or “memory structure(s)”) 14, which may also be duplicated N times for N ways of associativity. The L0 data array structure(s) 14 comprise the data stored within the L0 cache, which may be partitioned into several ways.
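The decomposition of the virtual address into the two fields mentioned above can be sketched as follows. All field widths here are assumptions for illustration (4 KB pages, 64-byte lines, 256 sets); the patent does not specify them.

```python
# Hypothetical sketch (field widths invented): splitting a virtual
# address into the virtual page number field (sent to the TLB) and the
# index field (sent to the tag and data arrays).

LINE_OFFSET_BITS = 6    # assumed 64-byte cache lines
INDEX_BITS = 8          # assumed 256 sets per way
PAGE_OFFSET_BITS = 12   # assumed 4 KB pages

def split_virtual_address(va):
    """Return (virtual page number, index) as used by the L0 lookup."""
    vpn = va >> PAGE_OFFSET_BITS                                # to TLB 10
    index = (va >> LINE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # to arrays 12, 14
    return vpn, index
```

Because the tag and data arrays are duplicated N times for N ways, the same index is presented to every way's copy of arrays 12 and 14 in parallel.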
The L0 tag 12 outputs a physical address for each of the ways of associativity. That physical address is compared with the physical address output by the L0 TLB 10. These addresses are compared in compare circuit(s) 16, which may also be duplicated N times for N ways of associativity. The compare circuit(s) 16 generate a “hit” signal that indicates whether a match is made between the physical addresses. As used herein, a “hit” means that the data associated with the address being requested by an instruction is contained within a particular cache. As an example, suppose an instruction requests an address for particular data labeled “A.” The data label “A” would be contained within the tag (e.g., the L0 tag 12) for the particular cache (e.g., the L0 cache), if any, that contains that particular data. That is, the tag for a cache level, such as the L0 tag 12, represents the data that is residing in the data array for that cache level. Therefore, the compare circuitry, such as compare circuitry 16, basically determines whether the incoming request for data “A” matches the tag information contained within a particular cache level's tag (e.g., the L0 tag 12). If a match is made, indicating that the particular cache level contains the data labeled “A,” then a hit is achieved for that particular cache level.
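The per-way compare circuits 16 can be sketched as below. This is an illustrative model with invented names: each way's tag entry at the indexed set is compared against the physical address produced by the TLB, yielding one hit signal per way.

```python
# Hypothetical sketch: N-way tag comparison. tag_arrays models the
# N duplicated tag arrays; the result is one boolean hit signal per way.

def compare_ways(tlb_physical_addr, tag_arrays, index):
    """Return N hit signals, one per way of associativity."""
    return [tag_arrays[way][index] == tlb_physical_addr
            for way in range(len(tag_arrays))]
```

At most one signal should be true in a correctly maintained cache, since a given physical address resides in at most one way of a set.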
Typically, the compare circuit(s) 16 generate a single signal for each of the ways, resulting in N signals for N ways of associativity, wherein each such signal indicates whether a hit was achieved for that way. The hit signals (i.e., “L0 way hits”) are used to select the data from the L0 data array(s) 14, typically through a multiplexer (“MUX”) 18. As a result, MUX 18 provides the cache data from the L0 cache if a way hit is found in the L0 tags. If the signals generated from the compare circuitry 16 are all zeros, meaning that there are no hits, then “miss” logic 20 is used to generate an L0 cache miss signal. Such L0 cache miss signal then triggers control to send the memory instruction to the L1 instruction queue 22, which queues (or holds) memory instructions that are waiting to access the L1 cache. Accordingly, if it is determined that the desired address is not contained within the L0 cache, a request for the desired address is then made in a serial fashion to the L1 cache.
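The MUX 18 and miss logic 20 behavior just described can be sketched together. Again this is an illustrative model with invented names: the way-hit signals select which data array's entry is driven out, and an all-zero hit vector enqueues the request for the next level.

```python
# Hypothetical sketch: way-hit MUX plus miss logic. On a hit, the
# hitting way's data is returned; on a miss (all hit signals zero),
# the request is appended to the next level's instruction queue.

def select_or_miss(way_hits, data_arrays, index, next_level_queue, request):
    """MUX the hitting way's data out, or enqueue the request on a miss."""
    for way, hit in enumerate(way_hits):
        if hit:
            return data_arrays[way][index]   # MUX 18 output
    next_level_queue.append(request)         # miss logic 20 -> L1 queue 22
    return None
```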
In turn, the L1 instruction queue 22 feeds the physical address index field for the desired address into the L1 tag(s) 24, which may be duplicated N times for N ways of associativity. The physical address index is also input to the L1 data array(s) 26, which may also be duplicated N times for N ways of associativity. The L1 tag(s) 24 output a physical address for each of the ways of associativity to the L1 compare circuit(s) 28. The L1 compare circuit(s) 28 compare the physical address output by the L1 tag(s) 24 with the physical address output by the L1 instruction queue 22. The L1 compare circuit(s) 28 generate an L1 hit signal for each of the ways of associativity indicating whether a match between the physical addresses was made for any of the ways of L1. Such L1 hit signals are used to select the data from the L1 data array(s) 26 utilizing MUX 30. That is, based on the L1 hit signals input to MUX 30, MUX 30 outputs the appropriate L1 cache data from the L1 data array(s) 26 if a hit was found in the L1 tag(s) 24. If the L1 way hits generated from the L1 compare circuitry 28 are all zeros, indicating that there was no hit generated in the L1 cache, then a miss signal is generated from the “miss” logic 32. Such an L1 cache miss signal generates a request for the desired address to the L2 cache structure 34, which is typically implemented in a similar fashion as discussed above for the L1 cache. Accordingly, if it is determined that the desired address is not contained within the L1 cache, a request for the desired address is then made in a serial fashion to the L2 cache. In the prior art, additional levels of hierarchy may be added after the L2 cache, as desired, in a similar manner as discussed above for levels L0 through L2 (i.e., in a manner such that the processor accesses each level of the cache in series, until an address is found in one of the levels of cache). Finally, if a hit is not achieved in the last level of cache (e.g., L2 of FIG. 1), then the memory request is sent to the processor system bus to access the main memory of the system.
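One consequence of this serial organization can be made concrete with a small latency model. The cycle counts below are invented for illustration (the patent gives none): on a chain of misses, each level's full access latency is paid before the next level is even consulted.

```python
# Hypothetical sketch (cycle counts invented): accumulated latency of a
# serial multi-level lookup. hit_level is the index of the level that
# hits, or None if the request falls through to main memory.

def serial_latency(hit_level, level_latencies, memory_latency):
    """Total cycles for a request that hits at hit_level."""
    total = 0
    for i, cycles in enumerate(level_latencies):
        total += cycles            # this level is fully probed first
        if hit_level == i:
            return total
    return total + memory_latency  # missed every level
```

For assumed latencies of 2, 8, and 30 cycles for L0 through L2 and 200 cycles for main memory, an L2 hit costs 40 cycles and a full miss costs 240, since every earlier probe is on the critical path.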
Multi-level cache designs of the prior art are problematic in that such designs require each level of cache to be accessed in series until a “hit” is achieved.
