Electrical computers and digital processing systems: memory – Storage accessing and control – Control technique
Reexamination Certificate
1997-12-29
2001-03-27
Ellis, Kevin L. (Department: 2751)
C711S141000
active
06209068
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an improved read line buffer for cache systems of a processor and to a communication protocol in support of such a read line buffer.
2. Related Art
In the electronic arts, processors are being integrated into multiprocessor designs with increasing frequency. A block diagram of such a system is illustrated in FIG. 1. There, a plurality of agents 10-40 are provided in communication with each other over an external bus 50. The agents may be processors, cache memories or input/output devices. Data is exchanged among the agents in a bus transaction.
A transaction is a set of bus activities related to a single bus request. For example, in the known Pentium Pro processor, commercially available from Intel Corporation, a transaction proceeds through six phases:
Arbitration, in which an agent becomes the bus owner,
Request, in which a request is made identifying an address,
Error, in which errors in the request phase are identified,
Snoop, in which cache coherency checks are made,
Response, in which the failure or success of the transaction is indicated, and
Data, in which data may be transferred.
Other processors may support transactions in other ways.
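As an illustrative sketch (not from the patent itself), the six-phase sequence described above could be modeled as an ordered enumeration, with the names taken from the list above:

```python
from enum import Enum

class Phase(Enum):
    """Phases of a Pentium Pro style bus transaction, in order (illustrative)."""
    ARBITRATION = 1  # an agent becomes the bus owner
    REQUEST = 2      # a request identifying an address is made
    ERROR = 3        # errors in the request phase are identified
    SNOOP = 4        # cache coherency checks are made
    RESPONSE = 5     # failure or success of the transaction is indicated
    DATA = 6         # data may be transferred

# The pipeline order is simply the ascending enumeration order.
PHASE_ORDER = sorted(Phase, key=lambda p: p.value)
```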
In multiple agent systems, the external bus 50 may be a pipelined bus. In a pipelined bus, several transactions may progress simultaneously provided the transactions are in mutually different phases. Thus, a first transaction may be started at the arbitration phase while a snoop response of a second transaction is being generated and data is transferred according to a third transaction. However, a given transaction generally does not “pass” another in the pipeline.
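A minimal sketch of this in-order pipelining constraint (the function and data structure are hypothetical, not from the patent): each in-flight transaction advances one phase per step, and no transaction may enter a phase that is still occupied, so none ever passes the one ahead of it.

```python
PHASES = ["arbitration", "request", "error", "snoop", "response", "data"]

def advance(pipeline):
    """Advance each in-flight transaction one phase per step.

    `pipeline` maps a transaction id to its current phase index. A
    transaction only moves if the next phase slot is unoccupied, so no
    transaction ever passes the one ahead of it. Returns the ids of
    transactions that completed the data phase this step.
    """
    completed = []
    # Visit transactions from most-advanced to least-advanced so each
    # one sees whether the slot ahead of it has just been vacated.
    for txn in sorted(pipeline, key=pipeline.get, reverse=True):
        nxt = pipeline[txn] + 1
        if nxt >= len(PHASES):
            completed.append(txn)
            del pipeline[txn]
        elif nxt not in pipeline.values():
            pipeline[txn] = nxt
    return completed
```

For example, with one transaction in the data phase, one in the snoop phase and one in arbitration, a single step completes the first and moves the other two forward in order.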
Cache coherency is an important feature of a multiple agent system. If an agent is to operate on data, it must confirm that the data it will read is the most current copy of the data that is available. In such multiple agent systems, several agents may operate on data from a single address. Oftentimes when a first agent 10 desires to operate on data at an address, a second agent 30 may have cached a copy of the data that is more current than the copy resident in an external cache. The first agent 10 should read the data from the second agent 30 rather than from the external cache 40. Without a means to coordinate among agents, an agent 10 may perform a data operation on stale data.
In a snoop phase, the agents coordinate to maintain cache coherency. In the snoop phase, each of the other agents 20-40 reports whether it possesses a copy of the data or whether it possesses a modified (“dirty”) copy of the data at the requested address. In the Pentium Pro, an agent indicates that it possesses a copy of the data by asserting a HIT# pin in a snoop response. It indicates that it possesses a dirty copy of the requested data by asserting a HITM# pin. If dirty data exists, it is more current than the copy in memory. Thus, dirty data will be read by an agent 10 from the agent 20 possessing the dirty copy. Non-dirty data is read by an agent 10 from memory. Only an agent that possesses a copy of data at the requested address drives a snoop response; if an agent does not possess such a copy, it generates no response.
A snoop response is expected from all agents 10-40 within a predetermined period of time. Occasionally, an agent 30 cannot respond to another agent's request before the period closes. When this occurs, the agent 30 may generate a “snoop stall response” that indicates that the requesting agent 10 must wait beyond the period for snoop results. In the Pentium Pro processor, the snoop stall signal occurs when an agent 30 toggles outputs HIT# and HITM# from high to low in unison.
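The snoop signaling just described might be summarized, in a hypothetical sketch, as follows. The real stall indication is temporal (both pins toggling in unison); this simplified model treats the sampled state of both pins asserted together as the stall case.

```python
def interpret_snoop(hit, hitm):
    """Interpret a sampled snoop response from the HIT#/HITM# pins.

    Simplified model, not the exact electrical protocol: both pins
    asserted together stands in for the toggled-in-unison stall signal.
    """
    if hit and hitm:
        return "stall"   # requester must wait beyond the snoop window
    if hitm:
        return "dirty"   # read the line from the agent holding the dirty copy
    if hit:
        return "clean"   # a copy exists elsewhere; read the line from memory
    return "none"        # no agent holds the line; read from memory
```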
FIG. 2 illustrates components of a bus sequencing unit (“BSU”) 100 and a core 200 within a processor 10 as are known in the art. The BSU 100 manages transaction requests generated within the processor 10 and interfaces the processor 10 to the external bus 50. The core 200 executes micro operations (“UOPs”), such as the processing operations that are required to execute software programs.
The BSU 100 is populated by a bus sequencing queue 140 (“BSQ”), an external bus controller 150 (“EBC”), a read line buffer 160 and a snoop queue 170. The BSQ 140 processes requests generated within the processor 10 that must be referred to the external bus 50 for completion. The EBC 150 drives the bus to implement requests. It also monitors transactions initiated by other agents on the external bus 50. The snoop queue 170 monitors snoop requests made on the external bus 50, polls various components within processor 10 regarding the snoop request and generates snoop results therefrom. The snoop results indicate whether the responding agent possesses non-dirty data, dirty data or is snoop stalling. Responsive to the snoop results, the EBC 150 asserts the result on the external bus.
As noted, the BSQ 140 monitors requests generated from within the processor 10 to be referred to the external bus 50 for execution. An example of one such request is a read of data from external memory to the core 200. “Data” may represent either an instruction to be executed by the core or variable data representing input to such an instruction. The BSQ 140 passes the request to the EBC 150 to begin a transaction on the external bus 50. The BSQ 140 includes a buffer memory 142 that stores the requests tracked by the BSQ 140. The number of registers 142a-h in memory 142 determines how many transactions the BSQ 140 may track simultaneously.
The EBC 150 tracks activity on the external bus 50. It includes a pin controller 152 that may drive data on the external bus 50. It includes an in-order queue 154 that stores data that is asserted on the bus at certain events. For example, snoop results to be asserted on the bus during a snoop phase may be stored in the in-order queue 154. The EBC 150 interfaces with the snoop queue 170 and BSQ 140 to accumulate data to be asserted on the external bus 50.
During the data phase of a transaction, data is read from the external bus 50 into the read line buffer 160. The read line buffer 160 is an intermediate storage buffer, having a memory 162 populated by its own number of registers 162a-h. The read line buffer 160 provides for storage of data read from the external bus 50. The read line buffer 160 stores the data only temporarily; it is routed to another destination such as a cache 180 in the BSU 100, a data cache 210 in the core or an instruction cache 220 in the core. Data read into a read line buffer storage entry 162a is cleared when its destination becomes available.
There is a one-to-one correspondence between read line buffer entries 162a-h and BSQ buffer entries 142a-h. Thus, data from a request buffered in BSQ entry 142a will be read into buffer entry 162a. For each request buffered in BSQ buffer 142, data associated with the request is buffered in the buffer memory 162 in the read line buffer 160.
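The fixed coupling between the two buffers can be sketched as follows (the entry count matches the a-h entries described above; the function name is illustrative):

```python
DEPTH = 8  # entries a-h in both the BSQ buffer 142 and read line buffer memory 162

bsq_buffer = [None] * DEPTH        # pending requests, one per entry
read_line_buffer = [None] * DEPTH  # data for those requests, same index

def deliver_data(entry, data):
    """Data for the request tracked in BSQ entry i always lands in
    read line buffer entry i; no other entry may receive it."""
    if bsq_buffer[entry] is None:
        raise ValueError("no request is tracked in that BSQ entry")
    read_line_buffer[entry] = data
```

Because the index is shared, the two buffers must have equal depth, which is exactly the constraint the following paragraph criticizes.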
The one-to-one correspondence between the depth of the BSQ buffer 142 and the read line buffer 160 is inefficient. Read line buffer utilization is very low. The read line buffer 160 operates at a data rate associated with the BSU 100 and the core 200, which is much higher than a data rate of the external bus 50. Thus, data is likely to be read out of the read line buffer 160 faster than the bus 50 can provide data to it. The one-to-one correspondence of BSQ buffer entries to the read line buffer entries is unnecessary. Also, the read line buffer storage entries 162a-h consume a significant amount of area when the processor is fabricated as an integrated circuit.
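A rough back-of-the-envelope model makes the utilization point concrete (the numbers and the formula are purely illustrative, not from the patent): if the bus takes many cycles to fill an entry but the core drains it in a few, each entry holds live data only a small fraction of the time, so on average far fewer entries are occupied than the BSQ depth provides.

```python
def expected_busy_entries(fill_cycles, drain_cycles, depth):
    """Crude steady-state estimate: each entry holds drainable data for
    roughly drain_cycles out of every fill_cycles, so the average number
    of simultaneously occupied entries scales by that ratio."""
    return depth * drain_cycles / fill_cycles
```

For instance, with 8 entries, a 32-cycle bus fill and a 4-cycle core drain, only about one entry is occupied at a time, suggesting a much shallower read line buffer would suffice.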
It is desired to increase the depth of buffers in the BSQ 140. In the future, latency between the request phase and the data phase of transactions on the external bus 50 is expected to increase. External buses 50 will become more pipelined.
Inventors: Bachand Derek T.; Fisch Matthew A.; Hill David L.; Prudvi Chinna
Assignee: Intel Corporation
Attorney: Kenyon & Kenyon