Electrical computers and digital processing systems: processing – Instruction alignment
Reexamination Certificate
1998-12-10
2001-09-18
Donaghue, Larry D. (Department: 2154)
C712S023000, C712S215000
active
06292882
ABSTRACT:
FIELD OF THE INVENTION
The invention relates in general to the field of digital systems and, more particularly, to filtering within digital systems. Specifically, the invention relates to a method and apparatus to filter instructions within a digital system.
DESCRIPTION OF THE RELATED ART
With the growing complexity of modern computer systems, designers are constantly seeking more efficient methods to increase the speed at which instructions are executed within a computer system. Many modern computer systems utilize parallel processing, which allows the simultaneous execution of numerous instructions. An instruction (e.g., a macro-instruction) may be decoded into one or more simple micro-operations (e.g., micro-instructions) that are sometimes referred to as “μops” or “uops”. It is these micro-operations that may be simultaneously executed in a parallel processor.
FIG. 1 illustrates a portion of a processor 100 with a front end 105 that receives instructions, e.g., macro-instructions. The instructions are sent to a conversion stage 115 via the path 108. The conversion stage 115 may be designed to convert the macro-instructions on the line 108 to simple micro-operations that are sent to the back end 110 via the path 109. The conversion stage 115 may include an expansion circuit 120, a decoded number generator 124, and a buffer 122, as well as other components to provide the desired conversion. When an instruction (e.g., a macro-instruction) enters the conversion stage 115, the instruction is generally sent to a stage where the instruction is identified in a table. For each instruction within this table, there may be several corresponding complex micro-operations, also referred to as “cuops”.
Typically, one macro-instruction may generate up to three complex micro-operations, two of which may be further expanded. Generally, the complex micro-operations are sent to the expansion circuit 120 where one or more simple or expanded micro-operations (“euops”) are generated from each of the complex micro-operations. For example, the expansion circuit 120 may generate as many as four euops from one complex micro-operation. Often, the other complex micro-operation translates directly into an euop.
Still referring to FIG. 1, these euops may be stored in a buffer 122 until needed by the other components. The decoded number generator 124 enables the euops to be manipulated faster by the other components by converting information related to the euops, such as their associated addresses, to decoded numbers. Generally, a decoded number is a converted binary number comprising all zeros and only one “1”. For example, the 8-bit binary number 00000011 (i.e., decimal number 3) corresponds to decoded number 00001000. Generally, in a decoded number, a one in the least significant bit represents decimal 0, a one in the next bit represents decimal 1, a one in the next bit represents decimal 2, a one in the next bit represents decimal 3, and so on. Therefore, as the decimal number increases, the number of bits required to represent the decoded number increases.
For example, the binary number 0011 may correspond to the address for an euop to be stored in the buffer 122. The binary number 0011 would be converted to the decoded number 1000 after traversing the decoded number generator 124. The euop is written to the buffer address corresponding to the decoded address which, here, is 1000.
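The binary-to-decoded conversion described above can be sketched in software as follows. This is only an illustration of the number format, not the circuit of the decoded number generator 124; the function names `to_decoded` and `from_decoded` and the use of bit strings are assumptions made for this example.

```python
def to_decoded(value: int, width: int) -> str:
    """Return the decoded form of `value`: all zeros and a single 1.

    The least significant bit stands for decimal 0, the next bit for
    decimal 1, and so on, so `width` bits can represent 0..width-1.
    (Hypothetical helper; the name and string output are assumptions.)
    """
    if not 0 <= value < width:
        raise ValueError("a decoded number needs one bit per representable value")
    return format(1 << value, f"0{width}b")


def from_decoded(bits: str) -> int:
    """Recover the decimal value from a decoded bit string."""
    if set(bits) - {"0", "1"} or bits.count("1") != 1:
        raise ValueError("a decoded number contains exactly one '1'")
    return len(bits) - 1 - bits.index("1")


print(to_decoded(0b00000011, 8))  # 00001000
print(to_decoded(0b0011, 4))      # 1000
```

Note that a 4-bit decoded number can represent only the values 0 through 3, which is the growth in bit count that the text points out.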
FIG. 2 is an enlarged diagram of the expansion circuit 120 and the buffer 122. The expansion circuit 120 includes two individual expanders 200, 210 that are designed to receive micro-operations including cuops along the paths 220, 224, respectively. One skilled in the art will appreciate that each path 220, 224 may consist of multiple lines. The expanders 200, 210 are configured to apply a plurality of euops to the lines 201-205, 206-209, respectively, in response to expanding the micro-operations applied to the paths 220, 224, respectively.
Typically, a logic circuit is placed between the expansion circuit 120 and the buffer 122. Here, the logic circuit is placed between the decoded number generator 124 and the buffer 122. The logic circuit provides the addressing desired in order for the euops applied to the lines 201-209 to be properly stored in the buffer 122. This logic circuit is shown in FIG. 2 as a buffer address circuit 230, which receives and routes euops to a particular location in the buffer 122 via the lines 231-239. The buffer address circuit 230 receives decoded addresses from the decoded number generator 124 and routes the corresponding euops to the entry in the buffer 122 indicated by the decoded address. Typically, buffers utilize head pointers that indicate the starting position (e.g., the cell) where the next set of data should be written. When data is written (i.e., stored), it normally begins at the cell indicated by the head pointer and proceeds down sequentially (e.g., head pointer plus one, head pointer plus two, etc.). The buffer address circuit 230 receives the head pointer along the line 240 and uses this information to assign cells in which to store the euops.
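As a rough software analogy of the head-pointer addressing just described, the sketch below writes a set of euops into consecutive cells starting at the head pointer. The list-based buffer, the function name `write_at_head`, and the returned next head position are assumptions made for illustration; wrap-around at the end of the buffer is not modeled because the text does not describe it.

```python
from typing import List, Optional


def write_at_head(buffer: List[Optional[str]], head: int, euops: List[str]) -> int:
    """Store euops at head, head + 1, head + 2, ... and return the next head.

    Hypothetical model: `buffer` is a plain list of cells, and returning
    the advanced head pointer is an assumption of this sketch.
    """
    if head + len(euops) > len(buffer):
        raise IndexError("not enough cells after the head pointer")
    for offset, euop in enumerate(euops):
        buffer[head + offset] = euop  # cell = head pointer plus offset
    return head + len(euops)


cells: List[Optional[str]] = [None] * 12
next_head = write_at_head(cells, 2, ["euop_a", "euop_b", "euop_c"])
print(cells[2:5], next_head)
```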
Still referring to FIG. 2, if the micro-operations applied to the paths 220, 224 expand to five and four euops, respectively, each of the cells 251-259 would contain valid data. Therefore, when the cells 251-259 are read, each of them would include valid data corresponding to the expanded micro-operations (euops) received on the lines 201-209.
If, however, the micro-operation received on the path 220 is expanded into only two simplified micro-operations (euops) that are applied to the lines 201-202, the lines 203-205 will remain unused. Assuming that everything else is identical to the previous example, the cells 251-252, 256-259 would contain the latest set of data. Because the lines 203, 204, and 205 were not used (since the corresponding micro-operations could not be expanded), the cells 253, 254 and 255 would contain invalid data. This invalid data is hidden in between the valid data found in the cells 251-252, 256-259. Therefore, when the values in cells 251-259 are read out, invalid data in the cells 253, 254 and 255 is sent to other portions of the system along with the correct data. The transmission of invalid data may cause other stages in the computer system to malfunction. In addition, these holes (i.e., non-utilized portions of the buffer) may also be problematic. Larger buffers may be needed in order to account for the resulting holes. This increases the cost and decreases the performance of the system. Each of these problems may be intensified as the number of non-utilized cells increases. Thus, FIG. 2 shows what would happen if valid and invalid micro-operations were written to consecutive address locations in the buffer 122.
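The hole problem can be reproduced with a small model, assuming the nine lines 201-209 map one-to-one onto the cells 251-259. The names `write_unfiltered` and `hole_cells` are hypothetical, and `None` merely stands in for an unused line; in hardware the corresponding cell would hold invalid data rather than an empty marker.

```python
from typing import List, Optional

FIRST_CELL = 251  # cell numbering taken from FIG. 2


def write_unfiltered(lines: List[Optional[str]]) -> List[Optional[str]]:
    """Copy every line to its own consecutive cell, valid or not."""
    return list(lines)


def hole_cells(cells: List[Optional[str]]) -> List[int]:
    """Return the reference numerals of cells that hold no valid euop."""
    return [FIRST_CELL + i for i, cell in enumerate(cells) if cell is None]


# First micro-operation expands to two euops (lines 201-202 used,
# lines 203-205 unused); the second expands to four (lines 206-209).
lines = ["e201", "e202", None, None, None, "e206", "e207", "e208", "e209"]
cells = write_unfiltered(lines)
print(hole_cells(cells))  # [253, 254, 255]
```

Reading the cells 251-259 in order would then deliver the three invalid entries along with the six valid ones, which is the situation the paragraph above warns about.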
Some prior methods filtered the invalid data from the valid data prior to writing the valid data to the buffer. This eliminated the holes in the buffer but required extra logic, and thus extra chip space, to accomplish this result.
The prior method placed valid data in consecutive entries in the buffer, starting from the head pointer location, by using datapath muxing. For example, in a system producing nine uops per clock cycle, some valid, some invalid, a 9:1 mux could be used to select the first valid uop in the input set of uops 1-9. A second 8:1 mux could be used to select the second valid uop. A third 7:1 mux could be used to select the third valid uop. A fourth 6:1 mux could be used to select the fourth valid uop. A fifth 5:1 mux could be used to select the fifth valid uop. A sixth 4:1 mux could be used to select the sixth valid uop. A seventh 3:1 mux could be used to select the seventh valid uop. An eighth 2:1 mux could be used to select the eighth valid uop, and the ninth uop could be sent directly to a logic circuit to determine whether the ninth uop was valid.
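A behavioral sketch of this mux cascade is given below. It models only the selection result (the k-th output receives the k-th valid uop), not the gate-level structure of the prior method. The function names and the use of a parallel list of valid flags are assumptions made for illustration.

```python
from typing import List, Optional


def mux_widths(n: int) -> List[int]:
    """Input counts of the cascaded muxes: n:1, (n-1):1, ..., 2:1."""
    return [n - k for k in range(n - 1)]


def select_kth_valid(uops: List[str], valid: List[bool], k: int) -> Optional[str]:
    """Model the k-th mux (k = 0 is the first): return the k-th valid uop.

    The k-th valid uop can never sit before input position k, which is
    why each successive mux needs one input fewer.
    """
    count = 0
    for uop, is_valid in zip(uops, valid):
        if is_valid:
            if count == k:
                return uop
            count += 1
    return None  # fewer than k + 1 valid uops this cycle


def compact(uops: List[str], valid: List[bool]) -> List[str]:
    """Collect the valid uops in order, as consecutive buffer entries."""
    selected = (select_kth_valid(uops, valid, k) for k in range(len(uops)))
    return [uop for uop in selected if uop is not None]


uops = [f"uop{i}" for i in range(1, 10)]
valid = [True, True, False, False, False, True, True, True, True]
print(mux_widths(9))         # [9, 8, 7, 6, 5, 4, 3, 2]
print(compact(uops, valid))  # ['uop1', 'uop2', 'uop6', 'uop7', 'uop8', 'uop9']
```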
Datapath muxing, however, presents problem
Khan Umair A.
Zaidi Nazar A.
Blakely , Sokoloff, Taylor & Zafman LLP
Donaghue Larry D.
Intel Corporation