Electrical computers and digital processing systems: processing – Processing control – Arithmetic operation instruction processing
Reexamination Certificate
1998-03-31
2001-02-20
Treat, William M. (Department: 2783)
active
06192467
ABSTRACT:
FIELD OF THE INVENTION
The invention relates generally to the field of computer systems. More particularly, the invention relates to a method and apparatus for efficiently executing partial-width packed data instructions, such as scalar packed data instructions, by a processor that makes use of SIMD technology, for example.
BACKGROUND OF THE INVENTION
Multimedia applications such as 2D/3D graphics, image processing, video compression/decompression, voice recognition algorithms and audio manipulation often require the same operation to be performed on a large number of data items (referred to as “data parallelism”). Each type of multimedia application typically implements one or more algorithms requiring a number of floating point or integer operations, such as ADD or MULTIPLY (hereafter MUL). By providing macro instructions whose execution causes a processor to perform the same operation on multiple data items in parallel, Single Instruction Multiple Data (SIMD) technology, such as that employed by the Pentium® processor architecture and the MMX™ instruction set, has enabled a significant improvement in multimedia application performance (Pentium® and MMX™ are registered trademarks or trademarks of Intel Corporation of Santa Clara, Calif.).
SIMD technology is especially suited to systems that provide packed data formats. A packed data format is one in which the bits in a register are logically divided into a number of fixed-sized data elements, each of which represents a separate value. For example, a 64-bit register may be broken into four 16-bit elements, each of which represents a separate 16-bit value. Packed data instructions may then separately manipulate each element in these packed data types in parallel.
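The 64-bit example above can be illustrated with a minimal C sketch. The helper names `pack4x16` and `extract16` are hypothetical, and placing element 0 in the lowest-order bits is an assumption made for illustration; the text does not fix an element ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack four 16-bit values into one 64-bit word.
   Assumption: element 0 occupies the lowest-order 16 bits. */
static inline uint64_t pack4x16(uint16_t e0, uint16_t e1,
                                uint16_t e2, uint16_t e3)
{
    return (uint64_t)e0
         | ((uint64_t)e1 << 16)
         | ((uint64_t)e2 << 32)
         | ((uint64_t)e3 << 48);
}

/* Hypothetical helper: read back element `index` (0..3) as a
   separate 16-bit value. */
static inline uint16_t extract16(uint64_t packed, unsigned index)
{
    assert(index < 4);
    return (uint16_t)(packed >> (16 * index));
}
```

Under these assumptions, `pack4x16(1, 2, 3, 4)` yields a single 64-bit word from which `extract16(word, 2)` recovers the value 3.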
Referring to FIG. 1, an exemplary packed data instruction is illustrated. In this example, a packed ADD instruction (e.g., a SIMD ADD) adds corresponding data elements of a first packed data operand, X, and a second packed data operand, Y, to produce a packed data result, Z, i.e., $X_0 + Y_0 = Z_0$, $X_1 + Y_1 = Z_1$, $X_2 + Y_2 = Z_2$, and $X_3 + Y_3 = Z_3$. Packing many data elements within one register or memory location and employing parallel hardware execution allows SIMD architectures to perform multiple operations at a time, resulting in significant performance improvement. For instance, in this example, four individual results may be obtained in the time previously required to obtain a single result.
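The semantics of the packed ADD described above can be modeled in portable C. This is only a software sketch of the element-wise behavior, not the parallel hardware itself: the function name `packed_add16` is hypothetical, the four 16-bit lanes in a 64-bit word are borrowed from the earlier register example, and wrap-around on overflow is an assumption (the text does not say whether the ADD wraps or saturates).

```c
#include <stdint.h>

/* Hypothetical model of a packed ADD on four 16-bit elements.
   Each lane i computes Z_i = X_i + Y_i independently. */
static inline uint64_t packed_add16(uint64_t x, uint64_t y)
{
    uint64_t z = 0;
    for (unsigned i = 0; i < 4; ++i) {
        unsigned shift = 16 * i;
        uint16_t xi = (uint16_t)(x >> shift);
        uint16_t yi = (uint16_t)(y >> shift);
        /* Assumption: the sum wraps modulo 2^16, so no carry
           leaks into the neighboring lane. */
        uint16_t zi = (uint16_t)(xi + yi);
        z |= (uint64_t)zi << shift;
    }
    return z;
}
```

The loop visits the lanes one after another only because this is a software model; a SIMD execution unit would produce all four sums at once.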
While the advantages achieved by SIMD architectures are evident, there remain situations in which it is desirable to return individual results for only a subset of the packed data elements.
SUMMARY OF THE INVENTION
A method and apparatus are described for executing partial-width packed data instructions. According to one aspect of the invention, a processor includes a plurality of registers, a register renaming unit coupled to the plurality of registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands, each of which includes a plurality of data elements. The decoder is configured to decode a first and a second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set specifies operations to be performed on all of the data elements stored in the one or more specified registers. In contrast, each of the instructions in the second set specifies operations to be performed on only a subset of the data elements stored in the one or more specified registers. The partial-width execution unit is configured to execute operations specified by either the first or the second set of instructions.
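The difference between the two instruction sets can be sketched in C as a single routine that operates on either all of the elements or only a subset. Everything here is illustrative: the name `partial_add16` and the `count` parameter are hypothetical, the four 16-bit lanes are carried over from the background example, and copying the untouched elements from the first operand is an assumption, since the summary does not state what happens to elements outside the subset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: add only the lowest `count` 16-bit elements
   of x and y. count == 4 models a full-width instruction (first set);
   count == 1 models a scalar, partial-width instruction (second set).
   Assumption: elements outside the subset are passed through from x. */
static inline uint64_t partial_add16(uint64_t x, uint64_t y, unsigned count)
{
    assert(count <= 4);
    uint64_t z = x;  /* start from x so untouched lanes pass through */
    for (unsigned i = 0; i < count; ++i) {
        unsigned shift = 16 * i;
        uint16_t zi = (uint16_t)((uint16_t)(x >> shift) +
                                 (uint16_t)(y >> shift));
        z &= ~((uint64_t)0xFFFF << shift);  /* clear lane i */
        z |= (uint64_t)zi << shift;         /* write the lane's sum */
    }
    return z;
}
```

With `count` set to 1, only element 0 of the result differs from the first operand; with `count` set to 4, the routine behaves like an ordinary packed ADD over all four elements.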
Other features and advantages of the invention will be apparent from the accompanying drawings and from the detailed description.
Abdallah Mohammad A.
Coke James
Pentkovski Vladimir
Blakely, Sokoloff, Taylor & Zafman LLP
Intel Corporation
Treat William M.