Scalar hardware for performing SIMD operations

Electrical computers and digital processing systems: processing – Processing control – Arithmetic operation instruction processing

Reexamination Certificate

Details

C708S501000

active

06292886

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates to systems for processing data and, in particular, to systems for processing data through single-instruction multiple data (SIMD) operations.
2. Background Art
Processor designers are always looking for ways to enhance the performance of microprocessors. Processing multiple operands in parallel provides one avenue for gaining additional performance from today's highly optimized processors. In certain common mathematical calculations and graphics operations, the same operation or operations are performed repeatedly on each of a large number of operands. For example, in matrix multiplication, the row elements of a first matrix are multiplied by corresponding column elements of a second matrix and the resulting products are summed (multiply-accumulate). By providing appropriate scheduling and execution resources, multiply-accumulate operations may be implemented concurrently on multiple sets of row-column operands. This approach is known as vector processing or single instruction, multiple data stream (SIMD) processing to distinguish it from scalar or single instruction, single data stream (SISD) processing.
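The multiply-accumulate pattern described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of any hardware in the patent; the function names `mac` and `matmul` are hypothetical. Each (row, column) pair forms an independent chain of multiply-accumulate steps, which is why several such chains may be processed concurrently.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: add the product a*b to a running sum."""
    return acc + a * b


def matmul(A, B):
    """Multiply matrices given as lists of rows, using only mac() steps."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # The running sum for C[i][j] depends only on row i of A and
            # column j of B, so different (i, j) pairs are independent.
            for k in range(inner):
                C[i][j] = mac(C[i][j], A[i][k], B[k][j])
    return C


product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Here the loops run sequentially; a SIMD implementation would issue the same `mac` step on several row-column operand sets at once.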
In order to implement SIMD operations efficiently, data is typically provided to the execution resources in a “packed” data format. For example, a 64-bit processor may operate on a packed data block, which includes two 32-bit operands. In this example, a vector FMAC instruction, FMAC (f1, f2, f3), multiplies each of a pair of 32-bit operands stored in register f1 with a corresponding pair of 32-bit entries stored in register f2 and adds the resulting products to a pair of running sums stored in register f3. In other words, data is stored in the registers f1, f2, and f3 in a packed format that provides two operands from each register entry. If the processor has sufficient resources, it may process two or more packed data blocks, e.g. four or more 32-bit operands, concurrently. The 32-bit operands are routed to different execution units for processing in parallel and subsequently repacked, if necessary.
Even in graphics-intensive and scientific programming, not all operations are SIMD operations. Much of the software executed by general-purpose processors comprises instructions that perform scalar operations. That is, each source register specified by an instruction stores one operand and each target register specified by the instruction receives one operand. In the above example, a scalar floating point multiply-accumulate instruction, S-FMA (f1, f2, f3), may multiply a single 64-bit operand stored in register f1 with a corresponding 64-bit operand stored in register f2 and add the product to a running sum stored in register f3. Each operand processed by the S-FMA instruction is provided to the FMAC unit in an unpacked format.
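For contrast with the packed case, the scalar instruction can be sketched the same way, again modelling a register entry as a 64-bit integer but now interpreting all 64 bits as one double-precision operand. The names `to_bits`, `from_bits`, and `s_fma` are hypothetical, and the product and sum are rounded separately here, unlike a fused hardware operation.

```python
import struct


def to_bits(x):
    """Store one double-precision operand as a 64-bit register image."""
    return int.from_bytes(struct.pack("<d", x), "little")


def from_bits(reg):
    """Read a 64-bit register image as a single unpacked operand."""
    return struct.unpack("<d", reg.to_bytes(8, "little"))[0]


def s_fma(f1, f2, f3):
    """Scalar multiply-accumulate: one operand per source register."""
    return to_bits(from_bits(f1) * from_bits(f2) + from_bits(f3))


total = from_bits(s_fma(to_bits(1.5), to_bits(2.0), to_bits(0.5)))
```

Each register image is the same 64 bits wide whether it holds one unpacked operand or two packed ones; only the interpretation differs.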
The register files that provide source operands to and receive results from execution units consume significant amounts of a processor's die area. Available die area is a scarce resource on most processor chips. For this reason, processors typically include one register file for each major data type. For example, a processor typically has one floating point register file that stores both packed and unpacked floating point operands. Consequently, packed and unpacked operands are designed to fit in the same sized register entries, despite the fact that a packed operand includes two or more component operands.
Providing execution resources for packed and unpacked operands creates performance/cost challenges. One way to provide high performance scalar and vector processing is to include separate scalar and vector execution units. An advantage of this approach is that the vector and scalar execution units can each be optimized to process data in its corresponding format, i.e. packed and unpacked, respectively. The problem with this approach is that the additional execution units consume silicon die area, which is a relatively precious commodity.
The present invention addresses these and other problems with currently available SIMD systems.
SUMMARY OF THE INVENTION
A system is provided that supports processing of component operands from a packed operand on a scalar execution resource, without significantly reducing the performance of the scalar execution resource on unpacked operands.
In accordance with the present invention, a system includes an operand delivery module and a scalar execution unit. The operand delivery module identifies a packed operand and converts a component operand of the packed operand for processing by the scalar execution unit.
For one embodiment of the invention, the operand delivery system may provide component operands to multiple scalar execution units or to a combination of scalar and vector execution units to implement SIMD operations. For example, a scalar FMAC may operate in conjunction with a vector FMAC, designed to process component operands from a packed operand, to implement a vector FMAC instruction (V-FMA).
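The division of labour summarized above might be modelled roughly as follows. This is a sketch under several assumptions that this text does not confirm: a packed operand is identified by an explicit `packed` flag on the register entry, the high component is the one converted for the scalar unit, and the low component goes to a one-lane stand-in for the vector FMAC. All class and function names are hypothetical.

```python
import struct


class RegisterEntry:
    """A 64-bit register image plus a hypothetical 'packed' tag."""

    def __init__(self, bits, packed):
        self.bits = bits
        self.packed = packed


def pack2(lo, hi):
    """Pack two single-precision values into 64 bits (lo in the low half)."""
    return int.from_bytes(struct.pack("<2f", lo, hi), "little")


def unpack2(bits):
    """Split 64 bits into two single-precision component operands."""
    return struct.unpack("<2f", bits.to_bytes(8, "little"))


def deliver(entry):
    """Return (operand for the scalar unit, operand for the vector unit)."""
    raw = entry.bits.to_bytes(8, "little")
    if not entry.packed:
        # Unpacked operand: passed to the scalar unit unchanged.
        return struct.unpack("<d", raw)[0], None
    lo, hi = struct.unpack("<2f", raw)
    # struct.unpack widens each 32-bit component to a Python float (a
    # double), which stands in for conversion to the scalar unit's format.
    return hi, lo


def fmac(a, b, c):
    """Shared arithmetic for both modelled units: a*b + c."""
    return a * b + c


def v_fma(f1, f2, f3):
    """Vector FMA built from one scalar-unit and one vector-unit operation."""
    (s1, v1), (s2, v2), (s3, v3) = deliver(f1), deliver(f2), deliver(f3)
    hi = fmac(s1, s2, s3)  # component handled by the scalar FMAC
    lo = fmac(v1, v2, v3)  # component handled by the vector FMAC lane
    return RegisterEntry(pack2(lo, hi), packed=True)


out = v_fma(RegisterEntry(pack2(1.5, 2.0), True),
            RegisterEntry(pack2(2.0, 3.0), True),
            RegisterEntry(pack2(0.5, 1.0), True))
```

In this model the scalar path handles unpacked operands without any extra conversion step, which is in the spirit of the stated goal of not significantly reducing scalar performance on unpacked operands.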