Method and apparatus for multiplying and accumulating...

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Reexamination Certificate

Rate now

[ 0.00 ] – not rated yet Voters 0 Comments 0

Details Method and apparatus for multiplying and accumulating... Method and apparatus for multiplying and accumulating...

: 2002-08-02
: 2004-11-23
: Mai, Tan V. (Department: 2124)
: Electrical computers: arithmetic processing and calculating
: Electrical digital calculating computer
: Particular function performed

: Reexamination Certificate
: active
: 06823353
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to the field of computer systems. More specifically, the invention relates to operations on complex numbers.
2. Background Information
Many devices in use today (e.g., modems, radar, TV, telephone, etc.) transmit data using in phase and out of phase signals (e.g., orthogonal signals). This data is typically processed using complex numbers (e.g., the real number is used for the in phase signal, while the imaginary number is used for the out of phase signal). The multiplication of two complex number (e.g., r
1
i
1
and r
2
i
2
) is performed according to Equation 1 shown below:
Real Component=
r
1
·r
2
−i
1
·i
2
Imaginary Component=
r
1
·i
2
+r
2
·i
1
Equation 1
The multiplication of complex numbers is required in operations such as, the multiply-accumulate operation (see Equation 2 below). In Equation 2, a(n) and b(n) represent the n
th
complex numbers in two series of complex numbers:
y
(
n
)=
y
(
n
−1)+
a
(
n
)*
b
(
n
) Equation 2
Digital discrete time filters, such as a FIR filter and an IIR filter, require many multiply-accumulate operations. A FIR filter is an operation which is used in applications, such as real time digital signal processing applications (e.g., complex demodulation and equalization found in high speed data modems; ghost canceling in terrestrial broadcasting), for recovery of the transmitted information from the signal. The equation for the FIR filter is shown below as Equation 3:
y
⁡
(
k
)
=
∑
n
=
0
L
-
1
⁢
c
⁡
(
n
)
*
x
⁡
(
k
-
n
)
Equation
⁢

⁢
3
With reference to Equation 3, the complex variable y(k) represents the current output sample of the filter, the input value c(n) represents the n
th
filter coefficient of the filter, the constant L is the number of coefficients in c(n), and the input value x(k−n) represents the n
th
past value of the input sequence (also termed as “samples”). The output of the filter is a weighted average of the past L complex samples. Typically, there are more samples than there are coefficients. For the computation of the k
th
output sample y(k), the first complex coefficient corresponds to the k
th
sample, the second corresponds to the (k−1)
th
sample, and so on. Each complex coefficient is multiplied by the sample to which it corresponds, and these products are accumulated to generate the k
th
output sample of the filter. For the computation of the (k+1)
th
output sample y(k+1), the first complex coefficient corresponds to the (k+1)
th
sample, the second complex coefficient corresponds to the k
th
sample, and so on. Each complex coefficient is multiplied by the sample to which it corresponds, and these products are accumulated to generate the (k+1)
th
output of the filter. Thus, the correspondence between the samples and the complex coefficients is slide up one for each successive output sample. As a result, FIR filters are typically coded using an outer and an inner loop. The outer loop steps through the successive outputs (the different corresponding relationships between the samples and complex coefficients), while the inner loop steps through the complex coefficients and current corresponding samples to perform the multiply-accumulate.
When a FIR filter is first begun, there are insufficient samples to compute the entire length (L) of the filter (i.e., index k−n into the input samples x() is negative). In such situations, the missing samples are typically substituted with zero, the first sample, or some other relevant input.
The equation for the IIR filter is shown below as Equation 4:
y
⁡
(
k
)
=
∑
n
=
0
L
-
1
⁢
c
⁡
(
n
)
*
x
⁡
(
k
-
n
)
+
∑
i
=
0
M
-
1
⁢
d
⁡
(
i
)
*
y
⁡
(
k
-
i
)
Equation
⁢

⁢
4
With reference to Equation 4, the input value d(i) represents the i
th
filter coefficient of the filter, and the constant M is the number of coefficients in d(i).
One prior art technique for supporting multiply-accumulate operations is to couple a separate digital signaling processor (DSP) to an existing general purpose processor (e.g., The Intel® 486 manufactured by Intel Corporation of Santa Clara, Calif. The general purpose processor allocates jobs to the DSP.
One such prior art DSP is the TMS320C2x DSP manufactured by Texas Instruments, Inc. of Dallas, Tex. A prior art method for performing a complex multiply-accumulate operation on this DSP is to perform the multiply and add operations to generate the real component and add that real component to an accumulation value representing the accumulated real component, and then perform the multiply and add operations to generate the imaginary component and add that imaginary component to an accumulation value representing the accumulated imaginary component. A pseudo code representation of the inner loop of the FIR filter is shown below in Table 1.
TABLE 1
ZAC
;ACC <= 0, other setup code to initialize pointers
YRSTART
;Loop label
LT
*x++
;T <= x.i(n)
MPY
*c++
;P <= T* c.i(n)
LT
*x++
;T <= x.r(n)
MPYS
*c++
;ACC <= ACC − P,P <= T* c.r(n)
APAC
lc−−
;ACC <= ACC + P, decrement loop counter register
BANZ YRSTART
;Jump back to beginning of loop if lc is not zero
SA
*y++
;Store y.r
ZAC
;ACC <= 0, reset the pointers here.
YISTART
;
LT
*x++
;T <= x.i(n)
MPY
*c++
;P <= T* c.r(n)
LT
*x++
;T <= x.r(n)
MPYA
*c++
;ACC <= ACC + P,P <= T*c.i(n)
APAC
lc−−
;ACC <= ACC + P
BANZ YISTART
SA
*y
One limitation of the TMS320C2x DSP is its limited efficiency when performing complex number multiplication and FIR filters. As illustrated by the above pseudo code, the algorithm is basically serial in nature. Thus, it requires approximately 10 instructions to accumulate the result of multiplying together two complex numbers.
Multimedia applications (e.g., applications targeted at computer supported cooperation (CSC—the integration of teleconferencing with mixed media data manipulation), 2D/3D graphics, image processing, video compression/decompression, recognition algorithms and audio manipulation) require the manipulation of large amounts of data which may be represented in a small number of bits. For example, graphical data typically requires 16 bits and sound data typically requires 8 bits. Each of these multimedia application requires one or more algorithms, each requiring a number of operations. For example, an algorithm may require an add, compare and shift operations.
To improve efficiency of multimedia applications (as well as other applications that have the same characteristics), prior art processors provide packed data formats. A packed data format is one in which the bits typically used to represent a single value are broken into a number of fixed sized data elements, each of which represents a separate value. For example, a 64-bit register may be broken into two 32-bit elements, each of which represents a separate 32-bit value. In addition, these prior art processors provide instructions for separately manipulating each element in these packed data types in parallel. For example, a packed add instruction adds together corresponding data elements from a first packed data item and a second packed data item. Thus, if a multimedia algorithm requires a loop containing five operations that must be performed on a large number of data elements, it is desirable to pack the data and perform these operations in parallel using packed data instructions. In this manner, these processors can more efficiently process multimedia applications.
However, if the loop of operations contains an operation that cannot be performed by the processor on packed data (i.e., the processor lacks the appropriate instruction), the data will have to be unpacked to perform the operation. For example,

Affiliated with

Dulong Carole

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Fischer Stephen A.

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Kowashi Eiichi

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Mennemeier Larry M.

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Peleg Alexander D.

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Also associated with

Blakely , Sokoloff, Taylor & Zafman LLP

Law Firm

[ 0.00 ] – not rated yet Voters 0 Comments 0

Intel Corporation

Corporate Assignee

[ 0.00 ] – not rated yet Voters 0 Comments 0

Mai Tan V.

Examiner

[ 0.00 ] – not rated yet Voters 0 Comments 0

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and apparatus for multiplying and accumulating... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method and apparatus for multiplying and accumulating..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for multiplying and accumulating... will most certainly appreciate the feedback.

Rate now

Comments { 0 }

Profile ID: LFUS-PAI-O-3362817

All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.

Canada

Charities
Companies
MP Candidates
Patents
Employee Salary Disclosure

World

Places of the World
Scientific Papers

United States

Banks
Companies
Counties
Patents
Employee Salary Disclosure