Parallel multiprocessing for the fast fourier transform with...

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Reexamination Certificate

Rate now

[ 0.00 ] – not rated yet Voters 0 Comments 0

Details Parallel multiprocessing for the fast fourier transform with... Parallel multiprocessing for the fast fourier transform with...

: 2001-03-10
: 2004-09-14
: Mai, Tan V. (Department: 2124)
: Electrical computers: arithmetic processing and calculating
: Electrical digital calculating computer
: Particular function performed

: C708S406000, C708S409000
: Reexamination Certificate
: active
: 06792441
: ABSTRACT:

FIELD OF INVENTION
The present invention relates to the field of Fast Fourier Transform analysis. In particular, the present invention relates to a parallel processing architecture adapted for use in a pipelined Fast Fourier Transform method and apparatus.
BACKGROUND OF THE INVENTION
Physical parameters such as light, sound, temperature, velocity and the like are converted to electrical signals by sensors. An electrical signal may be represented in the time domain as a variable that changes with time. Alternatively, a signal may be represented in the frequency domain as energy at specific frequencies. In the time domain, a sampled data digital signal is a series of data points corresponding to the original physical parameter. In the frequency domain, a sampled data digital signal is represented in the form of a plurality of discrete frequency components such as sine waves. A sampled data signal is transformed from the time domain to the frequency domain by the use of the Discrete Fourier Transform (DFT). Conversely, a sampled data signal is transformed back from the frequency domain into the time domain by the use of the Inverse Discrete Fourier Transform (IDFT).
The Discrete Fourier Transform is a fundamental digital signal-processing transformation used in many applications. Frequency analysis provides spectral information about signals that are further examined or used in further processing. The DFT and IDFT permit a signal to be processed in the frequency domain. For example, frequency domain processing allows for the efficient computation of the convolution integral useful in linear filtering and for signal correlation analysis. Since the direct computation of the DFT requires a large number of arithmetic operations, the direct computation of the DFT is typically not used in real time applications.
Over the past few decades, a group of algorithms collectively known as Fast Fourier Transform (FFT) have found use in diverse applications, such as digital filtering, audio processing and spectral analysis for speech recognition. The FFT reduces the computational burden so that it may be used for real-time signal processing. In addition, the fields of applications for FFT analysis are continually expanding.
Computational Burden
Computation burden is a measure of the number of calculations required by an algorithm. The DFT process starts with a number of input data points and computes a number of output data points. For example, an 8-point DFT may have an 8-point output. See FIG.
1
A. The DFT function is a sum of products, i.e., multiplications to form product terms followed by the addition of product terms to accumulate a sum of products (multiply-accumulate, or MAC operations). See equation (1) below. The direct computation of the DFT requires a large number of such multiply-accumulate mathematical operations, especially as the number of input points is made larger. Multiplications by the twiddle factors w
N
r
dominate the arithmetic workload.
To reduce the computational burden imposed by the computationally intensive DFT, previous researchers developed the Fast Fourier Transform (FFT) algorithms in which the number of required mathematical operations is reduced. In one class of FFT methods, the computational burden is reduced based on the divide-and-conquer approach. The principle of the divide-and-conquer approach method is that a large problem is divided into smaller sub-problems that are easier to solve. In the FFT case, the division into sub-problems means that the input data are divided in subsets for which the DFT is computed to form partial DFTs. Then the DFT of the initial data is reconstructed from the partial DFTs. See N. W. Cooley and J. W. Tukey, “An algorithm for machine calculation of complex Fourier series”, Math. Comput., Vol. 19 pp. 297-301, April 1965. There are two approaches to dividing (also called decimating) the larger calculation task into smaller calculation sub-tasks: decimation in frequency (DIF) and decimation in time (DIT).
Butterfly Implementation of the DFT
For example, an 8-point DFT can be divided into four 2-point partial DFTs. The basic 2-point partial DFT is calculated in a computational element called a radix-2 DIT butterfly (or butterfly-computing element) as represented in
FIG. 2A
1. Similarly to the DIT butterfly-computing element,
FIG. 2A
2 shows the function of a radix-2 DIF butterfly. A radix-2 butterfly has 2 inputs and 2 outputs, and computes a 2-point DFT.
FIG. 2B
shows an FFT using 12 radix-2 butterflies to compute an 8-point DFT. Butterfly-computing elements are arranged in stages. There are three stages
1302
,
1304
and
1306
of butterfly calculation. Data, x
n
is fed to the input of the butterfly-computing elements in the first stage
1302
. After the first stage
1302
of butterfly-computation is complete, the result is fed to the in input of the next stage(s) of butterfly-computing element(s) and so on.
In particular, four radix-2 butterflies operate in parallel in the first stage
1302
to compute 8 partial DFTs. The 8 outputs of the first stage
1302
are combined in 2 additional stages
1304
,
1306
to form a complete 8-point DFT output, X
n
. Specifically, the second stage
1304
of 4 radix-2 butterflies and the third stage
1306
of 4 radix-2 butterflies comprise a two stage combination phase in which 8 radix-2 butterflies responsive to 8 partial DFTs form the final 8-point DFT function, X
n
.
FIG. 2C
shows an FFT using 32 radix-2 butterflies to compute a 16-point DFT. There are 4 stages of butterfly calculation. Eight radix-2 butterflies operate in parallel in the first stage
1402
where 2-point partial DFTs are calculated. The outputs of the first stage are combined in 3 additional combination stages
1403
,
1404
and
1406
to form a complete 16-point DFT output. The output of the second stage
1403
of 8 radix-2 butterflies is coupled to a third stage
1404
of 8 radix-2 butterflies. The output of the third stage
1404
of 8 radix-2 butterflies is coupled to a fourth stage
1406
of 8 radix-2 butterflies, the output of which the final 16-point DFT function. The combination phases
1403
,
1404
,
1406
comprise a combination phase in which 24 radix-2 butterflies responsive to 16 partial DFTs (from the first phase
1402
) form the final 16 point DFT function, X
n
.
Higher order butterflies may be used. See
FIG. 2D
, which uses 8 radix-4 butterflies in 2 stages
1502
,
1502
to compute a 16-point DFT. In general, a radix-r butterfly is a computing element that has r input points and calculates a partial DFT of r output points. In
FIG. 2D
, four radix-4 butterflies compute 16 partial DFTs in a first stage
1502
. The combination phase
1504
comprises four radix-4 butterflies responsive to 16 partial DFTs (from the first phase
1502
) to form the final 16 point DFT function, X
n
.
Communication Burden
A computational problem involving a large number of calculations may be performed one calculation at a time by using a single computing element. While such a solution uses a minimum of hardware, the time required to complete the calculation may be excessive. To speed up the calculation, a number of computing elements may be used in parallel to perform all or some of the calculations simultaneously. A massively parallel computation will tend to require an excessively large number of parallel-computing elements. Even so, parallel computation is limited by the communication burden. For example, a large number of data and constants may have to be retrieved from memory over a finite capacity data bus. In addition, intermediate results in one parallel-computing element may have to be communicated to another parallel-computing element. The communication burden of an algorithm is a measure of the amount of data that must be moved, and the number of calculations that must be performed in sequence (i.e., that cannot be performed in parallel).
In particular, in a butterfly implementation of the DFT, some of the butterfly calculations cannot be performed simultaneously, i.e., in parallel. Subsequent stages of butter

Affiliated with

Jaber Marwan A

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Also associated with

Jaber Associates LLC

Corporate Assignee

[ 0.00 ] – not rated yet Voters 0 Comments 0

Jacobson Allan

Attorney

[ 0.00 ] – not rated yet Voters 0 Comments 0

Mai Tan V.

Examiner

[ 0.00 ] – not rated yet Voters 0 Comments 0

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Parallel multiprocessing for the fast fourier transform with... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Parallel multiprocessing for the fast fourier transform with..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Parallel multiprocessing for the fast fourier transform with... will most certainly appreciate the feedback.

Rate now

Comments { 0 }

Profile ID: LFUS-PAI-O-3237672

All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.

Canada

Charities
Companies
MP Candidates
Patents
Employee Salary Disclosure

World

Places of the World
Scientific Papers

United States

Banks
Companies
Counties
Patents
Employee Salary Disclosure