Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Reexamination Certificate
1999-05-26
2002-06-25
Malzahn, David H. (Department: 2121)
Electrical computers: arithmetic processing and calculating
Electrical digital calculating computer
Particular function performed
C708S498000
Reexamination Certificate
active
06411978
ABSTRACT:
FIELD AND BACKGROUND OF THE INVENTION
The present invention relates to digital signal processing apparatus and methods, and, more particularly, to apparatus and a method for block floating point Fast Fourier Transform.
Fast Fourier Transform (FFT) is an efficient algorithm for transforming discrete time signals from a time domain representation to a frequency domain representation, and vice versa. The term “Fast Fourier Transform” is actually a generic term for an entire set of efficient Discrete Fourier Transform (DFT) algorithms. The principle upon which these algorithms are based is that a DFT can be recursively decomposed into smaller DFT's. The most popular decomposition method is the radix-2 Decimation In Time (DIT) FFT.
It is customary to represent a DIT FFT by a prior art flow graph made up of simple graph units known as “butterflies”, as shown in
FIG. 1. A
butterfly flow graph has lines (such as a line
102
), circles (such as a circle
104
), and arrows (such as an arrow
106
) as elements. The circles represents summation and the arrows show the direction of data flow. If an arrow has an adjacent number or expression (such as the constant −1 or the expression W
N
k
), the data value is multiplied by that number or expression. Wherever an arrow does not have an adjacent number or expression, this is considered as an implicit multiplier of 1. Thus, in
FIG. 1
, an output data value vector
108
(c, d) can be expressed in terms of an input vector
110
(a, b) of data values as follows:
c=a+W
N
k
·b
(1)
d=a−W
N
k
·b
(2)
where all the numbers are in general complex numbers. Note that each output of the butterfly is a sum of two numbers, for example c is the sum of a and W
N
k
·b, where the term W
N
kn
is known in the art as a “twiddle factor” and is defined as
W
N
kn
=e
−2&pgr;jnk/N
(3)
where j in Equation (3) represents the imaginary number. The magnitude of W
N
k
is unity. An important characteristic of the product F=W
N
k
·b such as in Equation (1) and Equation (2) is that the multiplication can change the magnitude of the components (real or imaginary) by a factor of 2, because the multiplication involves complex numbers. That is, the larger component of the product F can grow up to 2 times the larger component of b:
max
(
F
real
, F
imag
)≦2
max
(
b
real
, b
imag
) (4)
The main problem of the FFT is that the dynamic range of the complex output data values grows by 2. The dynamic range is the range between the maximum possible value and the minimum possible value:
abs
(
c
)≦2·
max
(
abs
(
a
),
abs
(
b
)) (5)
abs
(
d
)≦2·
max
(
abs
(
a
),
abs
(
b
)) (6)
Such growth can cause overflow when writing numbers to data memory, and it is necessary to prevent such overflows.
A flow graph of a prior art DIT FFT is shown in FIG.
2
. The FFT size in this figure is N=8. As seen in
FIG. 2
, an input
208
to the FFT flow graph is a vector x (n) of complex numbers. This vector passes through log
2
(N) stages, each of which is made of N/2 butterflies. In this example, there is a first stage
202
, a second stage
204
, and a third stage
206
, culminating in an output
210
. Each output node of each stage involves the addition of two data values, as illustrated in FIG.
1
and in Equation (1) and Equation (2). The data values involved are, in general, complex random variables with similar distributions. Although the standard deviation of each output node grows by a factor of 2 (assuming the inputs are independent), the dynamic range of the output grows by a factor of 2. When working with a fixed point digital signal processor (DSP), such an increase in the dynamic range of the numbers might cause overflows when writing them to the data memory, unless there is a mechanism for overflow protection. (An overview of the field is given in
Digital Signal Processing,
by A. V. Oppenheim and R. W. Schaffer, Prentice-Hall, in the chapter covering quantization effects in fixed-point FFT algorithms.) The dynamic range of the magnitude of the complex output grows by a factor of 2 but the dynamic range of the larger of the output's components (real or imaginary) grows by a factor of 1+2. The term “processor” herein denotes any data processing device, including, but not limited to digital signal processors.
The simplest solution for the overflow problem is to divide the FFT input vector by N. This guarantees no overflow for all stages of the algorithm, but suffers from low performance in terms of signal to quantization noise ratio (SQNR), where quantization noise refers to the effects of finite word length. Another solution is to divide (“scale down”) the output data values of each stage by 2, which also guarantees no overflow for all stages of the algorithm. This solution also has better SQNR performance than dividing the input vector by N, but still does not have enough performance for some applications, such as ADSL (Asymmetric Digital Subscriber Line), especially in 16 bit processors. A third solution, which is the best in terms of SQNR, is to adopt the block floating point technique and attain what is known as block floating point (BFP) FFT. In BFP FFT, the scaling is not done at the output of every stage but only at those stages where overflow occurs (or might occur). That is, if overflow occurs in stage k then the whole stage is recalculated such that every output is recalculated, scaled and then stored in the data memory. This improves the SQNR while preserving the dynamic range to prevent overflow.
The problem with this approach is that it is not efficient for real time implementations. The number of cycles varies from execution to execution and depends on the number of scaled stages in each execution.
The best prior art solution in the current DSP's available in the marker is to modify the decision law for scaling such that recalculation will not be required. As previously noted, in the classical BFP FFT the decision law states that if overflow occurs in the current stage, recalculate and scale before storing to the data memory. In the best prior art solution, recalculation is avoided by making a decision whether to scale down the output data value of stage k on the basis of the output data values of stage k−1. By determining in advance that an overflow might occur in a stage before actually performing the computation of that stage, the output of that stage is scaled down regardless of the result, and the time otherwise wasted in performing an unnecessary computation can be saved. This solution is used, for example, in the Motorola DSP56xxx family and 56lxx family. Processors which support this algorithm contain a scale-before-store (SC) bit and a sticky status bit. (A sticky status bit is a status bit that can be set, but not cleared, by a particular condition, and which can be cleared only by a software command, such as a program instruction, or by a hardware reset, and which therefore retains a record of a positive test for the particular condition regardless of subsequent negative tests for that condition.) The term “set” and the terms “clear” (or “cleared”) herein denote distinct preassigned logical states, without any limitation on their specific respective binary or physical representations. For example, a set bit can be represented by a binary 1 and cleared bit can be represented by a binary 0; alternatively, a set bit can be represented by a binary 0 and a cleared bit can be represented by a binary 1. Likewise, as another example, a set bit can be represented by a high voltage level and a cleared bit can be represented by a low voltage level; alternatively, a set bit can be represented by a low voltage level and a cleared bit can be represented by a high voltage level. Any consistent distinct states may be utilized to represent a “set” bit and a “cleared” bit.
The prior art decision process is illustrated conceptually in FIG.
3
. Data required for the process includes an FFT sticky statu
Granot Haim
Naveh Gil
Weingarten Eran
Friedman Mark M.
Infineon Technologies Ag I. Gr.
Malzahn David H.
LandOfFree
Mechanism for block floating point FFT hardware support on a... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Mechanism for block floating point FFT hardware support on a..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Mechanism for block floating point FFT hardware support on a... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2941950