Reexamination Certificate
2000-08-02
2003-05-20
McFadden, Susan (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S250000, C704S270000
active
06567777
ABSTRACT:
FIELD OF THE INVENTION
The present invention is directed to methods and apparatus for rapid and efficient approximation of magnitude spectra, for uses such as speech and other pattern recognition and processing, as well as more generalized two-dimensional and three-dimensional Euclidean vector (magnitude and/or phase) approximation for coordinate transforms and other applications.
BACKGROUND OF THE INVENTION
Magnitude spectra, including frequency domain intensity and power spectra, are useful for determining and processing the frequency components of a time or spatial domain signal, such as may be produced by Fourier or other complex number transformation of spatial and/or time-based data. Both the Fourier transformation and the magnitude spectrum calculation can cause a heavy computational load on a host microprocessor or digital signal processor (DSP). Fourier transformation is conventionally carried out by a special or general purpose electronic computer or optical system in rectangular form, in which time-domain data [x(t)] is transformed to the frequency domain. The Fourier transform is a complex number transform producing an array of complex numbers each defined as Re+iIm, where Re is referred to as the real part of the frequency domain, i is the imaginary number, $i = \sqrt{-1}$, and Im is referred to as the imaginary part of the frequency domain, as follows:
$$X_d(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n)\, e^{-i 2\pi k n / N}$$

or

$$X_d(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n) \left( \cos\frac{2\pi k n}{N} - i \sin\frac{2\pi k n}{N} \right)$$
Together, these two complex number parts form an array of complex-valued pairs [Re, Im], in which the complex quantities are typically represented as vectors in rectangular (Cartesian) coordinates. The rectangular form may be converted into a polar form also having a function of two parts, magnitude (M) and phase (θ), which can be represented as a rotating vector in polar coordinates. The magnitude component M is transformed from the real and imaginary parts of the rectangular form by calculating the square root of the sum of squares, such that $M = \sqrt{\mathrm{Re}^2 + \mathrm{Im}^2}$. The phase component θ is similarly transformed as the arctangent of the imaginary part, Im, divided by the real part, Re, such that $\theta = \arctan(\mathrm{Im}/\mathrm{Re})$. Magnitude calculation is also utilized in Euclidean distance determination and vector calculations, and generally in coordinate transformations of rectangular to polar, spherical and cylindrical coordinate systems. In such transformations, phase angle information is an important component of the complete coordinate transform, together with the magnitude or Euclidean distance.
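The transform and the rectangular-to-polar conversion described above may be illustrated by the following minimal Python sketch. It is an illustration only, not part of the patent disclosure: the function names `dft` and `to_polar` and the eight-sample test signal are assumptions introduced here, and `math.atan2` is substituted for a plain arctangent of Im/Re so that a zero real part and all four quadrants are handled.

```python
import math


def dft(x):
    """Directly evaluate X_d(k) = (1/N) * sum_n x(n) * exp(-i*2*pi*k*n/N)."""
    n_samples = len(x)
    spectrum = []
    for k in range(n_samples):
        acc = 0j
        for n, sample in enumerate(x):
            angle = 2.0 * math.pi * k * n / n_samples
            # cos(angle) - i*sin(angle) is the rectangular form of exp(-i*angle)
            acc += sample * complex(math.cos(angle), -math.sin(angle))
        spectrum.append(acc / n_samples)
    return spectrum


def to_polar(value):
    """Convert Re + i*Im into (M, theta): two squarings, an add, a root, an arctangent."""
    re, im = value.real, value.imag
    magnitude = math.sqrt(re * re + im * im)
    # Assumption: atan2 replaces atan(im / re) to avoid division by zero.
    phase = math.atan2(im, re)
    return magnitude, phase


# Hypothetical input: one cycle of a cosine over 8 samples.
signal = [math.cos(2.0 * math.pi * n / 8) for n in range(8)]
polar = [to_polar(c) for c in dft(signal)]
```

With the 1/N scaling used in the equations above, the energy of this test signal appears in bins 1 and 7, each with a magnitude of about one half.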
Direct floating/fixed point implementation of the magnitude spectrum calculation is conventionally carried out in appropriately programmed general or special purpose computer systems. In such direct calculations, the real and imaginary parts of each spectral component may be squared and then summed, and finally “square rooted”. However, direct calculation of the magnitude spectra is relatively slow, and computationally intensive, for applications and systems requiring large data throughputs such as those for speech recognition and image analysis or generation. For example, in a typical general purpose computer chip such as a Pentium II®, a squaring operation may take 3 times as long as any arithmetic or logical operation. On a Motorola M*Core™ RISC microprocessor, 2 bits of a multiply are resolved per clock cycle, such that conventional 16 bit precision multiplication would typically be performed over 8 clock cycles, while an arithmetic or logical operation executes in a single clock cycle (“arithmetic” as used herein means primitive register functions add/subtract/increment/decrement or the like and “logical” means primitive logic operations not/or/and/shift_left/shift_right/xor or the like). Numerical squaring operations in general purpose or dedicated computer systems are computationally more time-consuming and require significantly higher hardware capability and capacity than addition operations. Squaring of data can produce relatively large numbers, requiring increased hardware system precision, and will require registers larger than would be needed for primitive addition or subtraction operations. Conventional square root extraction is significantly more computationally intensive for computer apparatus than multiplication (squaring) of numbers.
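The register growth and multiply latency noted above can be checked with a short Python sketch. This is an illustration under stated assumptions rather than anything taken from the patent: the helper names `direct_magnitude` and `multiply_cycles` are hypothetical, operands are taken to be unsigned 16 bit integers, and the integer square root `math.isqrt` stands in for whatever root routine a given processor would use.

```python
import math


def direct_magnitude(re, im):
    """Direct integer magnitude: square both parts, sum, then take the root."""
    squared_sum = re * re + im * im  # two squarings and one addition
    return math.isqrt(squared_sum)   # floor of the exact square root


def multiply_cycles(operand_bits, bits_per_cycle=2):
    """Clock cycles for a multiplier that resolves a fixed number of bits per cycle."""
    return -(-operand_bits // bits_per_cycle)  # ceiling division


# Assumption: unsigned 16 bit operands at their largest value.
MAX_U16 = (1 << 16) - 1
add_bits = (MAX_U16 + MAX_U16).bit_length()              # width of a plain sum
square_bits = (MAX_U16 * MAX_U16).bit_length()           # width of one square
sum_of_squares_bits = (2 * MAX_U16 * MAX_U16).bit_length()  # width of Re^2 + Im^2
```

Under these assumptions a plain addition needs 17 bits, a single square needs 32 bits and the sum of two squares needs 33 bits, while `multiply_cycles(16)` reproduces the 8 clock cycles stated above for a multiplier resolving 2 bits per cycle.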
Although a variety of computational methods may be used, square root determination using a conventional computer chip such as Pentium III® or the M*CORE® microprocessors may take 15 times as long as an arithmetic operation. In order to speed up the square root operation, direct computation may be replaced by table operation, in which a pre-calculated, tabulated, fixed-point data array is used to generate output. In this case a table of square roots is first generated for the anticipated range of the expression ($\mathrm{Re}^2 + \mathrm{Im}^2$). Then the Re, Im number pair is used to address the values in the table, to select a precalculated result for that number pair. However, in order to provide significant accuracy for a range of Re, Im values, the table should be relatively large, which requires significant system memory. A 1024×1024 table to produce 8 bit output precision may typically require about 1 MB of system memory, while 16 bit precision over the same table requires 2 MB of memory. However, if finer resolution is needed in the table, such as a 12 bit by 12 bit lookup, which is equivalent to a 4096 by 4096 table, 8 bit output precision requires 16 MB of memory, and 16 bit output precision requires about 32 MB of memory. A power series approximation, such as a Taylor expansion requiring numerous multiplications, may also be used to calculate magnitude spectra, but this is also computationally more time-consuming and hardware-intensive than simple addition and subtraction operations.
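The table approach and its memory cost may be sketched in Python as follows. The sketch is illustrative only: the names `table_bytes`, `build_magnitude_table` and `lookup_magnitude` are hypothetical, the demonstration table is deliberately small (6 bit by 6 bit), and indexing by absolute value is an assumption made here to exploit the symmetry of the magnitude in Re and Im.

```python
import math


def table_bytes(index_bits, output_bits):
    """Memory for a table addressed by an (Re, Im) pair of index_bits each."""
    entries = (1 << index_bits) ** 2
    return entries * output_bits // 8


def build_magnitude_table(index_bits):
    """Precompute floor(sqrt(Re^2 + Im^2)) for every non-negative (Re, Im) pair."""
    size = 1 << index_bits
    return [[math.isqrt(re * re + im * im) for im in range(size)]
            for re in range(size)]


# Assumption: a 64 x 64 demonstration table; a practical table would be far larger.
TABLE = build_magnitude_table(6)


def lookup_magnitude(re, im, table=TABLE):
    """Replace the run-time squarings and root with a single table access."""
    # Assumption: the sign is dropped because the magnitude depends only on |Re| and |Im|.
    return table[abs(re)][abs(im)]


MIB = 1 << 20
sizes_mib = {
    (10, 8): table_bytes(10, 8) // MIB,
    (10, 16): table_bytes(10, 16) // MIB,
    (12, 8): table_bytes(12, 8) // MIB,
    (12, 16): table_bytes(12, 16) // MIB,
}
```

Evaluating `sizes_mib` gives 1, 2, 16 and 32 megabytes respectively, matching the figures stated above and showing that each additional index bit quadruples the storage.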
Methods and apparatus for effectively carrying out speech recognition and image processing or generation in “real time” typically require processing of large amounts of voice or image data, typically including magnitude transform determination. For example, U.S. Pat. No. 5,960,394 to Gould, et al. (Dragon Systems, Inc.), U.S. Pat. No. 5,749,066 to Nussbaum (Ericsson Messaging Systems Inc.), U.S. Pat. No. 5,890,103 to Carus (Lernout & Hauspie Speech Products N.V.), U.S. Pat. No. 5,640,485 to Ranta (Nokia Mobile Phones, Ltd.), U.S. Pat. No. 4,956,865 to Lennig, et al. (Northern Telecom Limited), U.S. Pat. No. 4,283,601 to Nakajima, et al. (Hitachi, Ltd.), U.S. Pat. No. 5,583,961 to Pawlewski, et al. (British Telecommunications), U.S. Pat. No. 5,465,318 to Sejnoha (Kurzweil Applied Intelligence, Inc.), U.S. Pat. No. 5,054,074 to Bakis (IBM) and the references cited therein (which are incorporated by reference herein), describe a wide variety of known speech recognition and speech processing systems which illustrate the computational intensity of such systems.
Improved computational equipment and methods would be desirable, particularly for low cost and portable systems which efficiently and effectively carry out magnitude transforms for such applications. Accordingly, particularly for portable, low power systems such as cellular telephones, and other hand held or portable devices and appliances which utilize voice and/or image processing, there is a need for efficient new, relatively inexpensive systems for approximating the magnitude of complex numbers.
It is an object of the present invention to provide methods and computational apparatus which can efficiently and effectively approximate magnitude spectra for voice recognition and similar uses.
Separate and alternative objects to provide methods and equipment for rapidly and efficiently approximating phase of complex numerical data, for transforming data from cartesian (rectilinear) coordinate system repr
McFadden Susan
Motorola Inc.
Wantanabe Hisashi D.
Efficient magnitude spectrum approximation