Reexamination Certificate
1999-10-13
2002-05-14
Šmits, Tālivaldis Ivars (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S251000
active
06389389
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to speech recognition systems and, in particular, to vector representation of speech parameters in signal processing for speech recognition.
BACKGROUND OF THE INVENTION
As a user talks in a speech recognition system, his speech waveform is captured and analyzed. During what is commonly referred to as “front-end” processing, acoustic features of the speech signal are extracted using a variety of signal processing techniques. These features provide a representation of the speech in a more compact format. Such features include (but are not limited to) filterbank channel outputs, linear predictive coding (LPC) coefficients, real cepstrum coefficients, and a variety of pitch and energy measures. These features can be transmitted or passed to a pattern recognition or matching system, commonly called the “back-end,” that compares the incoming acoustic features to speech templates and attempts to postulate what acoustic events (words, phones, etc.) have been spoken.
To save memory or communication channel bandwidth in the “front-end,” the acoustic features may also undergo a quantization step. As will be understood by those skilled in the art, the features represent a time slice of the speech waveform. During vector quantization, a single table or multiple tables of representative feature vectors are searched for the closest match to the current feature vector. When the closest match is found according to a defined distortion measure, the index of the closest match in the table is employed to represent the feature. Certain designs that employ a combination of speech features perform this lookup individually on each speech feature. Various other designs combine the parameters for all the features into one large vector and perform the lookup only once.
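The table lookup described above can be sketched in a few lines. This is a minimal illustration only: the squared-error distortion measure, the tiny two-dimensional codebook, and the names `squared_error`, `quantize` and `CODEBOOK` are all assumptions made for the example, not details taken from the patent.

```python
from typing import Sequence


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distortion measure: sum of squared element differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(feature: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to `feature`."""
    return min(range(len(codebook)),
               key=lambda i: squared_error(feature, codebook[i]))


# Hypothetical table of representative feature vectors.
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [4.0, -2.0]]

# Only this index needs to be stored or transmitted for the frame.
index = quantize([0.9, 1.2], CODEBOOK)
```

The receiver holds the same table, so the index alone is enough to recover the representative vector `CODEBOOK[index]`.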
Prior art methods have been proposed for quantizing front-end parameters in speech recognition. As mentioned above, a set of features, such as the cepstrum or the LPC coefficients, is typically quantized as a set in a single vector. If multiple types of features are present, each type of feature is vector quantized as a separate set. When a scalar parameter is used, such as frame energy, the value is quantized with a scalar quantizer. Likewise, multiple scalar values are quantized with multiple scalar quantizers.
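For the scalar case, one simple possibility is a uniform quantizer over a fixed range. The sketch below assumes such a uniform design; the range, the bit count, and the function names `scalar_quantize` and `scalar_dequantize` are hypothetical and are not specified in the text.

```python
def scalar_quantize(value: float, lo: float, hi: float, bits: int) -> int:
    """Map `value` to one of 2**bits equal-width cells covering [lo, hi]."""
    levels = 1 << bits
    clamped = min(max(value, lo), hi)      # out-of-range values saturate
    step = (hi - lo) / levels
    return min(int((clamped - lo) / step), levels - 1)


def scalar_dequantize(index: int, lo: float, hi: float, bits: int) -> float:
    """Reconstruct the midpoint of the cell identified by `index`."""
    step = (hi - lo) / (1 << bits)
    return lo + (index + 0.5) * step


# Hypothetical frame energy in dB, quantized with 4 bits over an assumed 0-80 dB range.
energy_index = scalar_quantize(42.0, 0.0, 80.0, 4)
energy_approx = scalar_dequantize(energy_index, 0.0, 80.0, 4)
```

Each additional scalar parameter quantized this way needs its own index, which is one source of the storage and bandwidth cost discussed next.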
Such previous techniques have shortcomings. For example, in cases where coefficients are correlated, previous implementations are wasteful of memory needed to store the quantization tables, wasteful of computations to perform the table lookups, and wasteful of memory/bandwidth necessary for storage/transmission of the codebook indices. As another example, one element in a vector previously could dominate a distortion measure used during quantization, due to differences in magnitude or statistical variance.
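The dominance problem can be seen in a small numeric sketch. The codebook, the feature values, the variances, and the inverse-variance weighting shown here are all assumptions chosen for illustration; the weighting is one common remedy and is not presented as the method claimed in the patent.

```python
from typing import Sequence


def weighted_error(a: Sequence[float], b: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Squared error with a separate weight per vector element."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def nearest(feature: Sequence[float], codebook: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Index of the codebook entry with the smallest weighted error."""
    return min(range(len(codebook)),
               key=lambda i: weighted_error(feature, codebook[i], weights))


# Hypothetical vectors: element 0 is large in magnitude (e.g. an energy term),
# element 1 is small (e.g. a cepstral coefficient).
codebook = [[50.0, 0.1], [53.0, 0.9]]
feature = [52.0, 0.15]

# Equal weights: the large-magnitude element decides the match.
unweighted = nearest(feature, codebook, [1.0, 1.0])

# Assumed per-element variances; weighting by their inverse rescales each term.
variances = [100.0, 0.1]
weighted = nearest(feature, codebook, [1.0 / v for v in variances])
```

With equal weights the lookup selects entry 1 even though the second element is a poor match; after rescaling, entry 0 is selected.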
REFERENCES:
patent: 5544277 (1996-08-01), Bakis et al.
patent: 5751903 (1998-05-01), Swaminathan et al.
patent: 5797119 (1998-08-01), Ozawa
patent: 5926785 (1999-07-01), Akamine et al.
patent: 5956683 (1999-09-01), Jacobs et al.
patent: 6067515 (2000-05-01), Cong et al.
patent: 6070136 (2000-05-01), Cong et al.
patent: 6131084 (2000-10-01), Hardwick
patent: 6161089 (2000-12-01), Hardwick
patent: 6199037 (2001-03-01), Hardwick
patent: 6219642 (2001-04-01), Asghar et al.
Law et al., “A Novel Split Residual Vector Quantization Scheme For Low Bit Rate Speech Coding”. IEEE, 1994, pp. 493-496.
Kushner William M.
Meunier Jeffrey A.
Pearce David John
Motorola Inc.
Nichols Daniel K.
Šmits Tālivaldis Ivars