Pitch determination using speech classification and prior...

Reexamination Certificate

: 1998-09-18
: 2003-01-14
: Tsang, Fan (Department: 2645)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: For storage or transmission

: C704S219000
: Reexamination Certificate
: active
: 06507814
: ABSTRACT:

BACKGROUND
1. Technical Field
The present invention relates generally to speech encoding and decoding in voice communication systems; and, more particularly, it relates to various techniques used with code-excited linear prediction coding to obtain high quality speech reproduction through a limited bit rate communication channel.
2. Related Art
Signal modeling and parameter estimation play significant roles in communicating voice information under limited bandwidth constraints. To model basic speech sounds, speech signals are sampled as a discrete waveform to be digitally processed. In one type of signal coding technique called LPC (linear predictive coding), the signal value at any particular time index is modeled as a linear function of previous values. A subsequent sample is thus linearly predictable from earlier values. As a result, efficient signal representations can be determined by estimating and applying certain prediction parameters to represent the signal.
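As a rough illustration of the linear prediction idea described above, the following Python sketch predicts each sample as a weighted sum of the samples that precede it. The function names, the example signal, and the second-order coefficients are assumptions chosen for illustration; they are not taken from the patent.

```python
def lpc_predict(signal, coeffs):
    """Predict each sample as a weighted sum of the preceding len(coeffs) samples."""
    order = len(coeffs)
    predicted = []
    for n in range(order, len(signal)):
        # coeffs[0] weights the most recent sample, coeffs[1] the one before it, etc.
        predicted.append(sum(coeffs[k] * signal[n - 1 - k] for k in range(order)))
    return predicted


def prediction_residual(signal, coeffs):
    """Difference between the actual samples and their linear predictions."""
    order = len(coeffs)
    return [s - p for s, p in zip(signal[order:], lpc_predict(signal, coeffs))]


# Hypothetical signal that obeys s[n] = s[n-1] - s[n-2] exactly.
signal = [1, 1, 0, -1, -1, 0, 1, 1, 0, -1]
coeffs = [1.0, -1.0]  # assumed prediction parameters for this toy signal
residual = prediction_residual(signal, coeffs)
```

When the prediction parameters fit the signal well, the residual is small, so the coefficients plus a compact description of the residual can stand in for the waveform itself.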
Applying LPC techniques, a conventional source encoder operates on speech signals to extract modeling and parameter information for communication to a conventional source decoder via a communication channel. Once received, the decoder attempts to reconstruct a counterpart signal for playback that sounds to a human ear like the original speech.
A certain amount of communication channel bandwidth is required to communicate the modeling and parameter information to the decoder. In embodiments where, for example, the channel bandwidth is shared and real-time reconstruction is necessary, a reduction in the required bandwidth proves beneficial. However, using conventional modeling techniques, the quality requirements for the reproduced speech prevent such bandwidth from being reduced below certain levels.
With CELP type speech coders, mistakes in estimating pitch lag cause degradation in the resulting speech quality. In conventional speech coders, such mistakes often occur, for example, when the coder identifies a pitch lag value that is double or triple the actual pitch lag sought. Similarly, incorrect identification sometimes yields a pitch lag value that is less than, and even half of, the actual pitch lag sought.
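The doubling and tripling errors described above can be reproduced with a small Python sketch. It assumes a normalized correlation measure and an idealized, perfectly periodic pulse train; the function name, the period of 20 samples, and the window positions are hypothetical and are not taken from the patent.

```python
import math


def normalized_correlation(x, lag, start, length):
    """Normalized correlation between a window of x and the same window delayed by lag."""
    current = x[start:start + length]
    delayed = x[start - lag:start - lag + length]
    numerator = sum(a * b for a, b in zip(current, delayed))
    denominator = math.sqrt(sum(a * a for a in current) * sum(b * b for b in delayed))
    return numerator / denominator if denominator > 0 else 0.0


PERIOD = 20  # assumed true pitch lag, in samples
pulse = [5, 3, 1] + [0] * (PERIOD - 3)
signal = [pulse[i % PERIOD] for i in range(200)]

# Score a few candidate lags over an 80-sample window starting at sample 100.
scores = {lag: normalized_correlation(signal, lag, 100, 80)
          for lag in (10, 20, 30, 40, 60)}
```

For this toy signal the lags 20, 40, and 60 all score equally well, so a search that simply takes the highest correlation has no basis for preferring the true lag over its multiples.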
Further limitations and disadvantages of conventional systems will become apparent to one of skill in the art after reviewing the remainder of the present application with reference to the drawings.
SUMMARY OF THE INVENTION
Various aspects of the present invention can be found in a speech encoding system using an analysis by synthesis approach on a speech signal that has a previous pitch lag and a current pitch lag. The speech encoding system comprises an adaptive codebook and an encoder processing circuit. The encoder processing circuit identifies a plurality of pitch lag candidates. From these candidates, the encoder processing circuit attempts to identify the current pitch lag by selecting one of the plurality of pitch lag candidates after considering timing relationships between the previous pitch lag and at least one of the plurality of pitch lag candidates.
The encoder processing circuit may also identify integer multiple timing relationships between at least two of the plurality of pitch lag candidates. Such a timing relationship may also be used in the selection of the one of the plurality of pitch lag candidates.
The consideration of the timing relationships between the previous pitch lag and one of the pitch lag candidates may involve favoring that candidate because the favored candidate and the previous pitch lag have at least close to a same value.
In some embodiments, the aforementioned “favoring” involves application of a weighting factor to at least one of the plurality of pitch lag candidates. The pitch lag candidates may be found by applying correlation techniques, in which case the weighting factor is applied to the resulting correlation values.
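One way to picture this kind of favoring is the Python sketch below, which scales the correlation of any candidate whose lag lies close to the previous pitch lag before choosing the best score. The function name, the 10% closeness threshold, the 1.1 bonus, and the candidate values are all assumptions made for illustration; the text above does not specify them.

```python
def select_pitch_lag(candidates, previous_lag, closeness=0.1, bonus=1.1):
    """Pick a lag from (lag, correlation) pairs, favoring lags near previous_lag.

    A candidate within `closeness` (relative) of previous_lag has its
    correlation multiplied by `bonus`; both values are hypothetical.
    """
    best_lag, best_score = None, float("-inf")
    for lag, correlation in candidates:
        is_close = abs(lag - previous_lag) <= closeness * previous_lag
        score = correlation * (bonus if is_close else 1.0)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical candidates: the lag of 80 has the highest raw correlation.
candidates = [(40, 0.80), (80, 0.84), (120, 0.78)]
chosen_with_history = select_pitch_lag(candidates, previous_lag=41)
chosen_without_match = select_pitch_lag(candidates, previous_lag=200)
```

With a previous lag of 41, the candidate at 40 receives the bonus and is selected even though the candidate at 80 has the larger raw correlation. When no candidate is near the previous lag, the raw maximum is selected unchanged.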
Further aspects of the present invention can be found in a method used by a speech encoding system that applies an analysis by synthesis coding approach to a speech signal. The method employed may comprise the identification of a plurality of pitch lag candidates. The encoding system also uses an adaptive weighting factor to favor at least one of the pitch lag candidates over at least one other of the pitch lag candidates. One of the plurality of pitch lag candidates is selected as a current pitch lag estimate.
The method may further involve adjustments of the adaptive weighting factor. For example, the encoder system may adjust the adaptive weighting factor if an integer multiple timing relationship is detected between at least two of the plurality of pitch lag candidates. Similarly, adjustments may be made if a timing relationship is detected between the previous pitch lag and any one of the plurality of pitch lag candidates. Moreover, the variations and aspects of the speech encoder system described above may also apply to this method.
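The adjustments described above might be sketched as follows in Python. The sketch assumes that the weighting favors shorter lags, that candidates are visited from the longest lag to the shortest, and that each detected relationship lowers the threshold a shorter lag must beat. The function names, the base weight, the step size, and the tolerance are hypothetical choices and are not specified in the text.

```python
def is_near_multiple(larger, smaller, tolerance=2):
    """True if larger is within tolerance samples of k * smaller for an integer k >= 2."""
    k = round(larger / smaller)
    return k >= 2 and abs(larger - k * smaller) <= tolerance


def pick_lag_adaptive(candidates, previous_lag, base_weight=0.95, step=0.05, tolerance=2):
    """Select a lag from (lag, correlation) pairs with an adaptive weighting factor."""
    ordered = sorted(candidates, key=lambda c: c[0], reverse=True)
    best_lag, best_corr = ordered[0]
    for lag, corr in ordered[1:]:
        weight = base_weight
        # Adjust the weight if the current choice is near an integer multiple of this lag.
        if is_near_multiple(best_lag, lag, tolerance):
            weight -= step
        # Adjust again if this lag is close to the previous pitch lag.
        if abs(lag - previous_lag) <= tolerance:
            weight -= step
        if corr > best_corr * weight:
            best_lag, best_corr = lag, corr
    return best_lag


# Hypothetical candidates: 80 is roughly twice 40 and has the highest raw correlation.
candidates = [(40, 0.74), (80, 0.84), (57, 0.60)]
lag_with_history = pick_lag_adaptive(candidates, previous_lag=41)
lag_without_history = pick_lag_adaptive(candidates, previous_lag=100)
```

In this example the lag of 40 displaces 80 only when both relationships hold: 80 is an integer multiple of 40, and 40 is close to the previous lag of 41. With a previous lag of 100, the multiple relationship alone does not lower the threshold enough, and 80 is kept.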

Affiliated with

Gao Yang

Inventor

Also associated with

Conexant Systems Inc.

Corporate Assignee

Opsasnick Michael N.

Examiner

Tsang Fan

Examiner
