Method and apparatus for variable rate coding of speech

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S223000, C704S208000

active

06510407

ABSTRACT:

TECHNICAL FIELD OF THE INVENTION
The present invention relates generally to speech analysis and more particularly to an efficient coding scheme for compressing speech.
BACKGROUND ART
Speech coding technology has advanced tremendously in recent years. Speech coders in wire and wireless telephony standards such as G.729, G.723 and the emerging GSM AMR have demonstrated very good quality at a rate of about 8 kbps and lower. The U.S. Federal Standard coder further shows that good quality synthesized speech can be achieved at rates as low as 2.4 kbps.
While these coders meet the demand of the rapidly growing telecommunication market, consumer electronics applications still lack adequate speech coders. Typical examples include consumer items such as answering machines, dictation devices and voice organizers. In these applications, the speech coder must provide good quality reproduction in order to gain commercial acceptance, and high compression ratios in order to keep the storage requirements of the recorded material to a minimum. On the other hand, interoperability with other coders is not a requirement, since these devices are standalone units. Consequently, there is no need to adhere to a fixed bit rate scheme or to coding delay restrictions.
Therefore, a need exists for a low bit rate speech coder capable of providing high quality synthesized speech. It is desirable to exploit the loosened restrictions of standalone applications to provide a high quality, low cost coding scheme.
SUMMARY OF THE INVENTION
The speech encoding method of the present invention is based on analysis-by-synthesis and includes sampling a speech input to produce a stream of speech samples. The samples are grouped into a first set of groups (frames). Linear predictive coding (LPC) coefficients for a speech synthesis filter are computed from an analysis of the frames. The speech samples are further grouped into a second set of groups (subframes). These subframes are analyzed to produce coded speech. Each subframe is categorized as unvoiced, voiced or onset. Based on the category, a coding scheme is selected to encode the speech samples in that subframe. For unvoiced speech, a gain/shape encoding scheme is used. For onset speech, a multi-pulse modeling technique is employed. For voiced speech, a further determination is made based on the pitch frequency of the speech. For low pitch frequency voiced speech, encoding is accomplished by computing a long term predictor plus a single pulse. For high pitch frequency voiced speech, the encoding is based on a series of pulses spaced apart by a pitch period.
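The grouping and category-based selection described in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the names `Category`, `split_into_groups` and `select_scheme`, the handling of a short trailing group, and the 200 Hz boundary between "low" and "high" pitch are all assumptions made for illustration, since the summary gives no group sizes or pitch threshold. The classifier that assigns a category to a subframe is not described here either, so the category is taken as an input.

```python
from enum import Enum
from typing import List, Optional, Sequence


class Category(Enum):
    """Subframe categories named in the summary."""
    UNVOICED = "unvoiced"
    VOICED = "voiced"
    ONSET = "onset"


def split_into_groups(samples: Sequence[float], size: int) -> List[Sequence[float]]:
    """Group samples into consecutive blocks of `size` samples.

    Usable for both frames and subframes. A trailing block shorter than
    `size` is dropped (an assumption; the source does not say).
    """
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def select_scheme(category: Category,
                  pitch_hz: Optional[float] = None,
                  pitch_threshold_hz: float = 200.0) -> str:
    """Return the name of the coding scheme for one subframe.

    `pitch_threshold_hz` is a hypothetical boundary between low and
    high pitch; the source does not state a value.
    """
    if category is Category.UNVOICED:
        return "gain/shape"
    if category is Category.ONSET:
        return "multi-pulse"
    # Voiced speech: the choice depends on the pitch frequency.
    if pitch_hz is None:
        raise ValueError("voiced subframes need a pitch estimate")
    if pitch_hz < pitch_threshold_hz:
        return "long-term predictor + single pulse"
    return "pitch-spaced pulse train"
```

For example, under these assumptions a voiced subframe with an estimated pitch of 120 Hz would be routed to the long term predictor plus single pulse scheme, while one at 250 Hz would be routed to the pitch-spaced pulse scheme.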


REFERENCES:
patent: 4701954 (1987-10-01), Atal
patent: 4817157 (1989-03-01), Gerson
patent: 4910781 (1990-03-01), Ketchum et al.
patent: 5086471 (1992-02-01), Tanaka et al.
patent: 5799272 (1998-08-01), Zhu
patent: 5826221 (1998-10-01), Aoyagi
patent: 5832180 (1998-11-01), Nomura
patent: 6233550 (2001-05-01), Gersho et al.
patent: 6311154 (2001-10-01), Gersho et al.
patent: 0 751 494 (1997-01-01), None
Tian, W.S., Wong, W.C., Law, C.Y. and Tan, A.P., “Pitch Synchronous Extended Excitation In Multimode CELP”, IEEE Communications Letters, vol. 3, No. 9, Sep. 1999, pp. 275-276.
Dervaux, F., Gruet, C., and Delprat, M., “Performance And Optimization Of A GSM Half Rate Candidate”, U.S. Boston, Kluwer, Jan. 1, 1993, pp. 93-99.
Paksoy, Erdal, Srinivasan, K., and Gersho, Allen, “Variable Rate Speech Coding With Phonetic Segmentation”, Proceedings of ICASSP, Apr. 27, 1993, pp. II-155-158.
Atal, B.S., Cuperman, V., and Gersho, A. (eds.), “Advances in Speech Coding,” Wang, S. and Gersho, A., Kluwer Academic Publishers, 1991, pp. 225-234.
Wang, S., and Gersho, A., “Phonetically-Based Vector Excitation Coding of Speech at 3.6 kbps,” Dept. of Electrical and Computer Engineering, Univ. of California, Santa Barbara, May 23, 1989.
Atal, B.S., “High-Quality Speech at Low Bit Rates: Multi-Pulse and Stochastically Excited Linear Predictive Coders,” Proceedings of IEEE ICASSP 1986, pp. 1681-1684.
Schroeder, M.R. and Atal, B.S., “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates,” Proceedings IEEE ICASSP 1985, pp. 937-940.
Atal, B.S. and Remde, J.R., “A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates,” Proceedings of IEEE ICASSP 1982, pp. 614-617.
