Patent
1997-04-14
1998-10-20
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
704208, 704209, 704226, 704229, G10L 904
active
058262229
ABSTRACT:
A method of encoding speech by analyzing a digitized speech signal to determine excitation parameters for the digitized speech signal is disclosed. The method includes dividing the digitized speech signal into at least two frequency bands; determining a first preliminary excitation parameter by performing a nonlinear operation on at least one of the frequency band signals to produce a modified frequency band signal and using the modified frequency band signal to determine the parameter; determining a second preliminary excitation parameter using a different method; and using the first and second preliminary excitation parameters to determine an excitation parameter for the digitized speech signal. Speech synthesized from parameters estimated according to the invention is of high quality at various bit rates, which is useful for applications such as satellite voice communication.
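The steps named in the abstract can be illustrated with a rough sketch. This listing does not give the patent's actual filters, nonlinearity, or combination rule, so everything below is an assumption chosen for illustration: a moving-average band split, squaring as the nonlinear operation, normalized autocorrelation as the pitch estimator, and "keep the estimate with the higher score" as the way the two preliminary estimates are combined. All function names are hypothetical.

```python
import math

def harmonic_tone(period, length, harmonics=4):
    """Hypothetical test signal: a few harmonics of a pitch with the given period."""
    return [sum(math.cos(2 * math.pi * k * n / period) / k
                for k in range(1, harmonics + 1))
            for n in range(length)]

def moving_average(x, width):
    """Crude low-pass filter: centered moving average (assumed, not from the patent)."""
    half = width // 2
    out = []
    for n in range(len(x)):
        lo, hi = max(0, n - half), min(len(x), n + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def split_bands(x, width=9):
    """Divide the signal into two bands: a low band and its complement."""
    low = moving_average(x, width)
    high = [a - b for a, b in zip(x, low)]
    return [low, high]

def autocorr_pitch(x, min_lag, max_lag):
    """Return (lag, score) for the lag with the highest normalized autocorrelation."""
    mean = sum(x) / len(x)
    x = [v - mean for v in x]
    energy = sum(v * v for v in x) or 1.0
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[n] * x[n + lag] for n in range(len(x) - lag)) / energy
        if r > best_score:
            best_lag, best_score = lag, r
    return best_lag, best_score

def estimate_pitch(x, min_lag=20, max_lag=120):
    """Combine two preliminary pitch estimates into one (lag, score) result."""
    bands = split_bands(x)
    # First preliminary estimate: square each band signal (the nonlinear
    # operation), sum the modified bands, then estimate the period.
    modified = [sum(b[n] ** 2 for b in bands) for n in range(len(x))]
    first = autocorr_pitch(modified, min_lag, max_lag)
    # Second preliminary estimate: a different method, here autocorrelation
    # of the unmodified signal.
    second = autocorr_pitch(x, min_lag, max_lag)
    # Assumed combination rule: keep the estimate with the higher score.
    return max(first, second, key=lambda p: p[1])
```

Calling `estimate_pitch(harmonic_tone(50, 400))` returns a `(lag, score)` pair, where `lag` is the estimated pitch period in samples. A real coder would also derive voiced/unvoiced decisions and would weigh the two estimates more carefully than a simple maximum.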
REFERENCES:
patent: 3706929 (1972-12-01), Robinson et al.
patent: 3975587 (1976-08-01), Dunn et al.
patent: 3982070 (1976-09-01), Flanagan
patent: 3995116 (1976-11-01), Flanagan
patent: 4004096 (1977-01-01), Bauer et al.
patent: 4015088 (1977-03-01), Dubnowski et al.
patent: 4074228 (1978-02-01), Jonscher
patent: 4076958 (1978-02-01), Fulghum
patent: 4091237 (1978-05-01), Wolnowsky et al.
patent: 4441200 (1984-04-01), Fette et al.
patent: 4618982 (1986-10-01), Horvath et al.
patent: 4622680 (1986-11-01), Zinser
patent: 4672669 (1987-06-01), Des Blache et al.
patent: 4696038 (1987-09-01), Doddington et al.
patent: 4720861 (1988-01-01), Bertrand
patent: 4797926 (1989-01-01), Bronson et al.
patent: 4799059 (1989-01-01), Grindahl et al.
patent: 4809334 (1989-02-01), Bhaskar
patent: 4813075 (1989-03-01), Ney
patent: 4879748 (1989-11-01), Picone et al.
patent: 4885790 (1989-12-01), McAulay et al.
patent: 4989247 (1991-01-01), Van Hemert
patent: 5023910 (1991-06-01), Thomson
patent: 5036515 (1991-07-01), Freeburg
patent: 5054072 (1991-10-01), McAulay et al.
patent: 5067158 (1991-11-01), Arjmand
patent: 5081681 (1992-01-01), Hardwick
patent: 5091944 (1992-02-01), Takahashi
patent: 5091946 (1992-02-01), Ozawa
patent: 5095392 (1992-03-01), Shimazaki et al.
patent: 5195166 (1993-03-01), Hardwick et al.
patent: 5216747 (1993-06-01), Hardwick et al.
patent: 5226084 (1993-07-01), Hardwick et al.
patent: 5226108 (1993-07-01), Hardwick et al.
patent: 5247579 (1993-09-01), Hardwick et al.
patent: 5265167 (1993-11-01), Akamine et al.
patent: 5504833 (1996-04-01), George et al.
patent: 5517511 (1996-05-01), Hardwick et al.
Deller, Proakis, Hansen; "Discrete-time processing of speech signals", 1993, Macmillan Publishing Company, p. 460, paragraph 7.4.1; p. 461; figure 7.25.
Kurematsu, et al., "A Linear Predictive Vocoder With New Pitch Extraction and Exciting Source"; 1979 IEEE International Conference on Acoustics; pp. 69-72.
Krubsack, et al.; "An Autocorrelation Pitch Detector and Voicing Decision with Confidence Measures Developed for Noise-Corrupted Speech"; Feb. 1991; IEEE; vol. 39, No. 2; pp. 319-321.
Cox et al., "Subband Speech Coding and Matched Convolutional Channel Coding for Mobile Radio Channels," IEEE Trans. Signal Proc., vol. 39, No. 8 (Aug. 1991), pp. 1717-1731.
Digital Voice Systems, Inc., "The DVSI IMBE Speech Compression System," advertising brochure (May 12, 1993).
Digital Voice Systems, Inc., "The DVSI IMBE Speech Coder," advertising brochure (May 12, 1993).
Fujimura, "An Approximation to Voice Aperiodicity", IEEE Transactions on Audio and Electroacoustics, vol. AU-16, No. 1 (Mar. 1968), pp. 68-72.
Griffin, "The Multiband Excitation Vocoder", Ph.D. Thesis, M.I.T., 1987.
Hardwick et al. "The Application of the IMBE Speech Coder to Mobile Communications," IEEE (1991), pp. 249-252.
Heron, "A 32-Band Sub-band/Transform Coder Incorporating Vector Quantization for Dynamic Bit Allocation", IEEE (1983), pp. 1276-1279.
Makhoul, "A Mixed-Source Model For Speech Compression and Synthesis", IEEE (1978), pp. 163-166.
Maragos et al., "Speech Nonlinearities, Modulations, and Energy Operators", IEEE (1991), pp. 421-424.
McCree et al., "A New Mixed Excitation LPC Vocoder", IEEE (1991), pp. 593-595.
McCree et al., "Improving The Performance Of A Mixed Excitation LPC Vocoder In Acoustic Noise", IEEE (1992), pp. 137-139.
Quackenbush et al., "The Estimation and Evaluation Of Pointwise Nonlinearities For Improving The Performance Of Objective Speech Quality Measures", IEEE (1983), pp. 547-550.
Hardwick ("A 4.8 Kbps Multi-Band Excitation Speech Coder", Massachusetts Institute of Technology, May 1988, pp. 1-68).
Hess, Wolfgang J., ("Pitch and Voicing Determination", Advances in Speech Signal Processing, Eds. Sadaoki Furui & M. Mohan Sondhi, Marcel Dekker, Inc., Jan. 1991, pp. 1-48).
Quatieri, et al. "Speech Transformation Based on A Sinusoidal Representation", IEEE, TASSP, vol. ASSP-34, No. 6, Dec. 1986, pp. 1449-1464.
Griffin, et al., "A High Quality 9.6 Kbps Speech Coding System", Proc. ICASSP 86, pp. 125-128, Tokyo, Japan, Apr. 13-20, 1986.
Griffin et al., "A New Model-Based Speech Analysis/Synthesis System", Proc. ICASSP 85, pp. 513-516, Tampa, FL, Mar. 26-29, 1985.
Hardwick, "A 4.8 kbps Multi-Band Excitation Speech Coder", S.M. Thesis, M.I.T, May 1988.
McAulay et al., "Mid-Rate Coding Based on a Sinusoidal Representation of Speech", Proc. IEEE 1985 pp. 945-948.
Hardwick et al. "A 4.8 Kbps Multi-band Excitation Speech Coder," Proceedings from ICASSP, International Conference on Acoustics, Speech and Signal Processing, New York, N.Y., Apr. 11-14, pp. 374-377 (1988).
Griffin et al., "Multiband Excitation Vocoder" IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 36, No. 8, pp. 1223-1235 (1988).
Almeida et al., "Harmonic Coding: A Low Bit-Rate, Good-Quality Speech Coding Technique," IEEE (CH 1746-7/82/0000 1684) pp. 1664-1667 (1982).
Tribolet et al., "Frequency Domain Coding of Speech," IEEE Transactions on Acoustics, Speech and Signal Processing, V. ASSP-27, No. 5, pp. 512-530 (Oct. 1979).
McAulay et al., "Speech Analysis/Synthesis Based on A Sinusoidal Representation," IEEE Transactions on Acoustics, Speech and Signal Processing V. 34, No. 4, pp. 744-754, (Aug. 1986).
Griffin, et al. "A New Pitch Detection Algorithm", Digital Signal Processing, No. 84, pp. 395-399.
McAulay, et al., "Computationally Efficient Sine-Wave Synthesis and Its Application to Sinusoidal Transform Coding", IEEE 1988, pp. 370-373.
Portnoff, Short-Time Fourier Analysis of Sampled Speech, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-29, No. 3, Jun. 1981, pp. 324-333.
Griffin et al. "Signal Estimation from Modified Short-Time Fourier Transform", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-32, No. 2, Apr. 1984, pp. 236-243.
Almeida, et al. "Variable-Frequency Synthesis; An Improved Harmonic Coding Scheme", ICASSP 1984 pp. 27.5.1-27.5.4.
Flanagan, J.L., Speech Analysis Synthesis and Perception, Springer-Verlag, 1982, pp. 378-386.
Secrest, et al., "Postprocessing Techniques for Voice Pitch Trackers", ICASSP, vol. 1, 1982, pp. 171-175.
Patent Abstracts of Japan, vol. 14, No. 498 (P-1124), Oct. 30, 1990.
Mazor et al., "Transform Subbands Coding With Channel Error Control", IEEE 1989, pp. 172-175.
Brandstein et al., "A Real-Time Implementation of the Improved MBE Speech Coder", IEEE 1990, pp. 5-8.
Levesque et al., "A Proposed Federal Standard for Narrowband Digital Land Mobile Radio", IEEE 1990, pp. 497-501.
Yu et al., "Discriminant Analysis and Supervised Vector Quantization for Continuous Speech Recognition", IEEE 1990, pp. 685-688.
Jayant et al., Digital Coding of Waveforms, Prentice-Hall, 1984.
Atungsiri et al., "Error Detection and Control for the Parametric Information in CELP Coders", IEEE 1990, pp. 229-232.
Digital Voice Systems, Inc., "Inmarsat-M Voice Coder", Version 1.9, Nov. 18, 1992.
Campbell et al., "The New 4800 bps Voice Coding Standard", Mil Speech Tech Conference, Nov. 1989.
Chen et al., "Real-Time Vector APC Speech Coding at 4800 bps with Adaptive Postfiltering", Proc. ICASSP
Chawan Vijay B.
Digital Voice Systems, Inc.
Hudspeth David R.
Estimation of excitation parameters