Multi-subframe quantization of spectral parameters

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Patent

Details

704219, 704222, G10L 2100

Patent

active

061610895

ABSTRACT:
Speech is encoded into a frame of bits. A speech signal is digitized into a sequence of digital speech samples that are then divided into a sequence of subframes. A set of model parameters is estimated for each subframe. The model parameters include a set of spectral magnitude parameters that represent spectral information for the subframe. Two or more consecutive subframes from the sequence of subframes may be combined into a frame. The spectral magnitude parameters from both of the subframes within the frame may be jointly quantized. The joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from the previous frame, computing the residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters from both of the subframes within the frame, and quantizing the combined residual parameters into a set of encoded spectral bits which are included in the frame of bits.
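
The joint quantization steps listed in the abstract can be illustrated with a minimal sketch. It is not the patented method itself: the prediction gain `RHO`, the uniform step size `STEP`, and the function names `encode_frame` and `decode_frame` are hypothetical, a plain scalar quantizer stands in for whatever quantizer the patent specifies, and both subframes are assumed to carry the same number of spectral magnitudes as the previous frame.

```python
from typing import List, Tuple

RHO = 0.65   # hypothetical prediction gain (assumption, not given in the abstract)
STEP = 0.5   # hypothetical uniform quantizer step size (assumption)


def predict(prev_quantized: List[float]) -> List[float]:
    """Form predicted magnitudes from the previous frame's quantized magnitudes."""
    return [RHO * p for p in prev_quantized]


def encode_frame(sub0: List[float], sub1: List[float],
                 prev_quantized: List[float]) -> List[int]:
    """Jointly quantize the spectral magnitudes of two subframes in one frame."""
    predicted = predict(prev_quantized)
    # Residual = actual magnitude minus predicted magnitude, per subframe.
    res0 = [m - p for m, p in zip(sub0, predicted)]
    res1 = [m - p for m, p in zip(sub1, predicted)]
    # Combine the residuals from both subframes, then quantize them together.
    combined = res0 + res1
    return [round(r / STEP) for r in combined]


def decode_frame(codes: List[int],
                 prev_quantized: List[float]) -> Tuple[List[float], List[float]]:
    """Rebuild both subframes from the codes and the same prediction."""
    predicted = predict(prev_quantized)
    n = len(predicted)
    residuals = [c * STEP for c in codes]
    sub0 = [p + r for p, r in zip(predicted, residuals[:n])]
    sub1 = [p + r for p, r in zip(predicted, residuals[n:])]
    return sub0, sub1


if __name__ == "__main__":
    prev = [1.0, 2.0, 3.0]
    codes = encode_frame([1.2, 2.1, 2.7], [1.4, 2.3, 2.5], prev)
    print(codes, decode_frame(codes, prev))
```

Because the decoder forms the same prediction from previously decoded values, only the quantized residuals need to be carried in the frame of bits; with this uniform quantizer, each reconstructed magnitude differs from the original by at most half a step.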

REFERENCES:
patent: 3706929 (1972-12-01), Robinson et al.
patent: 3975587 (1976-08-01), Dunn et al.
patent: 3982070 (1976-09-01), Flanagan
patent: 4091237 (1978-05-01), Wolnowsky et al.
patent: 4422459 (1983-12-01), Simson
patent: 4583549 (1986-04-01), Manoli
patent: 4618982 (1986-10-01), Horvath et al.
patent: 4622680 (1986-11-01), Zinser
patent: 4720861 (1988-01-01), Bertrand
patent: 4797926 (1989-01-01), Bronson et al.
patent: 4821119 (1989-04-01), Gharavi
patent: 4879748 (1989-11-01), Picone et al.
patent: 4885790 (1989-12-01), McAulay et al.
patent: 4905288 (1990-02-01), Gerson et al.
patent: 4979110 (1990-12-01), Albrecht et al.
patent: 5023910 (1991-06-01), Thomson
patent: 5036515 (1991-07-01), Freeburg
patent: 5054072 (1991-10-01), McAulay et al.
patent: 5067158 (1991-11-01), Arjmad
patent: 5081681 (1992-01-01), Hardwick et al.
patent: 5091944 (1992-02-01), Takahasi
patent: 5095392 (1992-03-01), Shimazaki et al.
patent: 5113448 (1992-05-01), Nomura et al.
patent: 5195166 (1993-03-01), Hardwick et al.
patent: 5216747 (1993-06-01), Hardwick et al.
patent: 5226084 (1993-07-01), Hardwick et al.
patent: 5226108 (1993-07-01), Hardwick et al.
patent: 5247579 (1993-09-01), Hardwick et al.
patent: 5265167 (1993-11-01), Akamine et al.
patent: 5307441 (1994-04-01), Tzeng
patent: 5517511 (1996-05-01), Hardwick et al.
patent: 5596659 (1997-01-01), Normile et al.
patent: 5630011 (1997-05-01), Lim et al.
patent: 5664053 (1997-09-01), Laflamme et al.
patent: 5696873 (1997-12-01), Bartkowiak
patent: 5704003 (1997-12-01), Kleijn et al.
Furui, Sadaoki, Digital Speech Processing, Synthesis, and Recognition (1989), pp. 62, 135.
Almeida et al., "Harmonic Coding: A Low Bit-Rate, Good-Quality Speech Coding Technique," IEEE (1982), pp. 1664-1667.
Almeida, et al. "Variable-Frequency Synthesis: An Improved Harmonic Coding Scheme", ICASSP (1984), pp. 27.5.1-27.5.4.
Atungsiri et al., "Error Detection and Control for the Parametric Information in CELP Coders", IEEE (1990), pp. 229-232.
Brandstein et al., "A Real-Time Implementation of the Improved MBE Speech Coder", IEEE (1990), pp. 5-8.
Campbell et al., "The New 4800 bps Voice Coding Standard", Mil Speech Tech Conference (Nov. 1989), pp. 64-70.
Chen et al., "Real-Time Vector APC Speech Coding at 4800 bps with Adaptive Postfiltering", Proc. ICASSP (1987), pp. 2185-2188.
Cox et al., "Subband Speech Coding and Matched Convolutional Channel Coding for Mobile Radio Channels," IEEE Trans. Signal Proc., vol. 39, No. 8 (Aug. 1991), pp. 1717-1731.
Digital Voice Systems, Inc., "INMARSAT-M Voice Codec", Version 1.9 (Nov. 18, 1992), pp. 1-145.
Digital Voice Systems, Inc., "The DVSI IMBE Speech Compression System," advertising brochure (May 12, 1993).
Digital Voice Systems, Inc., "The DVSI IMBE Speech Coder," advertising brochure (May 12, 1993).
Flanagan, J.L., Speech Analysis Synthesis and Perception, Springer-Verlag (1982), pp. 378-386.
Fujimura, "An Approximation to Voice Aperiodicity", IEEE Transactions on Audio and Electroacoustics, vol. AU-16, No. 1 (Mar. 1968), pp. 68-72.
Griffin, et al., "A High Quality 9.6 Kbps Speech Coding System", Proc. ICASSP 86, Tokyo, Japan, (Apr. 13-20, 1986), pp. 125-128.
Griffin et al., "A New Model-Based Speech Analysis/Synthesis System", Proc. ICASSP 85, Tampa, FL (Mar. 26-29, 1985), pp. 513-516.
Griffin, et al. "A New Pitch Detection Algorithm", Digital Signal Processing, No. 84, Elsevier Science Publishers (1984), pp. 395-399.
Griffin et al., "Multiband Excitation Vocoder" IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 36, No. 8 (1988), pp. 1223-1235.
Griffin, "The Multiband Excitation Vocoder", Ph.D. Thesis, M.I.T., 1987.
Griffin et al. "Signal Estimation from Modified Short-Time Fourier Transform", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-32, No. 2 (Apr. 1984), pp. 236-243.
Hardwick et al. "A 4.8 Kbps Multi-band Excitation Speech Coder," Proceedings from ICASSP, International Conference on Acoustics, Speech and Signal Processing, New York, N.Y. (Apr. 11-14, 1988), pp. 374-377.
Hardwick et al. "A 4.8 Kbps Multi-Band Excitation Speech Coder," Master's Thesis, M.I.T., 1988.
Hardwick et al. "The Application of the IMBE Speech Coder to Mobile Communications," IEEE (1991), pp. 249-252.
Heron, "A 32-Band Sub-band/Transform Coder Incorporating Vector Quantization for Dynamic Bit Allocation", IEEE (1983), pp. 1276-1279.
Levesque et al., "A Proposed Federal Standard for Narrowband Digital Land Mobile Radio", IEEE (1990), pp. 497-501.
Makhoul, "A Mixed-Source Model For Speech Compression and Synthesis", IEEE (1978), pp. 163-166.
Makhoul et al., "Vector Quantization in Speech Coding", Proc. IEEE (1985), pp. 1551-1588.
Maragos et al., "Speech Nonlinearities, Modulations, and Energy Operators", IEEE (1991), pp. 421-424.
Mazor et al., "Transform Subbands Coding With Channel Error Control", IEEE (1989), pp. 172-175.
McAulay et al., "Mid-Rate Coding Based on a Sinusoidal Representation of Speech", Proc. IEEE (1985), pp. 945-948.
McAulay et al., "Multirate Sinusoidal Transform Coding at Rates from 2.4 Kbps to 8 Kbps", IEEE (1987), pp. 1645-1648.
McAulay et al., "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Transactions on Acoustics, Speech and Signal Processing V. 34, No. 4, (Aug. 1986), pp. 744-754.
McCree et al., "A New Mixed Excitation LPC Vocoder", IEEE (1991), pp. 593-595.
McCree et al., "Improving the Performance of a Mixed Excitation LPC Vocoder in Acoustic Noise", IEEE (1992), pp. 137-139.
Rahikka et al., "CELP Coding for Land Mobile Radio Applications," Proc. ICASSP 90, Albuquerque, New Mexico, Apr. 3-6, 1990, pp. 465-468.
Rowe et al., "A robust 2400bit/s MBE-LPC Speech Coder Incorporating Joint Source and Channel Coding," IEEE (1992), pp. 141-144.
Secrest, et al., "Postprocessing Techniques for Voice Pitch Trackers", ICASSP, vol. 1 (1982), pp. 172-175.
Tribolet et al., "Frequency Domain Coding of Speech", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-27, No. 5 (Oct. 1979), pp. 512-530.
Yu et al., "Discriminant Analysis and Supervised Vector Quantization for Continuous Speech Recognition", IEEE (1990), pp. 685-688.

