Reexamination Certificate
2001-01-05
2003-03-04
Knepper, David D. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S226000
Reexamination Certificate
active
06529867
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to speech coding and, more particularly, to a system that enhances the perceptual quality of digitally processed speech.
2. Related Art
Speech synthesis is a complex process that often requires the transformation of voiced and unvoiced sounds into digital signals. To model sounds, the sounds are sampled and encoded into a discrete sequence. The number of bits used to represent the sounds can determine the perceptual quality of the synthesized sound or speech. A poor-quality replica can drown out voices with noise, lose clarity, or fail to capture the inflections, tone, pitch, or co-articulations that can create adjacent sounds.
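The link between bit count and quality can be illustrated with a small sketch. It is not part of the patent: the uniform quantizer, the 200 Hz test tone, the 8 kHz sampling rate, and the function names quantize and snr_db are all assumptions chosen for illustration. It quantizes the same signal at several bit depths and reports the signal-to-noise ratio of each reconstruction.

```python
import math

def quantize(samples, bits):
    """Uniformly quantize samples in [-1.0, 1.0) to 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    out = []
    for s in samples:
        # Map to an integer cell index, clamp, then reconstruct at the cell midpoint.
        index = min(levels - 1, max(0, int((s + 1.0) / step)))
        out.append(-1.0 + (index + 0.5) * step)
    return out

def snr_db(original, reconstructed):
    """Signal-to-noise ratio of a reconstruction, in decibels."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    return 10.0 * math.log10(signal / noise)

# Hypothetical test tone: 200 Hz sine sampled at 8 kHz (assumed values).
tone = [0.8 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
for bits in (4, 8, 12):
    print(bits, "bits:", round(snr_db(tone, quantize(tone, bits)), 1), "dB")
```

Each added bit halves the quantization step, so the reported ratio should rise as the bit depth grows. A real speech coder spends its bits on model parameters rather than raw samples, so this only shows the general trade-off.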
In one technique of speech synthesis known as Code Excited Linear Predictive Coding (CELP), a sound track is sampled into a discrete waveform before being digitally processed. The discrete waveform is then analyzed according to selected criteria. Criteria such as the degree of noise content and the degree of voice content can be used to model speech through linear functions in real and in delayed time. These linear functions can capture information and predict future waveforms.
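As a rough illustration of predicting a waveform through linear functions, the sketch below estimates short-term linear predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion. This is a generic textbook procedure, not the analysis specified by the patent. The frame contents, the predictor order, and the function names are assumptions, and the long-term (delayed-time) predictor of a CELP coder is not shown.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients.

    Returns (a, err), where x[n] is predicted as sum(a[k] * x[n - 1 - k])
    and err is the remaining prediction-error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for stepping from order i to order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= (1.0 - k * k)
    return a, err

# Hypothetical frame: a decaying 500 Hz resonance sampled at 8 kHz (assumed values).
frame = [(0.97 ** n) * math.cos(2 * math.pi * 500 * n / 8000) for n in range(160)]
coeffs, residual_energy = levinson_durbin(autocorrelation(frame, 2), 2)
print("predictor coefficients:", [round(c, 3) for c in coeffs])
print("residual / signal energy:", round(residual_energy / autocorrelation(frame, 0)[0], 4))
```

A small residual-to-signal energy ratio means that most of the frame is explained by its own past samples, which is what leaves only a compact excitation signal to be coded.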
The CELP coder structure can produce high quality reconstructed speech. However, coder quality can drop quickly when its bit rate is reduced. To maintain a high coder quality at a low bit rate, such as 4 kbps, additional approaches must be explored. This invention is directed to providing an efficient system for coding voiced speech and to a method that accurately encodes and decodes the perceptually important features of voiced speech.
SUMMARY
This invention is a system that seamlessly improves the encoding and the decoding of perceptually important features of voiced speech. The system uses modified pulse excitations to enhance the perceptual quality of voiced speech at high frequencies. The system includes a pulse codebook, a noise source, and a filter. The filter connects an output of the noise source to an output of the pulse codebook. The noise source may generate white noise, such as Gaussian white noise, that is filtered by a high pass filter. The pass band of the filter passes a selected portion of the white Gaussian noise. The filtered noise is scaled, windowed, and added to a single pulse to generate an impulse response that is convolved with the output of the pulse codebook.
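The signal path described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the first-order high pass filter, the Hann window, the fixed gain of 0.2, the 21-sample response length, the pulse placed at the window centre, and the three-pulse subframe are all hypothetical choices made only to keep the example concrete.

```python
import math
import random

def convolve(x, h):
    """Direct-form linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # pulse-codebook outputs are mostly zeros
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def high_pass(x, alpha=0.9):
    """First-order high pass y[n] = x[n] - alpha * x[n-1] (assumed filter)."""
    return [x[n] - (alpha * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def noisy_impulse_response(length=21, gain=0.2, seed=0):
    """Build h = single pulse + gain * window * high-passed Gaussian noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(length)]
    shaped = high_pass(noise)
    # Hann window (assumed), so the noise fades toward both ends of the response.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
              for n in range(length)]
    h = [gain * w * s for w, s in zip(window, shaped)]
    h[length // 2] += 1.0  # the single pulse, placed at the window centre
    return h

# Hypothetical pulse-codebook output: a 40-sample subframe with three signed pulses.
excitation = [0.0] * 40
for position, sign in ((5, 1.0), (17, -1.0), (30, 1.0)):
    excitation[position] = sign

h = noisy_impulse_response()
modified = convolve(excitation, h)
print(len(modified))  # 40 + 21 - 1 samples
```

With the gain set to zero the impulse response reduces to a delayed unit pulse, so the convolution returns the original excitation shifted by length // 2 samples. A nonzero gain surrounds every codebook pulse with a short burst of high-frequency noise.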
In another aspect, an adaptive high-frequency noise is injected into the output of the pulse codebook. The magnitude of the adaptive noise is based on selectable criteria such as the degree of noise-like content in a high-frequency portion of a speech signal, the degree of voice content in a sound track, the degree of unvoiced content in a sound track, the energy content of a sound track, the degree of periodicity in a sound track, etc. The system generates different energy or noise levels that target one or more of the selected criteria. Preferably, the noise levels model one or more important perceptual features of a speech segment.
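One way such an adaptive magnitude might be derived is sketched below, using the degree of periodicity as the single criterion. The mapping itself is an assumption: the text lists the criteria but does not give a formula, so the rule "less periodic frames receive more noise", the maximum gain of 0.5, and the function names periodicity and noise_gain are all hypothetical.

```python
import math
import random

def periodicity(frame, lag):
    """Normalized autocorrelation at the given lag, clamped to [0, 1]."""
    idx = range(lag, len(frame))
    num = sum(frame[n] * frame[n - lag] for n in idx)
    den = math.sqrt(sum(frame[n] ** 2 for n in idx) *
                    sum(frame[n - lag] ** 2 for n in idx))
    if den == 0.0:
        return 0.0  # silent frame: treat as non-periodic
    return min(1.0, max(0.0, num / den))

def noise_gain(frame, lag, max_gain=0.5):
    """Hypothetical rule: less periodic frames receive more injected noise."""
    return max_gain * (1.0 - periodicity(frame, lag))

# Assumed test frames: a strongly periodic tone (period 40 samples) and white noise.
rng = random.Random(1)
voiced = [math.sin(2 * math.pi * n / 40) for n in range(160)]
noisy = [rng.gauss(0.0, 1.0) for _ in range(160)]

print("gain for periodic frame:", round(noise_gain(voiced, 40), 3))
print("gain for noisy frame:", round(noise_gain(noisy, 40), 3))
```

The resulting gain could then scale the filtered noise before it is added to the pulse. The other listed criteria, such as energy or high-frequency noise-like content, could be combined into the same gain in a similar way.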
Other systems, methods, features and advantages of the invention will be or will become apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims.
REFERENCES:
patent: 5692102 (1997-11-01), Pan
patent: 5699477 (1997-12-01), McCree
patent: 5966689 (1999-10-01), McCree
patent: 5991717 (1999-11-01), Minde et al.
patent: 6134518 (2000-10-01), Cohen et al.
patent: 6240386 (2001-05-01), Thyssen et al.
Laroche et al., "HNS: Speech Modification Based on a Harmonic+Noise Model," IEEE, 1993, pp. II-550 to II-553.
Conexant Systems Inc.
Farjami & Farjami LLP
Knepper David D.