Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission
Reexamination Certificate
2000-09-15
2003-06-03
To, Doris H. (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S503000
Reexamination Certificate
active
06574593
ABSTRACT:
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights.
MICROFICHE REFERENCE
A microfiche appendix is included of a computer program listing. The total number of microfiche is 7. The total number of frames is 679.
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates to speech communication systems and, more particularly, to systems for digital speech coding.
2. Related Art
One prevalent mode of human communication is by the use of communication systems. Communication systems include both wireline and wireless radio based systems. Wireless communication systems are electrically connected with the wireline based systems and communicate with the mobile communication devices using radio frequency (RF) communication. Currently, the radio frequencies available for communication in cellular systems, for example, are in the cellular frequency range centered around 900 MHz and in the personal communication services (PCS) frequency range centered around 1900 MHz. Data and voice transmissions within the wireless system have a bandwidth that consumes a portion of the radio frequency. Due to increased traffic caused by the expanding popularity of wireless communication devices, such as cellular telephones, it is desirable to reduced bandwidth of transmissions within the wireless systems.
Digital transmission in wireless radio communications is increasingly applied to both voice and data due to noise immunity, reliability, compactness of equipment and the ability to implement sophisticated signal processing functions using digital techniques. Digital transmission of speech signals involves the steps of sampling an analog speech waveform with an analog-to-digital converter, speech compression (encoding), transmission, speech decompression (decoding), digital-to-analog conversion, and playback into an earpiece or a loudspeaker. The sampling of the analog speech waveform with the analog-to-digital converter creates a digital signal. However, the number of bits used in the digital signal to represent the analog speech waveform creates a relatively large bandwidth. For example, a speech signal that is sampled at a rate of 8000 Hz (once every 0.125 ms), where each sample is represented by 16 bits, will result in a bit rate of 128,000 (16×8000) bits per second, or 128 Kbps (Kilobits per second).
Speech compression may be used to reduce the number of bits that represent the speech signal thereby reducing the bandwidth needed for transmission. However, speech compression may result in degradation of the quality of decompressed speech. In general, a higher bit rate will result in higher quality, while a lower bit rate will result in lower quality. However, modern speech compression techniques, such as coding techniques, can produce decompressed speech of relatively high quality at relatively low bit rates. In general, modern coding techniques attempt to represent the perceptually important features of the speech signal, without preserving the actual speech waveform.
One coding technique used to lower the bit rate involves varying the degree of speech compression (i.e. varying the bit rate) depending on the part of the speech signal being compressed. Typically, parts of the speech signal for which adequate perceptual representation is more difficult (such as voiced speech, plosives, or voiced onsets) are coded and transmitted using a higher number of bits. Conversely, parts of the speech for which adequate perceptual representation is less difficult (such as unvoiced, or the silence between words) are coded with a lower number of bits. The resulting average bit rate for the speech signal will be relatively lower than would be the case for a fixed bit rate that provides decompressed speech of similar quality.
Speech compression systems, commonly called codecs, include an encoder and a decoder and may be used to reduce the bit rate of digital speech signals. Numerous algorithms have been developed for speech codecs that reduce the number of bits required to digitally encode the original speech while attempting to maintain high quality reconstructed speech. Code-Excited Linear Predictive (CELP) coding techniques, as discussed in the article entitled “Code-Excited Linear Prediction: High-Quality Speech at Very Low Rates,” by M. R. Schroeder and B. S. Atal, Proc. ICASSP-85, pages 937-940, 1985, provide one effective speech coding algorithm. An example of a variable rate CELP based speech coder is TIA (Telecommunications Industry Association) IS-127 standard that is designed for CDMA (Code Division Multiple Access) applications. The CELP coding technique utilizes several prediction techniques to remove the redundancy from the speech signal. The CELP coding approach is frame-based in the sense that it stores sampled input speech signals into a block of samples called frames. The frames of data may then be processed to create a compressed speech signal in digital form.
The CELP coding approach uses two types of predictors, a short-term predictor and a long-term predictor. The short-term predictor typically is applied before the long-term predictor. A prediction error derived from the short-term predictor is commonly called short-term residual, and a prediction error derived from the long-term predictor is commonly called long-term residual. The long-term residual may be coded using a fixed codebook that includes a plurality of fixed codebook entries or vectors. One of the entries may be selected and multiplied by a fixed codebook gain to represent the long-term residual. The short-term predictor also can be referred to as an LPC (Linear Prediction Coding) or a spectral representation, and typically comprises 10 prediction parameters. The long-term predictor also can be referred to as a pitch predictor or an adaptive codebook and typically comprises a lag parameter and a long-term predictor gain parameter. Each lag parameter also can be called a pitch lag, and each long-term predictor gain parameter can also be called an adaptive codebook gain. The lag parameter defines an entry or a vector in the adaptive codebook.
The CELP encoder performs an LPC analysis to determine the short-term predictor parameters. Following the LPC analysis, the long-term predictor parameters may be determined. In addition, determination of the fixed codebook entry and the fixed codebook gain that best represent the long-term residual occurs. The powerful concept of analysis-by-synthesis (ABS) is employed in CELP coding. In the ABS approach, the best contribution from the fixed codebook, the best fixed codebook gain, and the best long-term predictor parameters may be found by synthesizing them using an inverse prediction filter and applying a perceptual weighting measure. The short-term (LPC) prediction coefficients, the fixed-codebook gain, as well as the lag parameter and the long-term gain parameter may then be quantized. The quantization indices, as well as the fixed codebook indices, may be sent from the encoder to the decoder.
The CELP decoder uses the fixed codebook indices to extract a vector from the fixed codebook. The vector may be multiplied by the fixed-codebook gain, to create a long-term excitation also known as a fixed codebook contribution. A long-term predictor contribution may be added to the long-term excitation to create a short-term excitation that commonly is referred to simply as an excitation. The long-term predictor contribution comprises the short-term excitation from the past multiplied by the long-term predictor gain. The addition of the long-term predictor contribution alternatively can be viewed as an adaptive codebook contribution or as a long-term (pitch) filtering. The short-term excitation may be passed through a short-term inverse prediction filter (LPC) that uses t
Benyassine Adil
Gao Yang
Shlomot Eyal
Su Huan-yu
Thyssen Jes
Conexant Systems Inc.
Farjami & Farjami LLP
Nolan Daniel
To Doris H.
LandOfFree
Codebook tables for encoding and decoding does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Codebook tables for encoding and decoding, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Codebook tables for encoding and decoding will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3109287