Efficient excitation quantization in noise feedback coding...

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C704S226000, C704S230000

Reexamination Certificate

active

06751587

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to digital communications, and more particularly, to digital coding (or compression) of speech and/or audio signals.
2. Related Art
In speech or audio coding, the coder encodes the input speech or audio signal into a digital bit stream for transmission or storage, and the decoder decodes the bit stream into an output speech or audio signal. The combination of the coder and the decoder is called a codec.
In the field of speech coding, predictive coding is a very popular technique. Prediction of the input waveform is used to remove redundancy from the waveform, and instead of quantizing an input speech waveform directly, a residual signal waveform is quantized. The predictor(s) used in predictive coding can be either backward adaptive or forward adaptive predictors. Backward adaptive predictors do not require any side information as they are derived from a previously quantized waveform, and therefore can be derived at a decoder. On the other hand, forward adaptive predictor(s) require side information to be transmitted to the decoder as they are derived from the input waveform, which is not available at the decoder.
In the field of speech coding, two types of predictors are commonly used. A first type of predictor is called a short-term predictor. It is aimed at removing redundancy between nearby samples in the input waveform. This is equivalent to removing a spectral envelope of the input waveform. A second type of predictor is often referred as a long-term predictor. It removes redundancy between samples further apart, typically spaced by a time difference that is constant for a suitable duration. For speech, this time difference is typically equivalent to a local pitch period of the speech signal, and consequently the long-term predictor is often referred as a pitch predictor. The long-term predictor removes a harmonic structure of the input waveform. A residual signal remaining after the removal of redundancy by the predictor(s) is quantized along with any information needed to reconstruct the predictor(s) at the decoder.
This quantization of the residual signal provides a series of bits representing a compressed version of the residual signal. This compressed version of the residual signal is often denoted the excitation signal and is used to reconstruct an approximation of the input waveform at the decoder in combination with the predictor(s). Generating the series of bits representing the excitation signal is commonly denoted excitation quantization and generally requires the search for, and selection of, a best or preferred candidate excitation among a set of candidate excitations with respect to some cost function. The search and selection require a number of mathematical operations to be performed, which translates into a certain computational complexity when the operations are implemented on a signal processing device. It is advantageous to minimize the number of mathematical operations in order to minimize a power consumption, and maximize a processing bandwidth, of the signal processing device.
Excitation quantization in predictive coding can be based on a sample-by-sample quantization of the excitation. This is referred to as Scalar Quantization (SQ). Techniques for performing Scalar Quantization of the excitation are relatively simple, and thus, the computational complexity associated with SQ is relatively manageable.
Alternatively, the excitation can be quantized based on groups of samples. Quantizing groups of samples is often referred to as Vector Quantization (VQ), and when applied to the excitation, simply as excitation VQ. The use of VQ can provide superior performance to SQ, and may be necessary when the number of coding bits per residual signal sample becomes small (typically less than two bits per sample). Also, VQ can provide a greater flexibility in bit-allocation as compared to SQ, since a fractional number of bits per sample can be used. However, excitation VQ can be relatively complex when compared to excitation SQ. Therefore, there is need to reduce the complexity of excitation VQ as used in a predictive coding environment.
One type of predictive coding is Noise Feedback Coding (NFC), wherein noise feedback filtering is used to shape coding noise, in order to improve a perceptual quality of quantized speech. Therefore, it would be advantageous to use excitation VQ with noise feedback coding, and further, to do so in a computationally efficient manner.
SUMMARY OF THE INVENTION
Summary
The present invention includes efficient methods related to excitation quantization in noise feedback coding, for example, in NFC systems, where the short-term shaping of the coding noise is generalized. The methods are described primarily in Section IX.D and in connection with
FIGS. 21-31
. The methods are based in part on separating an NFC quantization error signal into ZERO-STATE and ZERO-INPUT response contributions. The methods accommodate general shaping of the coding noise while providing an efficient excitation quantization. The present invention provides an efficient method of producing a ZERO-STATE response with the generalized noise shaping.
In an embodiment, the method is performed in a Noise Feedback Coding (NFC) system having a corresponding ZERO-STATE filter structure, the ZERO-STATE filter structure including multiple filters. The method includes: (a) transforming the ZERO-STATE filter structure to a second ZERO-STATE filter structure including only an all-zero filter, the all-zero filter having a filter response substantially equivalent to a filter response of the ZERO-STATE filter structure including multiple filters; and (b) filtering a VQ codevector with the all-zero filter to produce the ZERO-STATE response error vector corresponding to the VQ codevector.
Terminology
Predictor:
A predictor P as referred to herein predicts a current signal value (e.g., a current sample) based on previous or past signal values (e.g., past samples). A predictor can be a short-term predictor or a long-term predictor. A short-term signal predictor (e.g., a short tern speech predictor) can predict a current signal sample (e.g., speech sample) based on adjacent signal samples from the immediate past. With respect to speech signals, such “short-term” predicting removes redundancies between, for example, adjacent or close-in signal samples. A long-term signal predictor can predict a current signal sample based on signal samples from the relatively distant past. With respect to a speech signal, such “long-term” predicting removes redundancies between relatively distant signal samples. For example, a long-term speech predictor can remove redundancies between distant speech samples due to a pitch periodicity of the speech signal.
The phrases “a predictor P predicts a signal s(n) to produce a signal ps(n)” means the same as the phrase “a predictor P makes a prediction ps(n) of a signal s(n).” Also, a predictor can be considered equivalent to a predictive filter that predictively filters an input signal to produce a predictively filtered output signal.
Coding Noise and Filtering Thereof:
Often, a speech signal can be characterized in part by spectral characteristics (i.e., the frequency spectrum) of the speech signal. Two known spectral characteristics include 1) what is referred to as a harmonic fine structure or line frequencies of the speech signal, and 2) a spectral envelope of the speech signal. The harmonic fine structure includes, for example, pitch harmonics, and is considered a long-term (spectral) characteristic of the speech signal. On the other hand, the spectral envelope of the speech signal is considered a short-term (spectral) characteristic of the speech signal.
Coding a speech signal can cause audible noise when the encoded speech is decoded by a decoder. The audible noise arises because the coded speech signal includes coding noise introduced by the speech coding process, for example, by quantizing signals in the encoding process. The coding noise can have spectral character

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Efficient excitation quantization in noise feedback coding... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Efficient excitation quantization in noise feedback coding..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Efficient excitation quantization in noise feedback coding... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3298336

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.