Reexamination Certificate
2001-10-19
2003-12-16
Chawan, Vijay (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S205000, C704S207000, C704S208000, C704S209000, C704S200100, C704S268000, C704S220000
active
06665637
ABSTRACT:
THE BACKGROUND OF THE INVENTION AND PRIOR ART
The present invention relates generally to the concealment of errors in decoded acoustic signals caused by encoded data representing the acoustic signals being partially lost or damaged. More particularly the invention relates to a method of receiving data in the form of encoded information from a transmission medium and an error concealment unit according to the preambles of claims 1 and 39 respectively. The invention also relates to decoders for generating an acoustic signal from received data in the form of encoded information according to the preambles of claims 41 and 42 respectively, a computer program according to claim 37 and a computer readable medium according to claim 38.
There are many different applications for audio and speech codecs (codec=coder and decoder). Encoding and decoding schemes are, for instance, used for bit-rate efficient transmission of acoustic signals in fixed and mobile communications systems and in videoconferencing systems. Speech codecs can also be utilised in secure telephony and for voice storage.
Particularly in mobile applications, the codecs occasionally operate under adverse channel conditions. One consequence of such non-optimal transmission conditions is that encoded bits representing the speech signal are corrupted or lost somewhere between the transmitter and the receiver. Most of the speech codecs of today's mobile communication systems and Internet applications operate block-wise, where GSM (Global System for Mobile communication), WCDMA (Wideband Code Division Multiple Access), TDMA (Time Division Multiple Access) and IS95 (International Standard-95) constitute a few examples. The block-wise operation means that an acoustic source signal is divided into speech codec frames of a particular duration, e.g. 20 ms. The information in a speech codec frame is thus encoded as a unit. However, usually the speech codec frames are further divided into sub-frames, e.g. having a duration of 5 ms. The sub-frames are then the coding units for particular parameters, such as the encoding of a synthesis filter excitation in the GSM FR-codec (FR=Full Rate), GSM EFR-codec (EFR=Enhanced Full Rate), GSM AMR-codec (AMR=Adaptive Multi Rate), ITU G.729-codec (ITU=International Telecommunication Union) and EVRC (Enhanced Variable Rate Codec).
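The block-wise operation described above can be sketched as follows. This is only an illustration: the 8 kHz sampling rate and the function name `split_frames` are assumptions made for the example, while the 20 ms frame and 5 ms sub-frame durations are the example values given in the text.

```python
def split_frames(samples, sample_rate_hz=8000, frame_ms=20, subframe_ms=5):
    """Divide a sample sequence into codec frames, each split into sub-frames.

    The 8 kHz default is an assumed narrow-band rate, not taken from the text.
    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000    # 160 samples at 8 kHz
    sub_len = sample_rate_hz * subframe_ms // 1000   # 40 samples at 8 kHz
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Each frame is encoded as a unit; sub-frames carry e.g. excitation data.
        subframes = [frame[i:i + sub_len] for i in range(0, frame_len, sub_len)]
        frames.append(subframes)
    return frames
```

With these defaults, each frame holds 160 samples arranged as four sub-frames of 40 samples.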
Besides the excitation parameters, the above codecs also model acoustic signals by means of other parameters like, for instance, LPC-parameters (LPC=Linear Predictive Coding), LTP-lag (LTP=Long Term Prediction) and various gain parameters. Certain bits of these parameters represent information that is highly important with respect to the perceived sound quality of the decoded acoustic signal. If such bits are corrupted during the transmission, the sound quality of the decoded acoustic signal will, at least temporarily, be perceived by a human listener as relatively low. It is therefore often advantageous to disregard the parameters for the corresponding speech codec frame if they arrive with errors and instead make use of previously received correct parameters. This error concealment technique is applied, in one form or another, in most systems through which acoustic signals are transmitted by means of non-ideal channels.
The error concealment method normally aims at alleviating the effects of a lost/damaged speech codec frame by freezing any speech codec parameters that vary comparatively slowly. Such error concealment is performed, for instance, by the error concealment unit in the GSM EFR-codec and GSM AMR-codec, which repeats the LTP-gain and the LTP-lag parameters in case of a lost or damaged speech codec frame. If, however, several consecutive speech codec frames are lost or damaged, various muting techniques are applied, which may involve repetition of gain parameters with decaying factors and repetition of LPC-parameters moved towards their long-term averages. Furthermore, the power level of the first correctly received frame after reception of one or more damaged frames may be limited to the power level of the latest correctly received frame before reception of the damaged frame(s). This mitigates undesirable artefacts in the decoded speech signal, which may occur due to the speech synthesis filter and adaptive codebook being set in erroneous states during reception of the damaged frame(s).
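The repetition and muting behaviour described above can be illustrated with a minimal sketch. It is not the procedure of any particular codec: the class and field names, the decay factor of 0.5 and the pull factor of 0.5 are hypothetical values chosen for the example, and the spectral parameters are simply treated as a list of numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FrameParams:
    spectral: List[float]  # spectral parameters (representation assumed)
    ltp_lag: int           # long-term prediction lag
    gain: float            # a single gain value, for simplicity


class Concealer:
    """Hypothetical concealment of lost frames by repetition and muting."""

    def __init__(self, long_term_mean, gain_decay=0.5, mean_pull=0.5):
        self.long_term_mean = list(long_term_mean)
        self.gain_decay = gain_decay   # assumed attenuation per extra lost frame
        self.mean_pull = mean_pull     # assumed step towards the long-term mean
        self.last = None
        self.lost_in_a_row = 0

    def process(self, params: Optional[FrameParams]) -> FrameParams:
        """Pass a good frame through; pass None to signal a lost/damaged frame."""
        if params is not None:
            self.last = params
            self.lost_in_a_row = 0
            return params
        if self.last is None:
            raise ValueError("no good frame received yet")
        self.lost_in_a_row += 1
        if self.lost_in_a_row > 1:
            # Several consecutive losses: decay the gain and move the spectral
            # parameters towards their long-term averages. The lag is repeated.
            pulled = [s + self.mean_pull * (m - s)
                      for s, m in zip(self.last.spectral, self.long_term_mean)]
            self.last = FrameParams(pulled, self.last.ltp_lag,
                                    self.last.gain * self.gain_decay)
        # First loss: plain repetition of the last good parameters.
        return self.last
```

The first lost frame reuses the previous parameters unchanged; each further consecutive loss attenuates the gain and pulls the spectral parameters closer to the long-term mean. The power limiting of the first good frame after a loss is not modelled here.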
A few examples of alternative means of ameliorating the adverse effects of speech codec frames being lost or damaged during transmission between a transmitter and a receiver are referred to below.
The U.S. Pat. No. 5,907,822 discloses a loss tolerant speech decoder, which utilises past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. A multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression parameters extracts the necessary parameters in case of a lost frame and produces a replacement frame.
The European patent EP 0 665 161 B1 describes an apparatus and a method for concealing the effects of lost frames in a speech decoder. The document suggests the use of a voice activity detector to restrict updating of a threshold value for determining background sounds in case of a lost frame. A post filter normally tilts the spectrum of a decoded signal. However, in case of a lost frame the filtering coefficients of the post filter are not updated.
The U.S. Pat. No. 5,909,663 describes a speech coder in which the perceived sound quality of a decoded speech signal is enhanced by avoiding a repeated use of the same parameter at reception of several consecutive damaged speech frames. Adding noise components to an excitation signal, substituting noise components for the excitation signal or reading an excitation signal at random from a noise codebook containing plural excitation signals accomplishes this.
The known error concealment solutions for narrow-band codecs generally provide a satisfying result in most environments by simply repeating certain spectral parameters from a latest received undamaged speech codec frame during the corrupted speech codec frame(s). In practice, this procedure implicitly retains the magnitude and the shape of the spectrum of the decoded speech signal until a new undamaged speech codec frame is received. By such preservation of the speech signal's spectral magnitude and the shape, it is also implicitly assumed that an excitation signal in the decoder is spectrally flat (or white).
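One common way to quantify how close a signal is to being spectrally flat (white) is the spectral flatness measure, i.e. the ratio of the geometric mean to the arithmetic mean of the power spectrum. The sketch below is offered only as an illustration of the "flat excitation" assumption; the measure is not named in the text, and the function names and the small floor `eps` are assumptions of the example.

```python
import cmath
import math


def power_spectrum(x):
    """Power spectrum of x via a direct (O(n^2)) discrete Fourier transform."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n)]


def spectral_flatness(x, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a flat (white) spectrum; values near 0 indicate
    a strongly shaped spectrum. eps avoids taking the logarithm of zero.
    """
    p = [v + eps for v in power_spectrum(x)]
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return geometric / arithmetic
```

A single impulse has a perfectly flat spectrum and yields a flatness close to 1, whereas a pure sinusoid concentrates its power in two bins and yields a value close to 0.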
However, this is not always the case. An Algebraic Code Excited Linear Predictive-codec (ACELP) may, for instance, produce non-white excitation signals. Furthermore, the spectral shape of the excitation signal may vary considerably from one speech codec frame to another. A mere repetition of spectral parameters from a latest received undamaged speech codec frame could thus result in abrupt changes in the spectrum of the decoded acoustic signal, which, of course, means that a low sound quality is experienced.
Particularly, wide-band speech codecs operating according to the CELP coding paradigm have proven to suffer from the above problems, because in these codecs the spectral shape of the synthesis filter excitation may vary even more dramatically from one speech codec frame to another.
SUMMARY OF THE INVENTION
The object of the present invention is therefore to provide a speech coding solution, which alleviates the problem above.
According to one aspect of the invention the object is achieved by a method of receiving data in the form of encoded information and decoding the data into an acoustic signal as initially described, which is characterised by, in case of received damaged data, producing a secondary reconstructed signal on basis of a primary reconstructed signal. The secondary reconstructed signal has a spectrum, which is a spectrally adjusted version of the spectrum of the primary reconstructed signal where the deviation with respect to spectral shape to a s
Telefonaktiebolaget LM Ericsson (publ)