Methods and arrangements in a telecommunications system

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S216000, C704S226000

active

06424942

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a method and an arrangement for telecommunication, in particular for generating background noise and more particularly for generating at least one coefficient, which enables the provision of a typical background noise in the receiver end of a transmission line.
DESCRIPTION OF RELATED ART
In a speech codec for a digital cellular system using source controlled variable bit rates, different bit rates are needed for different input signals. The highest bit rate is needed for speech signals while non-speech signals need a lower bit rate in order to be reproduced well.
Coding of background noise should preferably use as low a bit rate as possible. For spread spectrum systems (e.g. CDMA) a main objective is to reduce the average bit rate and thereby the total system load, and for TDMA systems the objective is a more efficient use of the battery, although system load can also be important.
In digital cellular systems which make use of DTX (Discontinuous Transmission), the switch to and from the DTX mode is controlled by a voice activity algorithm (executed by a VAD, Voice Activity Detector).
According to the G.729 recommendation of ITU-T, the VAD algorithm makes a voice activity decision every 10 ms in accordance with the frame size of the G.729 speech coder. A set of difference parameters is extracted and used for an initial decision. The parameters are the full band energy, the zero crossing rate and a spectral measure. The long-term averages of the parameters during non-active voice segments follow the changing nature of the background noise. A set of differential parameters is obtained at each frame. These are a difference measure between each parameter and its respective long-term average. The initial voice activity decision is obtained using a piecewise linear decision boundary between each pair of differential parameters. A final voice activity decision is obtained by smoothing the initial decision.
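The decision flow described above can be illustrated with a minimal sketch. It is not the G.729 algorithm: the spectral measure is omitted, and the class name `SimpleVad`, the smoothing factor, the thresholds and the hangover length are all hypothetical values chosen only for illustration.

```python
import math

def frame_features(frame):
    """Return (full-band energy in dB, zero-crossing rate) for one frame."""
    energy = sum(s * s for s in frame) / len(frame)
    energy_db = 10.0 * math.log10(energy + 1e-12)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy_db, crossings / (len(frame) - 1)

class SimpleVad:
    """Illustrative VAD: differential parameters, linear boundary, smoothing."""

    def __init__(self, alpha=0.9, hangover=2):
        self.alpha = alpha        # weight of the old long-term average (assumed)
        self.hangover = hangover  # frames kept active after speech (assumed)
        self.avg = None           # long-term averages of the background noise
        self.hang = 0

    def initial_decision(self, d_energy, d_zcr):
        # Piecewise linear boundary in the (d_zcr, d_energy) plane.
        # The numbers are illustrative, not taken from any standard.
        boundary = 6.0 if d_zcr >= 0 else max(3.0, 6.0 + 30.0 * d_zcr)
        return d_energy > boundary

    def process(self, frame):
        feats = frame_features(frame)
        if self.avg is None:
            self.avg = feats      # assume the first frame is background noise
            return 0
        # Differential parameters: each parameter minus its long-term average.
        d_energy = feats[0] - self.avg[0]
        d_zcr = feats[1] - self.avg[1]
        if self.initial_decision(d_energy, d_zcr):
            self.hang = self.hangover
            return 1
        if self.hang > 0:         # smoothing of the initial decision
            self.hang -= 1
            return 1
        # Non-active frame: let the long-term averages follow the noise.
        self.avg = tuple(self.alpha * a + (1.0 - self.alpha) * f
                         for a, f in zip(self.avg, feats))
        return 0
```

The long-term averages are updated only on non-active frames, so that they track the background noise rather than the speech.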
The output of the VAD module is either 1 or 0, indicating the presence or absence of voice activity. If the VAD output is 1, the G.729 speech codec is invoked to code/decode the active voice frames. The G.729 speech codec has a detector, which enables a SID to be transmitted only if required. By contrast, a codec according to GSM EFR must transmit SID information at predetermined moments. However, if the VAD output is 0, the DTX/CNG algorithms described herein are used to code/decode the non-active voice frames. Traditional speech coders and decoders use comfort noise to simulate the background noise in the non-active voice frames. If the background noise is not stationary, a mere comfort noise insertion does not provide the naturalness of the original background noise. Therefore it is desirable to intermittently send some information about the background noise in order to obtain a better quality when non-active voice frames are detected. Efficient coding of the non-active voice frames can be achieved by coding the energy of the frame and its spectrum with as few as fifteen bits. These bits are not automatically transmitted whenever there is a non-active voice detection. Rather, the bits are transmitted only when an appreciable change has been detected with respect to the last transmitted non-active voice frame.
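The change-triggered transmission of noise information can be sketched as follows. This is only an illustration: the function names, the squared-difference spectral distance, both thresholds, and the choice to always send an update on the first non-active frame after speech are assumptions, not values from G.729.

```python
def should_send_sid(current, last_sent,
                    energy_threshold_db=2.0, spectral_threshold=0.05):
    """Decide whether the noise description changed appreciably.

    current, last_sent: (energy_db, spectrum) tuples, where spectrum is a
    list of coefficients. Both thresholds are hypothetical.
    """
    if last_sent is None:
        return True
    energy, spectrum = current
    last_energy, last_spectrum = last_sent
    if abs(energy - last_energy) > energy_threshold_db:
        return True
    # Mean squared difference between the two coefficient vectors.
    distance = sum((a - b) ** 2 for a, b in zip(spectrum, last_spectrum))
    distance /= len(spectrum)
    return distance > spectral_threshold

def dtx_encode(frames):
    """frames: iterable of (vad, energy_db, spectrum).

    Returns one action per frame: 'SPEECH', 'SID' or 'NO_TX'.
    """
    actions = []
    last_sent = None
    for vad, energy, spectrum in frames:
        if vad == 1:
            actions.append("SPEECH")
            last_sent = None  # assumed: force an update after each talk spurt
        elif should_send_sid((energy, spectrum), last_sent):
            actions.append("SID")
            last_sent = (energy, spectrum)
        else:
            actions.append("NO_TX")
    return actions
```

The comparison is made against the last transmitted description rather than the previous frame, so a slow drift still triggers an update once it has accumulated.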
At the decoder side, the received bit stream is decoded. If the VAD output is 1, the G.729 decoder is invoked to synthesize the reconstructed active voice frames. If the VAD output is 0, the CNG module is called to reproduce the non-active frames.
When the VAD flags that speech is present the system works as normal, i.e. the speech coder codes speech and transmits parameters that describe every frame in the speech signal. A frame is often a 10 ms or 20 ms long segment of the speech signal.
When the VAD flags that speech is not present, any of the three scenarios below is possible.
1) TDMA system: The transmitter is switched off and is only allowed to transmit a silence descriptor (SID) frame, say once every 20th frame, that describes the characteristics of the background noise.
2) CDMA system: The transmit power of the transmitter is greatly reduced and, as a consequence, the possible bit rate is decreased in order to meet the demand for a low bit rate imposed by the power reduction, as the comfort noise parameter must be encoded with very few bits.
3) Internet based telephony & Voice storage systems: neither of the previous two. The number of transmitted packets is reduced in order to reduce the load on the network or in the case of voice storage, to reduce the storage need on e.g. a storage medium.
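The TDMA scenario in item 1 can be sketched as a simple schedule. The function name is hypothetical, and the choice to send a SID frame on the first non-active frame of each silent period is an assumption; the period of 20 frames follows the example figure given above.

```python
def tdma_sid_schedule(vad_flags, sid_period=20):
    """Return one action per frame: 'SPEECH', 'SID' or 'OFF'.

    During silence the transmitter is off except for a SID frame
    every `sid_period` frames.
    """
    actions = []
    silent_frames = 0
    for vad in vad_flags:
        if vad:
            actions.append("SPEECH")
            silent_frames = 0
        else:
            # Assumed: the first silent frame carries a SID frame.
            actions.append("SID" if silent_frames % sid_period == 0 else "OFF")
            silent_frames += 1
    return actions
```

For a talk spurt followed by 41 silent frames, this schedule sends three SID frames and keeps the transmitter off for the remaining 38.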
Often the signal spectrum and energy are averaged over several frames. However, because the signal spectrum is averaged, this approach seldom gives any information about the kind of environment in which the other speaker is located during a conversation.
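A minimal sketch of such averaging is shown below. The function name, the window of eight frames and the representation of the spectrum as a plain list of coefficients are assumptions made for illustration.

```python
def average_noise_parameters(frames, window=8):
    """Average energy and spectrum over the most recent `window` frames.

    frames: list of (energy, spectrum) tuples, newest last, where spectrum
    is a list of coefficients of equal length in every frame.
    """
    recent = frames[-window:]
    count = len(recent)
    mean_energy = sum(energy for energy, _ in recent) / count
    order = len(recent[0][1])
    mean_spectrum = [sum(spectrum[k] for _, spectrum in recent) / count
                     for k in range(order)]
    return mean_energy, mean_spectrum
```

A single frame that differs strongly from its neighbours contributes only one eighth of its value to the result, which is the smearing effect described above.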
Another approach is not to average the signal spectrum and energy, in order to avoid smearing the signal spectrum, and instead to increase the update rate at the cost of fewer bits per update, so that a low average bit rate is maintained.
The two estimates are transmitted to the decoder, sometimes at regular intervals or when e.g. the signal spectrum has changed. The important issue is not to consume too many bits. In the decoder the spectrum and the energy estimates are interpolated in order to try to ensure smooth transitions. As an excitation source to the STP filter, which normally models the signal spectrum, either white noise is used or randomised versions of fixed and adaptive codebooks are used. The term STP means Short Term Predictor, which is a model of the acoustic characteristics of the oral cavity.
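The decoder side can be sketched as follows, assuming white noise excitation and an all-pole STP filter 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The function names are hypothetical. The linear interpolation of direct-form coefficients is a simplification: it does not guarantee a stable filter, and practical codecs typically interpolate in a line spectral representation instead.

```python
import random

def interpolate(previous, new, step, steps):
    """Linearly interpolate between two parameter vectors."""
    weight = step / steps
    return [(1.0 - weight) * p + weight * n for p, n in zip(previous, new)]

def synthesize_comfort_noise(lpc, gain, n_samples, state=None, rng=random):
    """Filter scaled white noise through the all-pole STP filter 1/A(z).

    lpc:   coefficients a_1..a_p of A(z) = 1 + sum_k a_k z^-k
    gain:  excitation scale derived from the transmitted energy estimate
    state: previous output samples, newest first (filter memory)
    Returns (samples, new_state) so consecutive frames can be chained.
    """
    order = len(lpc)
    history = list(state) if state else [0.0] * order
    samples = []
    for _ in range(n_samples):
        excitation = gain * rng.gauss(0.0, 1.0)
        # y[n] = e[n] - sum_k a_k * y[n - k]
        y = excitation - sum(a * h for a, h in zip(lpc, history))
        samples.append(y)
        history = [y] + history[:-1]
    return samples, history
```

Passing the returned state into the next call keeps the filter memory continuous across frame boundaries while the coefficients and gain are stepped from the old to the new values with `interpolate`.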
U.S. Pat. No. 5,630,016 discloses a noise generating method during voice inactivity intervals. Said method provides background noise for a discontinuous transceiver system during periods of voice inactivity. Said method also alleviates annoyance and discomfort to a listener caused by on and off switching artifacts between intermittent periods of voice activity during conversation. The method according to U.S. Pat. No. 5,630,016 does not address the problem associated with background noise with tonal characteristics. By tonal characteristics is meant the amount of low frequency sinusoids in the input signal. One example of a tonal characteristic is engine noise. One way of measuring the tonal characteristics is the maximum long term correlation.
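One possible reading of the maximum long term correlation is the largest normalised correlation between the signal and a delayed copy of itself over a range of lags. The sketch below follows that reading; the function name and the lag range of 20 to 147 samples are assumptions and are not taken from the documents cited here.

```python
import math

def max_long_term_correlation(signal, min_lag=20, max_lag=147):
    """Return the largest normalised correlation over the given lag range.

    Values near 1.0 suggest a strongly periodic (tonal) signal,
    values near 0.0 a noise-like signal.
    """
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        if lag >= len(signal):
            break
        current = signal[lag:]
        delayed = signal[:-lag]
        numerator = sum(a * b for a, b in zip(current, delayed))
        denominator = math.sqrt(sum(a * a for a in current) *
                                sum(b * b for b in delayed))
        if denominator > 0.0:
            best = max(best, numerator / denominator)
    return best
```

A low frequency sinusoid whose period falls inside the lag range gives a value close to 1, whereas white noise gives a much smaller value.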
EP-A-0843301 discloses a method for comfort noise generation for a digital mobile terminal, modifying random excitation by a spectral control filter so that the frequency content of comfort noise and background noise become similar, or causing the transmitter to replace non-noise speech coding parameters with median value parameters. This method provides audio signals having natural sound at the receiver but does not take into consideration the specific problems related to engine noise.
EP-A-0786760 discloses a method for providing comfort noise between speech bursts, which is more pleasing to a listener than without such, but does not take into account the specific problems related to engine noise from e.g. cars and trams.
U.S. Pat. No. 5,487,087 discloses an output fluctuation signal quantiser for digital encoding of e.g. speech, which models both the input signal and its time variation and modifies an error to include a term corresponding to the difference between current and previous input signals, forcing the quantiser to match the input signal fluctuation. It reduces noise, e.g. the swirling effect, and can be combined with insertion of comfort noise. However, the document does not take into consideration the specific problems related to engine noise.
EP-A-0668007 discloses an acoustic signal processing installation for car telephones which determines auto and cross correlation functions for a Wiener filter in order to reduce the noise content in a microphone signal so that the speech quality of the output signal is improved. However, this document does not disclose the generation of comfort noise.
SE-B-451938 discloses a speech det
