Transmission of comfort noise parameters during...

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C455S063100, C455S116000, C704S208000, C704S214000, C704S215000

active

06816832

ABSTRACT:

FIELD OF THE INVENTION
This invention relates generally to the field of speech communication, and more particularly to discontinuous transmission (DTX) and improving the quality of comfort noise (CN) during discontinuous transmission.
BACKGROUND OF THE INVENTION
Discontinuous transmission is used in mobile communication systems to switch the radio transmitter off during speech pauses. The use of DTX saves power in the mobile station and extends the time between battery recharges. It also reduces the general interference level and thus improves transmission quality.
However, during speech pauses the background noise which is transmitted with the speech also disappears if the channel is cut off completely. The result is an unnatural sounding audio signal (silence) at the receiving end of the communication.
It is known in the art, instead of completely switching the transmission off during speech pauses, to generate parameters that characterize the background noise, and to send these parameters over the air interface at a low rate in Silence Descriptor (SID) frames. These parameters are used at the receive side to regenerate background noise which reflects, as well as possible, the spectral and temporal content of the background noise at the transmit side. These parameters that characterize the background noise are referred to as comfort noise (CN) parameters. The comfort noise parameters typically include a subset of speech coding parameters, in particular synthesis filter coefficients and gain parameters.
It should be noted, however, that in some comfort noise evaluation schemes of some speech codecs, part of the comfort noise parameters are derived from speech coding parameters while other comfort noise parameter(s) are derived from, for example, signals that are available in the speech coder but that are not transmitted over the air interface.
It is assumed in prior-art DTX systems that the excitation can be approximated sufficiently well by spectrally flat noise (i.e., white noise). In prior art DTX systems, the comfort noise is generated in the receiver by feeding locally generated, spectrally flat noise through a speech coder synthesis filter.
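The prior-art receiver behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular codec: the function name, the Gaussian noise source, the seed, and the 160-sample frame (20 ms at an assumed 8 kHz sampling rate) are all assumptions, and the synthesis filter is taken to be the all-pole filter 1/A(z) with A(z) = 1 - sum over i of a(i) z^-i.

```python
import numpy as np

def synthesize_comfort_noise(a, gain, n_samples, seed=0):
    """Feed scaled, spectrally flat noise through the synthesis filter 1/A(z).

    a         -- hypothetical short term coefficients a(1)..a(M)
    gain      -- hypothetical excitation gain applied to the white noise
    n_samples -- number of output samples to generate
    """
    rng = np.random.default_rng(seed)
    # Spectrally flat (white) excitation, scaled by the gain.
    excitation = gain * rng.standard_normal(n_samples)
    out = np.zeros(n_samples)
    for n in range(n_samples):
        acc = excitation[n]
        # All-pole recursion: y[n] = e[n] + sum_i a(i) * y[n - i]
        for i in range(1, len(a) + 1):
            if n - i >= 0:
                acc += a[i - 1] * out[n - i]
        out[n] = acc
    return out

# Example: one 160-sample frame with a single illustrative coefficient.
noise_frame = synthesize_comfort_noise([0.5], gain=2.0, n_samples=160)
```

The explicit loop is used for readability; a library filtering routine could replace it in practice.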
Before describing the present invention, it will be instructive to review conventional circuitry and methods for generating comfort noise parameters on the transmit side, and for generating comfort noise on the receive side.
In this regard reference is thus first made to FIGS. 1a-1d.
Referring to FIG. 1a, short term spectral parameters 102 are calculated from a speech signal 100 in a Linear Predictive Coding (LPC) analysis block 101. LPC is a method well known in the prior art. For simplicity, discussed herein is only the case where the synthesis filter has only a short term synthesis filter, it being realized that in most prior art systems, such as in GSM FR, HR and EFR coders, the synthesis filter is constructed as a cascade of a short term synthesis filter and a long term synthesis filter. However, for the purposes of this description a discussion of the long term synthesis filter is not necessary. Furthermore, the long term synthesis filter is typically switched off during comfort noise generation in prior art DTX systems.
The LPC analysis produces a set of short term spectral parameters 102 once for each transmission frame. The frame duration depends on the system. For example, in all GSM channels the frame size is set at 20 milliseconds.
The speech signal is fed through an inverse filter 103 to produce a residual signal 104. The inverse filter is of the form:

$$A(z) = 1 - \sum_{i=1}^{M} a(i)\, z^{-i} \qquad (1)$$

The filter coefficients a(i), i=1, . . . , M are produced in the LPC analysis and are updated once for each frame. Interpolation as known in prior art speech coding may be applied in the inverse filter 103 to obtain a smooth change in the filter parameters between frames. The inverse filter 103 produces the residual 104 which is the optimal excitation signal, and which generates the exact speech signal 100 when fed through synthesis filter 1/A(z) 112 on the receive side (see FIG. 1b). The energy of the excitation sequence is measured and a scaling gain 106 is calculated for each transmission frame in excitation gain calculation block 105.
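The transmit-side steps above (LPC analysis, inverse filtering, and gain calculation) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the LPC analysis uses the plain autocorrelation method without windowing or interpolation, the filter history before each frame is taken as zero, and the gain is computed as the RMS of the residual, which is one possible energy measure and not necessarily the one used in any GSM codec.

```python
import numpy as np

def lpc_analysis(frame, order=10):
    """Estimate a(1)..a(M) by solving the autocorrelation normal equations."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation r(0)..r(M) of the frame.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz system: sum_j a(j) * r(|i - j|) = r(i), i = 1..M
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def inverse_filter(frame, a):
    """Apply A(z) = 1 - sum_i a(i) z^-i to obtain the residual."""
    frame = np.asarray(frame, dtype=float)
    residual = frame.copy()
    for i in range(1, len(a) + 1):
        # residual[n] -= a(i) * s[n - i], assuming zero history before the frame
        residual[i:] -= a[i - 1] * frame[:-i]
    return residual

def excitation_gain(residual):
    """RMS of the residual, used here as an assumed scaling gain."""
    residual = np.asarray(residual, dtype=float)
    return float(np.sqrt(np.mean(residual ** 2)))

# Example on one synthetic 160-sample frame (stand-in for background noise).
frame = np.random.default_rng(1).standard_normal(160)
a = lpc_analysis(frame, order=10)
residual = inverse_filter(frame, a)
gain = excitation_gain(residual)
```

Because the residual is obtained with A(z), passing it back through 1/A(z) with the same coefficients and zero initial state reproduces the input frame, which is the property the text attributes to the optimal excitation.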
The excitation gain 106 and short term spectral coefficients 102 are averaged over several transmission frames to obtain a characterization of the average spectral and temporal content of the background noise. The averaging is typically carried out over four frames, as for the GSM FR channel, to eight frames, as for the GSM EFR channel. The parameters to be averaged are buffered for the duration of the averaging period in blocks 107a and 108a (see FIG. 1d). The averaging process is carried out in blocks 107 and 108, and the average parameters that characterize the background noise are thus generated. These are the average excitation gain g_mean and the average short term spectral coefficients. In modern speech codecs, there are typically 10 short term spectral coefficients (M=10), which are usually represented as Line Spectral Pair (LSP) coefficients f_mean(i), i=1, . . . , M, as in the GSM EFR DTX system. Although these parameters are typically quantized prior to transmission, the quantization is ignored in this description for simplicity, in that the exact type of quantization that is performed is irrelevant to the teachings of this invention.
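The buffering and averaging step can be illustrated with a short sketch. The class and method names are hypothetical, and a plain arithmetic mean over the buffered frames is assumed here; the exact averaging procedure of a given codec is defined in its specification and may differ (for example, in the domain in which the gain is averaged).

```python
from collections import deque

import numpy as np

class ComfortNoiseAverager:
    """Buffers per-frame gain and LSP vectors and returns their means."""

    def __init__(self, n_frames=8, order=10):
        # Fixed-length buffers: the oldest frame drops out automatically.
        self.gains = deque(maxlen=n_frames)
        self.lsps = deque(maxlen=n_frames)
        self.order = order

    def push(self, gain, lsp):
        lsp = np.asarray(lsp, dtype=float)
        if lsp.shape != (self.order,):
            raise ValueError("expected one LSP vector of length M")
        self.gains.append(float(gain))
        self.lsps.append(lsp)

    def averages(self):
        if not self.gains:
            raise ValueError("no frames buffered yet")
        g_mean = float(np.mean(list(self.gains)))
        # Element-wise mean over the buffered LSP vectors gives f_mean(i).
        f_mean = np.mean(np.stack(list(self.lsps)), axis=0)
        return g_mean, f_mean

# Example: eight-frame averaging period with M = 10 coefficients per frame.
averager = ComfortNoiseAverager(n_frames=8, order=10)
for k in range(10):
    averager.push(gain=float(k), lsp=np.full(10, float(k)))
g_mean, f_mean = averager.averages()
```

Setting `n_frames=4` would correspond to the shorter averaging period mentioned for the GSM FR channel.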
Referring briefly to FIG. 1d, it is shown that the averaging blocks 107 and 108 each typically include the respective buffers 107a and 108a, which output buffered signals 107b and 108b, respectively, to the averaging blocks.
The computation and averaging of the comfort noise parameters is explained in detail in GSM recommendation: GSM 06.62 “Comfort noise aspects for Enhanced Full Rate (EFR) speech traffic channels”. Also by way of example, discontinuous transmission is explained in GSM recommendation: GSM 06.81 “Discontinuous Transmission (DTX) for Enhanced Full Rate (EFR) speech traffic channels”, and voice activity detection (VAD) is explained in GSM recommendation: GSM 06.82 “Voice Activity Detection (VAD) for Enhanced Full Rate (EFR) speech channels”. As such, the details of these various functions are not further discussed here.
Referring to FIG. 1b, there is shown a block diagram of a conventional decoder on the receive side that is used to generate comfort noise in the prior art speech communication system. The decoder receives the two comfort noise parameters, the average excitation gain g_mean and the set of average short term spectral coefficients f_mean(i), i=1, . . . , M, and based on the parameters the decoder generates the comfort noise. The comfort noise generation operation on the receive side is similar to speech decoding, except that the parameters are used at a significantly lower rate (e.g., once every 480 milliseconds, as in the GSM FR and EFR channels), and no excitation signal is received from the speech encoder. During speech decoding the excitation on the receive side is obtained from a codebook that contains a plurality of possible excitation sequences, and an index for the particular excitation vector in the codebook is transmitted along with the other speech coding parameters. For a detailed description of speech decoding and the use of codebooks, reference can be had to, by way of example, U.S. Pat. No. 5,327,519, entitled “Pulse Pattern Excited Linear Prediction Voice Coder”, by Jari Hagqvist, Kari Järvinen, Kari-Pekka Estola, and Jukka Ranta, the disclosure of which is incorporated by reference herein in its entirety.
During comfort noise generation, however, no index to the codebook is transmitted, and the excitation is obtained instead from a random number or excitation (RE) generator 110. The RE generator 110 generates excitation vectors 114 having a flat spectrum. The excitation vectors 114 are then scaled by the average excitation gain g_mean in scaling unit
