Reexamination Certificate
2001-04-26
2003-11-04
McFadden, Susan (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S226000, C370S289000
active
06643618
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a speech decoding unit and a speech decoding method for reproducing far-end talker background noise when detecting speech pauses that do not contain speech of a far-end talker.
2. Description of Related Art
FIG. 1 is a block diagram showing a configuration of a conventional speech decoding unit disclosed in Japanese patent application laid-open No. 7-129195/1995, for example. In this figure, the reference numeral 1 designates an input terminal for inputting a speech code sequence; 2 designates an excitation signal generator for generating an excitation signal from the speech code sequence; 3 designates a speech spectrum coefficient generator for generating speech spectrum coefficients from the speech code sequence; 4 designates a synthesis filter for reproducing a speech signal from the excitation signal generated by the excitation signal generator 2 and the speech spectrum coefficients generated by the speech spectrum coefficient generator 3; 5 designates a speech spectrum coefficient buffer for holding the speech spectrum coefficients generated by the speech spectrum coefficient generator 3; 6 designates a speech spectrum coefficient interpolator for carrying out linear interpolation of the speech spectrum coefficients during speech pauses; 7 designates a speech output circuit for supplying the speech signal reproduced by the synthesis filter 4 to an output terminal 8; and 8 designates the output terminal.
Next, the operation of the conventional speech decoding unit will be described.
First, when a speech coder (not shown) detects speech of a far-end talker, it encodes the speech, and transmits the speech code sequence to the speech decoding unit.
When the speech of the far-end talker is interrupted, the speech coder detects the speech pause of the far-end talker with an internal VOX (voice operated transmitter), and halts the transmission of the speech code sequence to the speech decoding unit. Instead, the speech coder transmits a unique word (post-amble POST) indicating the start of the speech pause, followed by coding parameters indicating far-end talker background noise information.
During a speech burst in which the speech of the far-end talker is detected, the speech coder transmits the speech code sequence, so that in the speech decoding unit, the excitation signal generator 2 generates the excitation signal from the speech code sequence, and the speech spectrum coefficient generator 3 generates the speech spectrum coefficients from the speech code sequence.
When the speech burst begins because of the transition from the speech pause to the speech burst, the speech coder transmits a unique word called a preamble PRE so that the speech decoding unit can detect the start of the speech burst by detecting the unique word.
When the excitation signal generator 2 generates the excitation signal and the speech spectrum coefficient generator 3 generates the speech spectrum coefficients, the synthesis filter 4 reproduces the speech signal from the excitation signal and the speech spectrum coefficients.
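As a rough illustration of what the synthesis filter 4 does, the following sketch drives an all-pole filter with an excitation signal. It assumes the speech spectrum coefficients are direct-form linear-prediction coefficients a[1..p], which the text above does not state; the function name `synthesize` and its parameters are hypothetical.

```python
def synthesize(excitation, coeffs, history=None):
    """All-pole synthesis: y[n] = x[n] - sum_{k=1..p} a[k] * y[n-k].

    Assumption: `coeffs` holds direct-form predictor coefficients a[1..p].
    The source only calls them "speech spectrum coefficients".
    """
    order = len(coeffs)
    # Past outputs, most recent first; zero state unless carried over
    # from the previous frame.
    past = list(history) if history is not None else [0.0] * order
    out = []
    for x in excitation:
        y = x - sum(a * p for a, p in zip(coeffs, past))
        out.append(y)
        # Shift the new output sample into the filter memory.
        past = ([y] + past[:-1])[:order]
    return out, past
```

Returning the filter memory alongside the samples lets a caller pass it back in as `history` for the next frame, so the output stays continuous across frame boundaries.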
Then, the speech output circuit 7 supplies the speech signal reproduced by the synthesis filter 4 to the output terminal 8.
On the other hand, during the speech pause in which the speech of the far-end talker is not detected, the speech coder halts the transmission of the speech code sequence, but transmits a unique word (post-amble POST) indicating the start of the speech pause, followed by the coding parameters indicating the far-end talker background noise information. In the speech decoding unit, the speech spectrum coefficient generator 3 therefore generates the speech spectrum coefficients from the coding parameters indicating the far-end talker background noise information, and the excitation signal generator 2 continues to generate the excitation signal from the speech code sequence received in the final receiving period of the speech burst.
When the speech pause begins because of the transition from the speech burst to the speech pause, the speech coder transmits the unique word called a post-amble POST as described above, so the speech decoding unit can detect the start of the speech pause by detecting the unique word (see FIG. 2).
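The burst/pause signalling described above amounts to a two-state tracker driven by the unique words PRE and POST. The sketch below shows one way a decoder might follow it. The frame representation, a `(kind, payload)` tuple with kinds `"PRE"`, `"POST"`, `"SPEECH"` and `"NOISE"`, is a hypothetical framing chosen for illustration and is not taken from the source.

```python
from enum import Enum


class Mode(Enum):
    BURST = "burst"   # speech code sequence is being received
    PAUSE = "pause"   # only background noise information is received


def track_mode(frames, initial=Mode.PAUSE):
    """Yield (mode, payload) for every frame that is not a unique word.

    Assumption: each frame is a (kind, payload) tuple, where kind is
    "PRE", "POST", "SPEECH" or "NOISE" (hypothetical framing).
    """
    mode = initial
    for kind, payload in frames:
        if kind == "PRE":        # preamble: a speech burst starts
            mode = Mode.BURST
        elif kind == "POST":     # post-amble: a speech pause starts
            mode = Mode.PAUSE
        else:
            yield mode, payload
```

A caller would route payloads yielded in `Mode.BURST` to the normal decoding path and payloads yielded in `Mode.PAUSE` to the background-noise path.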
When the speech pause is detected, the synthesis filter 4 reproduces the speech signal from the excitation signal generated by the excitation signal generator 2 and from the far-end talker background noise information (speech spectrum coefficients) generated by the speech spectrum coefficient generator 3. However, if there is an acute difference between the far-end talker background noise information and the speech code sequence received in the final receiving period of the preceding speech burst, the reproduced speech signal varies sharply, thereby presenting a problem of reproducing uncomfortable background noise to the near-end listener.
In view of this, when the speech pause is detected, the speech spectrum coefficient interpolator 6 carries out linear interpolation of the speech spectrum coefficients (see the ☆ mark in FIG. 2), that is, of the far-end talker background noise information received after the post-amble POST as shown in FIG. 2.
More specifically, if the synthesis filter 4 reproduces the speech signal using the far-end talker background noise information from the very beginning of the speech pause, the speech signal can change abruptly at the transition from the speech burst to the speech pause. Thus, to gradually vary the speech signal from the beginning of the speech pause to the update of the far-end talker background noise information (at the time when the next far-end talker background noise information is transmitted), a constant is added stepwise to the speech code sequence received in the final receiving period of the speech burst (the speech spectrum coefficients held in the speech spectrum coefficient buffer 5) to update the speech code sequence at fixed interpolation intervals (linearly increasing or decreasing the speech code sequence).
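The stepwise update can be sketched as follows: starting from the coefficients held in the buffer 5, a constant per-coefficient increment is added at each fixed interval until the background noise coefficients are reached. The function name, the list-of-floats representation and the `steps` parameter are assumptions made for illustration; the source does not specify how many interpolation intervals are used.

```python
def interpolate_coefficients(last_burst, noise_target, steps):
    """Return `steps` coefficient vectors that move linearly from the
    buffered speech-burst coefficients to the background-noise ones.

    Assumption: both inputs are equal-length lists of floats.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if len(last_burst) != len(noise_target):
        raise ValueError("coefficient vectors must have equal length")
    # Constant added to each coefficient at every interpolation interval.
    deltas = [(t - s) / steps for s, t in zip(last_burst, noise_target)]
    return [
        [s + d * k for s, d in zip(last_burst, deltas)]
        for k in range(1, steps + 1)
    ]
```

For example, `interpolate_coefficients([0.0, 4.0], [2.0, 0.0], 4)` produces four vectors, the last of which equals the target `[2.0, 0.0]`. Because the increment and the interval are both fixed, every transition follows the same straight-line trajectory.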
Using the far-end talker background noise information (speech spectrum coefficients) passing through the linear interpolation, the synthesis filter 4 reproduces the speech signal so that the speech output circuit 7 supplies the speech signal to the output terminal 8.
With the foregoing arrangement, the conventional speech decoding unit linearly interpolates the background noise information when the speech pause is detected, so as to vary the speech signal gradually. However, since the interpolation interval of the far-end talker background noise information is fixed at every frame interval, this presents a problem in that a near-end listener feels variations in the reproduced background noise to be monotonous and uncomfortable.
The present invention is implemented to solve the foregoing problem. Therefore, an object of the present invention is to provide a speech decoding unit and a speech decoding method capable of reproducing background noise with little uncomfortable feeling to the near-end listener.
SUMMARY OF THE INVENTION
The speech decoding unit in accordance with the present invention estimates coding parameters of a speech pause by carrying out a smoothing algorithm using coding parameters constituting far-end talker background noise information extracted by an extracting means and coding parameters that are used for synthesizing previous background noise.
This offers an advantage of being able to reproduce background noise with little uncomfortable feeling.
The speech decoding unit in accordance with the present invention can comprise an estimating means for estimating the coding parameters of the speech pause by substituting, into a prescribed equation, the coding parameters that are the far-end talker background noise information and the coding parameters that are used for synthesizing the previous background noise.
This offers an advantage of being able to carry out the smoothing algorithm of the coding parameters quickly.
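The prescribed equation itself is not reproduced in this excerpt. Purely as an illustration of the idea of combining the received background noise parameters with those used for the previous background noise, the sketch below uses a first-order recursive (exponential) smoother. That choice of equation, the weight `alpha` and the function name are assumptions and should not be read as the patent's actual formula.

```python
def smooth_parameters(previous, received, alpha=0.25):
    """Estimate pause parameters as a weighted mix of the parameters used
    for the previous background noise and the newly received ones.

    Assumption: a first-order recursive smoother stands in for the
    "prescribed equation"; `alpha` is a hypothetical weight in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(previous) != len(received):
        raise ValueError("parameter vectors must have equal length")
    return [(1.0 - alpha) * p + alpha * r for p, r in zip(previous, received)]
```

Feeding each result back in as `previous` for the next frame makes the estimate approach the received parameters gradually rather than in a single step.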
Matsuoka Bunkei
Tasaki Hirohisa
McFadden Susan
Mitsubishi Denki Kabushiki Kaisha
Rothwell Figg Ernst & Manbeck