Reexamination Certificate
1998-10-14
2001-11-06
Korzuch, William (Department: 2741)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S228000, C704S226000
active
06314395
ABSTRACT:
CROSS-REFERENCE TO RELATED APPLICATION
This application claims the priority benefit of Taiwan application serial no. 86115188, filed Oct. 16, 1997, the full disclosure of which is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to voice signal processing techniques, and more particularly, to a voice detection method and apparatus which can detect whether a received signal is a voice signal or background noise. In the invention, the voice detection does not need to perform multiplications or divisions, so that the hardware complexity and the cost of implementation can be significantly reduced.
2. Description of Related Art
Voice detection is a signal processing technique used to determine whether a received signal is a voice signal or background noise and, if a voice signal is detected, to determine the begin point and the end point of the voice signal. One conventional method to achieve this purpose is to compare the mean and standard deviation of the energy of the received signal, and also its zero-crossing rate, with preset values. The comparison result then indicates whether the received signal is a voice signal or background noise and, if it is a voice signal, where the voice signal begins and ends.
Fundamentally, the energy of a voice signal can be obtained from the following equation:
$$E(i) = \mathrm{SQRT}\left\{ \left[ \sum_{n=0}^{M-1} X(n) \times X(n) \right] \div M \right\} \qquad \text{(A1)}$$
where
E(i) is the energy of the (i)th frame of the digitized voice signal;
SQRT is a square-root operator;
M is the total number of sampling points in each frame; and
X(n) is the digitized data from the (n)th sampling point in the (i)th frame.
The foregoing equation is costly to implement in hardware, since each frame requires M multiplications, a division, and a square root. The following less complex equation can be used instead to compute E(i):
$$E(i) = \left[ \sum_{n=0}^{M-1} \left| X(n) \right| \right] \div M \qquad \text{(A2)}$$
Therefore, it requires M−1 additions and one division to perform the operation of Eq. (A2) to obtain the value of E(i). In the case of using a sampling frequency of 8 kHz (sampling period = 0.125 ms) to digitize the voice signal into an 8-bit digital signal, M = 160 for a frame length of 20 ms, which requires 159 additions and one division to obtain the value of E(i). The hardware needed to perform this operation is therefore quite complex. Moreover, in order to prevent overflow, an accumulator of a large bit length must be used. This further increases the complexity of the hardware needed to implement the conventional voice detection method.
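The two energy measures and the frame-size arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the disclosed apparatus: the function names are hypothetical, and the accumulator estimate assumes signed 8-bit samples whose largest absolute value is 128.

```python
import math

def frame_energy_rms(frame):
    """Eq. (A1): square root of the mean of the squared samples."""
    m = len(frame)
    return math.sqrt(sum(x * x for x in frame) / m)

def frame_energy_abs(frame):
    """Eq. (A2): mean of the absolute sample values."""
    m = len(frame)
    return sum(abs(x) for x in frame) / m

# Frame size from the text: 8 kHz sampling, 20 ms frames.
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
M = SAMPLE_RATE_HZ * FRAME_MS // 1000

# Assumption: signed 8-bit samples, so |X(n)| is at most 128.
# The worst-case sum in Eq. (A2) then sets the accumulator width.
worst_case_sum = M * 128
accumulator_bits = worst_case_sum.bit_length()

print(M, worst_case_sum, accumulator_bits)
```

Under these assumptions the sum in Eq. (A2) no longer fits in an 8-bit register, which is the overflow concern the text raises.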
To make voice detection products more competitive on the market, the manufacturing cost should be kept low. One conventional voice detection method and apparatus utilizes an accumulator of a large bit length and a preemphasis circuit that involves multiplication operations. This voice detection apparatus is therefore quite complex in hardware architecture and thus high in manufacturing cost. Another conventional voice detection method and apparatus utilizes a cascaded series of registers to implement the large bit-length accumulator. One drawback to this scheme, however, is that it degrades system performance and throughput and increases the complexity of programming. There exists, therefore, a need for a new voice detection method and apparatus which can be implemented with less complex hardware circuitry.
SUMMARY OF THE INVENTION
It is therefore an objective of the present invention to provide a voice detection method and apparatus which performs no complex multiplications and divisions and uses 8-bit registers, but can nonetheless provide good voice detection results and prevent overflow of data during computation.
It is another objective of the present invention to provide a voice detection method and apparatus which is less complex in hardware architecture compared to the prior art, so that manufacturing cost can be reduced.
It is still another objective of the present invention to provide a voice detection method and apparatus which allows easy refreshing of the preset threshold of background noise.
In accordance with the foregoing and other objectives of the present invention, a voice detection method and apparatus is provided. The voice detection method and apparatus is used in particular to detect whether a received analog signal is a voice signal.
In the voice detection method of the invention, the initial steps are to digitize the received analog signal, and then to preemphasize the resulting digital signal so as to intensify the high-frequency components of the voice signal that can be attenuated during transmission through the air. A preemphasized digital signal is thus obtained, which is then divided into a plurality of frames, each frame containing a specific number of sampling points of data.
The subsequent steps are to count the total number of occurrences of each of the absolute discrete amplitude levels in each of the frames of the preemphasized digital signal, and then to find the majority magnitude of each of the frames.
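The counting step can be sketched as a histogram over absolute amplitude levels. The sketch below assumes that the majority magnitude is the absolute level that occurs most often in a frame, that samples are signed 8-bit values, and that ties go to the lowest level. The function name and the tie rule are illustrative choices, not details taken from the text.

```python
def majority_magnitude(frame, max_level=128):
    """Return the most frequent absolute amplitude level in a frame.

    Assumptions: samples are signed 8-bit integers (|x| <= 128) and
    ties are resolved in favour of the lowest level.
    """
    counts = [0] * (max_level + 1)   # one counter per absolute level
    for x in frame:
        counts[abs(x)] += 1          # counting only, no multiply or divide
    best = 0
    for level in range(1, max_level + 1):
        if counts[level] > counts[best]:
            best = level
    return best

print(majority_magnitude([1, -1, 2, 3, -1]))
```

With 160 samples per frame, each counter holds at most 160, so every count fits in an 8-bit register.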
Subsequently, the majority magnitude of each of the frames is compared with a preset threshold of background noise in the following manner. If a predetermined number of consecutive frames are all greater in majority magnitude than the threshold of background noise, then a begin/end signal is switched to an enable state. Otherwise, the begin/end signal is maintained at a disable state.
If the predetermined number of consecutive frames are not all greater in majority magnitude than the threshold of background noise, then a threshold refreshing procedure is performed. Otherwise, after the begin/end signal is switched to the enable state, the subsequent steps are to pause for a period of a specific number of frames, and then to compare the majority magnitude of each of the subsequently received frames with the preset threshold of background noise in the following manner. If a predetermined number of consecutive frames are not all greater in majority magnitude than the threshold of background noise, then the begin/end signal is switched to the disable state. Otherwise, the begin/end signal is maintained at the enable state.
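One possible reading of the begin/end logic is sketched below as a small state machine. It is an interpretation, not the disclosed circuit: frames are examined in non-overlapping blocks, the values of `n_consecutive` and `pause_frames` are hypothetical, and the threshold refreshing procedure is only marked by a comment.

```python
def detect(magnitudes, threshold, n_consecutive=3, pause_frames=2):
    """Return the begin/end signal (True = enable) after each frame.

    Assumptions: frames are grouped into non-overlapping blocks of
    n_consecutive frames, and the threshold stays fixed.
    """
    enabled = False
    block = []        # comparison results for the block being examined
    pause_left = 0    # frames still to skip after switching to enable
    signal = []
    for magnitude in magnitudes:
        if pause_left > 0:
            pause_left -= 1
        else:
            block.append(magnitude > threshold)
            if len(block) == n_consecutive:
                all_above = all(block)
                block = []
                if not enabled and all_above:
                    enabled = True             # begin point detected
                    pause_left = pause_frames
                elif enabled and not all_above:
                    enabled = False            # end point detected
                # When disabled and not all_above, the threshold
                # refreshing procedure would run here (omitted).
        signal.append(enabled)
    return signal

frames = [1, 1, 1, 9, 9, 9, 9, 9, 9, 9, 9, 1, 9, 9, 1, 1, 1]
print(detect(frames, threshold=4))
```

In this run the signal switches to enable at the sixth frame and back to disable at the fourteenth. A sliding window instead of fixed blocks would react sooner to speech that starts in the middle of a block, at the cost of slightly more bookkeeping.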
The above-described voice detection method can be used for detecting the begin point and end point of a voice signal, and it needs none of the complex multiplications and divisions that the prior art requires to perform the computations for the voice detection.
According to the above-described voice detection method, the high-frequency and low-amplitude components of the voice signal can be preemphasized, so as to prevent the loss of fidelity of the voice signal. The preemphasized signal is then processed by the majority-magnitude detecting circuit to obtain the majority magnitude of each of the frames in the voice signal. This allows the overall voice detection method to be reduced in hardware complexity.
In the foregoing method, the preemphasizing is performed in accordance with the equation:
$$y(n) = x(n) - \alpha \cdot x(n-1)$$
where y(n) is the (n)th output preemphasized digital signal; x(n) is the sampled digital data from the (n)th sampling point; and α is a predetermined preemphasis factor.
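The preemphasis filter can be sketched as a first-order difference. The value α = 15/16 and the initial condition x(−1) = 0 are assumptions made for illustration, since the text does not specify them. The second function shows one way such a filter might avoid a multiplier, by choosing α = 1 − 2^(−shift) and using a shift and a subtraction; this is a common technique and is not claimed to be the circuit of the invention.

```python
def preemphasize(samples, alpha=0.9375, previous=0):
    """y(n) = x(n) - alpha * x(n-1), with x(-1) assumed to be `previous`."""
    out = []
    for x in samples:
        out.append(x - alpha * previous)
        previous = x
    return out

def preemphasize_shift(samples, shift=4, previous=0):
    """Integer variant with alpha = 1 - 2**-shift, using shift and subtract only.

    The right shift rounds toward negative infinity, so the result is an
    approximation whenever x(n-1) is not a multiple of 2**shift.
    """
    out = []
    for x in samples:
        out.append(x - (previous - (previous >> shift)))
        previous = x
    return out

print(preemphasize([16, 16, 0]))
print(preemphasize_shift([16, 16, 0]))
```

A constant input is strongly attenuated by the filter (16 becomes 1 at the second sample), while the step down to 0 is passed at nearly full size, which is the high-frequency emphasis the text describes.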
Further, the threshold refreshing procedure is performed in accordance with the equation to obtain a refreshed new threshold of background noise:
New_Threshold = Old_Threshold + b × (Majority_Magnitude − Old_Threshold)
where
New_Threshold is the refreshed new threshold of background noise; Old_Threshold is the previously set threshold of background noise; Majority_Magnitude is the majority magnitude of the currently received frame; and b is a predetermined constant.
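The refresh equation moves the threshold a fraction b of the way toward the majority magnitude of the current frame. The sketch below uses b = 0.25 and a starting threshold of 8 purely as illustrative values; the text does not give either number.

```python
def refresh_threshold(old_threshold, majority_magnitude, b=0.25):
    """New_Threshold = Old_Threshold + b * (Majority_Magnitude - Old_Threshold)."""
    return old_threshold + b * (majority_magnitude - old_threshold)

# Hypothetical run: the background noise settles at a majority magnitude of 4.
threshold = 8.0
history = []
for magnitude in [4, 4, 4]:
    threshold = refresh_threshold(threshold, magnitude)
    history.append(threshold)

print(history)
```

Each refresh closes a quarter of the remaining gap, so the threshold drifts toward the observed noise level without jumping to it on a single frame. If b is a negative power of two, the product can be formed with a right shift rather than a multiplier.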
The invention further provides a voice detection apparatus for detecting whether a digital signal converted from an analog input is a voice signal. The voice detection apparatus of the invention includes a preemphasis c
Chawan Vijay B
Huang Jiawei
J. C. Patents
Korzuch William
Winbond Electronics Corp.