Subframe-based correlation

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

active

06470309

ABSTRACT:

TECHNICAL FIELD OF THE INVENTION
This invention relates to a method of correlating portions of an input signal, such as is used for pitch estimation and voicing.
BACKGROUND OF THE INVENTION
The problem of reliable estimation of pitch and voicing has been a critical issue in speech coding for many years. Pitch estimation is used, for example, in both Code-Excited Linear Predictive (CELP) coders and Mixed Excitation Linear Predictive (MELP) coders. The pitch is how fast the glottis is vibrating. The pitch period is the duration of one cycle of the resulting repeating waveform, and the pitch is the number of these cycles over a given time period. In the digital environment the analog signal is sampled, so the pitch period is expressed as T samples.
In the MELP coder, artificial pulses are used to produce synthesized speech, and the pitch is determined so that the speech sounds right. The CELP coder also uses the estimated pitch; it quantizes the difference between the periods. The MELP coder uses a synthetic excitation signal to make synthetic speech, which is a mix of pulses for the pulse part of speech and noise for the unvoiced part of speech.
The voicing analysis determines how much of the excitation is pulse and how much is noise. The degree of voicing correlation is also used to do this: we break the signal into frequency bands, and in each frequency band we use the correlation at the pitch value as a measure of how voiced that band is. The correlation is evaluated for all possible lags or delays, where a lag of T compares the signal with itself T samples back, and one looks for the highest correlation value.
Correlation strength is a function of pitch lag. We search that function to find the best lag. At each lag we get a correlation strength, which is a measure of how well the model fits.
Once we find the best lag, we have the pitch, and the correlation strength at that lag is used for voicing.
For pitch we compute the correlation of the input against itself:
$$C(T) = \sum_{n=0}^{N-1} x_n \, x_{n-T}$$
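The lag search described above can be sketched as follows. This is a minimal illustration, assuming a plain Python list of samples and a synthetic sine wave as input; the function names `autocorrelation` and `best_lag`, the lag range, and the window placement are hypothetical choices, not taken from the patent.

```python
import math

def autocorrelation(x, start, length, lag):
    """C(T) = sum of x[n] * x[n - T] for n = start .. start + length - 1."""
    return sum(x[n] * x[n - lag] for n in range(start, start + length))

def best_lag(x, start, length, min_lag, max_lag):
    """Evaluate C(T) for every lag in [min_lag, max_lag] and return
    (lag, C(lag)) for the lag with the highest correlation."""
    if start < max_lag:
        raise ValueError("need at least max_lag samples of history before start")
    scores = {lag: autocorrelation(x, start, length, lag)
              for lag in range(min_lag, max_lag + 1)}
    lag = max(scores, key=scores.get)
    return lag, scores[lag]

# Assumed test input: a sine wave with a period of 40 samples.
PERIOD = 40
signal = [math.sin(2 * math.pi * n / PERIOD) for n in range(400)]

# Analyze 160 samples starting at n = 200, searching lags 20..70.
lag, strength = best_lag(signal, start=200, length=160, min_lag=20, max_lag=70)
print(lag, strength)
```

The lag with the highest correlation is taken as the pitch period in samples, and the correlation value at that lag is the strength that the text says is used for voicing.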
In the prior art this correlation is computed on a whole-frame basis to get the best predicted value, or minimum prediction error, over the frame. The error is
$$E = \sum_{n} \left( x_n - \hat{x}_n \right)^2$$
where the predicted value $\hat{x}_n = g\,x_{n-T}$ is a version of the signal delayed by T samples, and g is a scale factor which is also referred to as the pitch prediction coefficient. This gives
$$E = \sum_{n} \left( x_n - g\,x_{n-T} \right)^2$$
and one tries to vary the time delay T to find the optimum delay or lag.
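For a fixed lag T, the best scale factor follows from setting the derivative of E with respect to g to zero. The step below is the standard least-squares result, added here as a worked illustration and not quoted from the patent text; $g^{*}$ and $E_{\min}$ are notation introduced for this sketch.

$$\frac{\partial E}{\partial g} = -2 \sum_{n} x_{n-T} \left( x_n - g\,x_{n-T} \right) = 0 \quad\Longrightarrow\quad g^{*} = \frac{\sum_{n} x_n x_{n-T}}{\sum_{n} x_{n-T}^{2}}$$

$$E_{\min}(T) = \sum_{n} x_n^{2} - \frac{\left( \sum_{n} x_n x_{n-T} \right)^{2}}{\sum_{n} x_{n-T}^{2}}$$

Since the first term does not depend on T, minimizing the prediction error over T amounts to maximizing the squared correlation divided by the energy of the delayed signal.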
The prior art assumes that g and T are constant over the whole frame.
In practice, however, g and T are known to vary within a frame.
SUMMARY OF THE INVENTION
In accordance with one embodiment of the present invention, a subframe-based correlation method for pitch and voicing is provided by finding the pitch track through a speech frame that minimizes the pitch-prediction residual energy over the frame, assuming that the optimal pitch prediction coefficient will be used for each subframe lag.
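One way to picture such a search is sketched below. It is only an illustration under stated assumptions, not the patented procedure: the frame is split into equal subframes, each subframe lag may deviate from a common center lag by at most `max_dev` samples, and only positive prediction coefficients are considered. These constraints, and the names `subframe_gain` and `subframe_pitch_track`, are hypothetical. The sketch relies on the standard least-squares result that, with the optimal coefficient, prediction at a given lag removes energy c²/e from a subframe, where c is the correlation and e the energy of the delayed samples.

```python
import math

def subframe_gain(x, start, length, lag):
    """Energy removed from one subframe by pitch prediction at `lag`
    when the least-squares coefficient g = c / e is used: c**2 / e."""
    c = sum(x[n] * x[n - lag] for n in range(start, start + length))
    e = sum(x[n - lag] ** 2 for n in range(start, start + length))
    if e <= 0.0 or c <= 0.0:  # assumption: only positive g is allowed
        return 0.0
    return c * c / e

def subframe_pitch_track(x, start, frame_len, n_sub, min_lag, max_lag, max_dev=2):
    """Return (track, residual): one lag per subframe and the residual energy
    left in the frame. The max_dev window around a common center lag is an
    assumed constraint for this sketch."""
    if start < max_lag:
        raise ValueError("need at least max_lag samples of history before start")
    sub_len = frame_len // n_sub
    sub_starts = [start + s * sub_len for s in range(n_sub)]
    # Precompute the gain of every (subframe, lag) pair once.
    gains = [[subframe_gain(x, s0, sub_len, lag)
              for lag in range(min_lag, max_lag + 1)]
             for s0 in sub_starts]
    best_total, best_track = -1.0, None
    for center in range(min_lag, max_lag + 1):
        lo = max(min_lag, center - max_dev)
        hi = min(max_lag, center + max_dev)
        total, track = 0.0, []
        for row in gains:
            # Best lag for this subframe within the allowed window.
            lag = max(range(lo, hi + 1), key=lambda t: row[t - min_lag])
            track.append(lag)
            total += row[lag - min_lag]
        if total > best_total:
            best_total, best_track = total, track
    frame_energy = sum(x[n] ** 2 for n in range(start, start + n_sub * sub_len))
    return best_track, frame_energy - best_total

# Assumed test input: a sine wave with a constant period of 40 samples.
PERIOD = 40
signal = [math.sin(2 * math.pi * n / PERIOD) for n in range(400)]
track, residual = subframe_pitch_track(signal, start=200, frame_len=160,
                                       n_sub=4, min_lag=20, max_lag=70)
print(track, residual)
```

Because the total residual is the frame energy minus the sum of the per-subframe gains, minimizing the residual over the frame is the same as maximizing the summed gains along the track. On a perfectly periodic input every subframe selects the true period and the residual is essentially zero; the benefit of a per-subframe lag and coefficient would only show on signals whose pitch or amplitude changes within the frame.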


