Bi-directional pitch enhancement in speech coding systems

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S219000

active

06704701

ABSTRACT:

BACKGROUND
1. Technical Field
The present invention relates generally to speech coding; and, more particularly, it relates to low bit rate speech coding systems that employ pitch enhancement to improve the perceptual quality of reproduced speech.
2. Description of Related Art
Conventional code-excited linear prediction (CELP) speech coding systems typically employ only forward pitch enhancement. This is largely because conventional speech codecs, having relatively large bandwidth availability, use sub-frame sizes for which forward pitch enhancement alone provides sufficient perceptual quality. However, at the lower bit rates imposed by various communication media employed in speech coding systems, the reproduced speech, after synthesis, fails to maintain a high perceptual quality.
For conventional speech coding systems that operate at these decreased bit rates, the pitch lag that is generated during pitch prediction is commonly much shorter than the overall sub-frame size, i.e., it covers a relatively small portion of the overall sub-frame. This characteristic is accentuated for speakers having a higher pitch (and therefore a shorter pitch lag), such as females and children. Traditional excitation codebook structures do not afford a sufficiently high perceptual quality when operating at low bit rates. This is primarily because the periodicity of the voiced signal is not sufficiently established, or the excitation vector extracted from the codebook is insufficiently rich to generate a synthesized speech signal having a high perceptual quality.
As the sub-frame size of speech coding systems becomes larger, as is commonly associated with communication systems that have decreasing bit rates, performing pitch enhancement in only the forward direction results in significantly poorer perceptual quality. This is due, among other reasons, to the significant amount of dead space left in the sub-frame by the absence of many pulses. In conventional speech coding systems that operate at higher bit rates, and consequently have shorter sub-frames, this effect is typically not audible to the human ear. This loss of perceptual quality occurs in nearly all speech coding systems that have relatively low available bit rates.
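The dead-space effect described above can be illustrated with a minimal Python sketch. It assumes a sparse pulse excitation and models forward pitch enhancement as repeating each pulse at later multiples of the pitch lag with a decaying gain. The function names, the gain value, and the sub-frame sizes are hypothetical choices for illustration and are not taken from the patent.

```python
def forward_pitch_enhance(pulses, lag, gain, subframe_size):
    """Repeat each pulse forward in time at multiples of the pitch lag.

    pulses: dict mapping sample position -> amplitude (assumed representation).
    The k-th forward copy is scaled by gain**k (assumed decay rule).
    """
    if lag <= 0:
        raise ValueError("lag must be positive")
    excitation = [0.0] * subframe_size
    for pos, amp in pulses.items():
        k = 0
        while pos + k * lag < subframe_size:
            excitation[pos + k * lag] += amp * gain ** k
            k += 1
    return excitation


def leading_dead_space(excitation):
    """Number of zero samples before the first non-zero sample."""
    for index, value in enumerate(excitation):
        if value != 0.0:
            return index
    return len(excitation)


# One pulse late in a long (80-sample) sub-frame: nothing is placed before it.
long_subframe = forward_pitch_enhance({45: 1.0}, lag=20, gain=0.5, subframe_size=80)
# One pulse in a short (40-sample) sub-frame: the empty stretch is much shorter.
short_subframe = forward_pitch_enhance({5: 1.0}, lag=20, gain=0.5, subframe_size=40)

print(leading_dead_space(long_subframe), leading_dead_space(short_subframe))
```

In this toy setup the long sub-frame stays empty for every sample ahead of the pulse, because forward-only enhancement can only add copies after it.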
Further limitations and disadvantages of conventional and traditional systems will become apparent to one of skill in the art through comparison of such systems with the present invention as set forth in the remainder of the present application with reference to the drawings.
SUMMARY OF THE INVENTION
Various aspects of the present invention can be found in a speech coding system that employs forward pitch enhancement and backward pitch enhancement. In certain embodiments of the invention, the forward pitch enhancement and the backward pitch enhancement are performed in a single portion of the entire speech coding system. For example, in speech coding systems having a speech codec, wherein the speech codec contains an encoder and a decoder, the forward pitch enhancement and the backward pitch enhancement are performed in both the encoder and the decoder of the speech codec. Alternatively, in other embodiments of the invention, the forward pitch enhancement and the backward pitch enhancement are performed only in the decoder of the speech codec. As determined by the specific application, the forward pitch enhancement and the backward pitch enhancement are performed in a distributed manner, each being performed, at least in part, in each one of the encoder and the decoder of the speech codec.
In certain embodiments of the invention, the backward pitch enhancement is generated using the forward pitch enhancement itself. The backward pitch enhancement is a mirror image of the forward pitch enhancement that is previously generated; the backward pitch enhancement is generated dependent on the forward pitch enhancement. Alternatively, in other embodiments of the invention, the backward pitch enhancement is generated independent of the forward pitch enhancement; the backward pitch enhancement is generated irrespective of the forward pitch enhancement that has previously been generated.
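The mirror-image embodiment can be sketched in Python as follows. This is only an illustration under stated assumptions: the excitation is a sparse set of pulses, each forward copy at a later multiple of the pitch lag is matched by a backward copy at the same distance before the pulse, and both use the same decaying gain. The name `bidirectional_pitch_enhance`, the gain rule, and the example values are hypothetical and are not specified in the text above.

```python
def bidirectional_pitch_enhance(pulses, lag, gain, subframe_size, backward=True):
    """Add pitch-lag-spaced copies of each pulse after it and, optionally, before it.

    pulses: dict mapping sample position -> amplitude (assumed representation).
    The k-th copy in either direction is scaled by gain**k (assumed decay rule),
    so the backward copies mirror the forward copies about the original pulse.
    """
    if lag <= 0:
        raise ValueError("lag must be positive")
    excitation = [0.0] * subframe_size
    for pos, amp in pulses.items():
        excitation[pos] += amp  # the original pulse, counted once
        k = 1
        while pos + k * lag < subframe_size or (backward and pos - k * lag >= 0):
            scaled = amp * gain ** k
            if pos + k * lag < subframe_size:
                excitation[pos + k * lag] += scaled  # forward copy
            if backward and pos - k * lag >= 0:
                excitation[pos - k * lag] += scaled  # mirrored backward copy
            k += 1
    return excitation


forward_only = bidirectional_pitch_enhance({45: 1.0}, 20, 0.5, 80, backward=False)
both_ways = bidirectional_pitch_enhance({45: 1.0}, 20, 0.5, 80)

# Count the non-zero samples produced by each variant.
print(sum(v != 0.0 for v in forward_only), sum(v != 0.0 for v in both_ways))
```

With a single pulse at sample 45 and a lag of 20, the backward pass fills positions 25 and 5, which forward-only enhancement leaves empty. An independently generated backward enhancement, as in the alternative embodiment, would instead use its own gain or placement rule rather than mirroring the forward copies.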
The speech coding system, built in accordance with the present invention, is well suited to speech coding systems that operate using communication media having limited or constrained bandwidth availability. Any communication media may be employed within the invention, without departing from the scope and spirit thereof. Examples of such communication media include, but are not limited to, wireless communication media, wire-based telephonic communication media, fiber-optic communication media, and Ethernet.


REFERENCES:
patent: 5528727 (1996-06-01), Wang
patent: 5774837 (1998-06-01), Yeldener et al.
patent: 5890108 (1999-03-01), Yeldener
patent: 5899967 (1999-05-01), Nagasaki
patent: 6161086 (2000-12-01), Mukherjee et al.
patent: 6240386 (2001-05-01), Thyssen et al.
patent: 6385576 (2002-05-01), Amada et al.
patent: 6556966 (2003-04-01), Gao
patent: 6574593 (2003-06-01), Gao et al.
patent: 6581032 (2003-06-01), Gao et al.
patent: 6604070 (2003-08-01), Gao et al.
Yang et al., “Voiced speech coding at very low bit rates based on forward-backward waveform prediction,” IEEE Transactions on Speech and Audio Processing, Jan. 1995, vol. 3, pp. 40 to 47.*
Pettigrew et al., “Backward pitch prediction for low-delay speech coding,” IEEE Global Telecommunications Conference, 1989, and Exhibition. Communications Technology for the 1990s and Beyond, Nov. 1989, vol. 2, pp. 1247 to 1252.*
V. Cuperman, “Low delay speech coding,” 1991 Conference Record of the Twenty-Fifth Asilomar Conference on Signals, Systems and Computers, Nov. 1991, vol. 2, pp. 935 to 939.*
International Telecommunication Union (Telecommunication Standardization Sector of ITU), “General Aspects of Digital Transmission System. Coding of Speech at 8 kbit/s Using Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP),” ITU-T Recommendation G.729, pp. 1-35, 1996.
