Reexamination Certificate
2002-03-22
2004-11-02
To, Doris H. (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
active
06813602
ABSTRACT:
CD-ROM COMPUTER PROGRAM LISTING APPENDIX
A CD-ROM appendix is included in this disclosure. Specifically, Appendix B is a plurality of tables utilized by the computer source code listing. The CD-ROM is submitted at the same time as this preliminary amendment, and is hereby incorporated by reference. The only file on the CD-ROM is entitled, “10932-43 CD-ROM Appendix.” The file size is 790 KB and the file was created on Nov. 27, 2001. The machine format is IBM-PC and the operating system used to create the file is MS-Windows.
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates generally to speech encoding and decoding in voice communication systems; and, more particularly, it relates to various techniques used with code-excited linear prediction coding to obtain high quality speech reproduction through a limited bit rate communication channel.
2. Related Art
Signal modeling and parameter estimation play significant roles in communicating voice information with limited bandwidth constraints. To model basic speech sounds, speech signals are sampled as a discrete waveform to be digitally processed. In one type of signal coding technique called LPC (linear predictive coding), the signal value at any particular time index is modeled as a linear function of previous values. A subsequent signal is thus linearly predictable according to an earlier value. As a result, efficient signal representations can be determined by estimating and applying certain prediction parameters to represent the signal.
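The linear-prediction idea in the paragraph above can be illustrated with a short Python sketch. It is only a minimal illustration, not the encoder described in this disclosure: the function names `predict` and `residual`, and the choice to treat samples before the start of the signal as zero, are assumptions made for the example.

```python
def predict(samples, coeffs):
    """Predict each sample as a weighted sum of the preceding samples.

    coeffs[k-1] weights the sample k steps in the past. Samples before
    the start of the signal are treated as zero (an assumption of this
    sketch).
    """
    predicted = []
    for n in range(len(samples)):
        acc = 0.0
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a * samples[n - k]
        predicted.append(acc)
    return predicted


def residual(samples, coeffs):
    """Return the prediction error, i.e. what the predictor cannot explain."""
    return [s - p for s, p in zip(samples, predict(samples, coeffs))]


# A signal in which every sample is half of the previous one is fully
# described by a single prediction coefficient of 0.5.
signal = [1.0, 0.5, 0.25, 0.125]
error = residual(signal, [0.5])
```

When the prediction coefficients fit the signal well, the residual carries little energy, which is why a few parameters plus a compact description of the residual can stand in for the waveform itself.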
Applying LPC techniques, a conventional source encoder operates on speech signals to extract modeling and parameter information for communication to a conventional source decoder via a communication channel. Once received, the decoder attempts to reconstruct a counterpart signal for playback that sounds to a human ear like the original speech.
A certain amount of communication channel bandwidth is required to communicate the modeling and parameter information to the decoder. In embodiments where, for example, the channel bandwidth is shared and real-time reconstruction is necessary, a reduction in the required bandwidth proves beneficial. However, using conventional modeling techniques, the quality requirements in the reproduced speech limit the reduction of such bandwidth below certain levels.
Speech encoding becomes increasingly difficult as transmission bit rates decrease. Particularly for noise encoding, perceptual quality diminishes significantly at lower bit rates. Straightforward code-excited linear prediction (CELP) is used in many speech codecs, and it can be a very effective method of encoding speech at relatively high transmission rates. However, even this method may fail to provide perceptually accurate signal reproduction at lower bit rates. One reason is that the pulse-like excitation for noise signals becomes more sparse at these lower bit rates, as fewer bits are available for coding and transmission, thereby resulting in annoying distortion of the noise signal upon reproduction.
Many communication systems operate at bit rates that vary with any number of factors including total traffic on the communication system. For such variable rate communication systems, the inability to detect low bit rates and to handle the coding of noise at those lower bit rates in an effective manner often can result in perceptually inaccurate reproduction of the speech signal. This inaccurate reproduction could be avoided if a more effective method for encoding noise at those low bit rates were identified.
Additionally, the inability to determine the optimal encoding mode for a given noise signal at a given bit rate also results in an inefficient use of encoding resources. For a given speech signal having a particular noise component, the ability to selectively apply an optimal coding scheme at a given bit rate would provide more efficient use of an encoder processing circuit. Moreover, the ability to select the optimal encoding mode for a given type of noise signal would further maximize the available encoding resources while providing a more perceptually accurate reproduction of the noise signal.
SUMMARY OF THE INVENTION
A random codebook is implemented utilizing overlap in order to reduce storage space. This arrangement necessitates reference to a table or other index that lists the energies for each codebook vector. Accordingly, the table or other index, and the respective energy values, must be stored, thereby adding computational and storage complexity to such a system.
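To make the storage trade-off in the paragraph above concrete, the following Python sketch shows one way an overlapped codebook might be laid out. It is an illustrative assumption rather than the arrangement used in any particular codec: the names `overlapped_codevectors` and `energy_table`, the shift of one sample, and the sample values are all hypothetical.

```python
def overlapped_codevectors(sequence, dim, shift):
    """Slice codevectors of length `dim` out of one shared sequence.

    Consecutive codevectors start `shift` samples apart, so they overlap
    and share storage.
    """
    last_start = len(sequence) - dim
    return [sequence[s:s + dim] for s in range(0, last_start + 1, shift)]


def energy_table(codevectors):
    """Energy (sum of squares) of each codevector, stored alongside the codebook."""
    return [sum(c * c for c in v) for v in codevectors]


shared = [1.0, 2.0, 3.0, 4.0, 5.0]
vectors = overlapped_codevectors(shared, dim=3, shift=1)
energies = energy_table(vectors)
```

Because each window covers a different set of samples, the energies generally differ from one codevector to the next, which is why a separate energy table has to be kept in such a scheme.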
The present invention re-uses each table codevector entry in a random table with “L” codevectors, each of dimension “N.” That is, for example, an exemplary codebook contains codevectors V_0, V_1, . . . , V_L, with each codevector V_x being of dimension N and having elements C_0, C_1, . . . , C_{N-1}, C_N. Each codevector of dimension N is normalized to an energy value of unity, thereby reducing computational complexity to a minimum.
Each codebook entry essentially acts as a circular buffer whereby N different random codebook vectors are generated by specifying a starting point at each different element in a given codevector. In one embodiment, each of the different N codevectors then has unity energy.
The dimension of each table entry is identical to the dimension of the required random codevector and every element in a particular table entry will be in any codevector derived from this table entry. This arrangement dramatically reduces the necessary storage capacity of a given system, while maintaining minimal computational complexity.
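As a rough illustration of the circular-buffer idea described above, the following Python sketch derives N codevectors from a single table entry by rotating the starting point. The function names, the Gaussian random fill, and the table sizes are assumptions made for this illustration; they are not taken from the computer source code listing of this disclosure.

```python
import math
import random


def normalize(entry):
    """Scale a table entry so that its energy (sum of squares) is one."""
    energy = sum(c * c for c in entry)
    scale = 1.0 / math.sqrt(energy)
    return [c * scale for c in entry]


def build_table(num_entries, dim, seed=0):
    """Build a hypothetical random table of unit-energy entries."""
    rng = random.Random(seed)
    return [
        normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(num_entries)
    ]


def circular_codevector(entry, start):
    """Read the entry as a circular buffer, beginning at index `start`."""
    n = len(entry)
    return [entry[(start + i) % n] for i in range(n)]


# Every rotation of an entry is a distinct candidate codevector.
table = build_table(num_entries=4, dim=8, seed=1)
candidates = [
    circular_codevector(entry, start)
    for entry in table
    for start in range(len(entry))
]
```

A rotation only reorders the elements of an entry, so every derived codevector has the same energy as the entry it came from. In this sketch, 4 stored entries of dimension 8 yield 32 candidate codevectors without any per-vector energy table.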
Other aspects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.
Farjami & Farjami LLP
Mindspeed Technologies Inc.
Opsasnick Michael N.
To Doris H.