Speech decoding using mix ratio table

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S208000

active

06377915

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to a speech coding and decoding method for encoding and decoding a speech signal at a low bit rate, and to a speech coding and decoding apparatus capable of encoding and decoding a speech signal at a low bit rate.
The conventionally known low bit rate speech coding systems are 2.4 kbps LPC (i.e., Linear Predictive Coding) and 2.4 kbps MELP (i.e., Mixed Excitation Linear Prediction). Both are speech coding systems in compliance with the United States Federal Standard. The former is already standardized as FS-1015. The latter was selected in 1997 and standardized as a version of FS-1015 with improved sound quality.
The following references relate to at least one of the 2.4 kbps LPC system and the 2.4 kbps MELP system.
[1] FEDERAL STANDARD 1015, “ANALOG TO DIGITAL CONVERSION OF VOICE BY 2,400 BIT/SECOND LINEAR PREDICTIVE CODING,” Nov. 28, 1984
[2] Federal Information Processing Standards publication, “Analog to Digital Conversion of Voice by 2,400 Bit/Second Mixed Excitation Linear Prediction,” May 28, 1998 Draft
[3] L. Supplee, R. Cohn, J. Collura and A. McCree, “MELP: The new federal standard at 2,400 bps,” Proc. ICASSP, pp.1591-1594, 1997
[4] A. McCree and T. Barnwell III, “A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding,” IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 3, No. 4, pp.242-250, July 1995
[5] D. Thomson and D. Prezas, “SELECTIVE MODELING OF THE LPC RESIDUAL DURING UNVOICED FRAMES: WHITE NOISE OR PULSE EXCITATION,” Proc. ICASSP, pp.3087-3090, 1986
[6] Seishi Sasaki and Masayasu Miyake, “Decoder for a Linear Predictive Analysis/synthesis System,” Japanese Patent No. 2,711,737 corresponding to the first Japanese Patent Publication No. 03-123,400 published on May 27, 1991.
First, the principle of the 2.4 kbps LPC system will be explained with reference to FIGS. 18 and 19 (details of the processing can be found in the above reference [1]).
FIG. 18 is a block diagram showing the circuit arrangement of an LPC type speech encoder. A framing unit 11 is a buffer which stores input speech samples a1 that have been band-limited to the frequency range of 100-3,600 Hz, sampled at a frequency of 8 kHz, and then quantized to an accuracy of at least 12 bits. The framing unit 11 fetches the speech samples (180 samples) for every single speech coding frame (22.5 ms), and sends an output b1 to a speech coding processing section.
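The framing figures are consistent with each other: 180 samples at 8 kHz span 22.5 ms. A minimal sketch of the framing step follows, assuming non-overlapping frames and that a trailing partial frame is simply dropped (the text does not say how a partial frame is handled); the function names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000   # sampling frequency stated in the text
FRAME_SAMPLES = 180     # samples per speech coding frame stated in the text


def frame_duration_ms(frame_samples=FRAME_SAMPLES, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one speech coding frame in milliseconds."""
    return 1000.0 * frame_samples / sample_rate_hz


def split_into_frames(samples, frame_samples=FRAME_SAMPLES):
    """Split a sample sequence into consecutive non-overlapping frames.

    Assumption: a trailing partial frame is dropped.
    """
    n_frames = len(samples) // frame_samples
    return [samples[i * frame_samples:(i + 1) * frame_samples]
            for i in range(n_frames)]
```

For example, one second of speech (8,000 samples) yields 44 complete frames under these assumptions.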
Hereinafter, the processing performed for every single speech coding frame will be explained.
A pre-emphasis unit 12 processes the output b1 of the framing unit 11 to emphasize the high-frequency band thereof, and produces a high-frequency band emphasized signal c1. A linear prediction analyzer 13 performs linear predictive analysis on the received high-frequency band emphasized signal c1 by using the Durbin-Levinson method. The linear prediction analyzer 13 outputs a 10th order reflection coefficient d1 which serves as spectral envelope information. A first quantizer 14 applies scalar quantization to the 10th order reflection coefficient d1 for each order. The first quantizer 14 sends the quantization result e1, a total of 41 bits, to an error correction coding/bit packing unit 19. Table 1 shows the bit allocation for the reflection coefficients of respective orders.
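The pre-emphasis and linear prediction steps can be sketched as below. This is an illustrative sketch, not the encoder of reference [1]: the pre-emphasis coefficient `alpha` is an assumed value (the text does not state it), the autocorrelation is computed without windowing, and the sign convention of the reflection coefficients is one of several in common use.

```python
def pre_emphasize(frame, alpha=0.9375):
    """First-order high-frequency emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha is an assumption; the text does not give the coefficient.
    """
    out = []
    prev = 0.0
    for x in frame:
        out.append(x - alpha * prev)
        prev = x
    return out


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no window applied)."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def reflection_coefficients(r, order=10):
    """Levinson-Durbin recursion on autocorrelation values r[0..order].

    Returns the reflection coefficients k_1..k_order for the predictor
    x_hat[n] = sum_j a[j] * x[n-j].
    """
    a = [0.0] * (order + 1)   # predictor coefficients; a[0] is unused
    err = r[0]                # prediction error energy
    ks = []
    for i in range(1, order + 1):
        if err <= 0.0:
            # Degenerate input (e.g. silence): pad with zeros and stop.
            ks.extend([0.0] * (order - i + 1))
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        ks.append(k)
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return ks
```

A typical call chain for one frame would be `reflection_coefficients(autocorrelation(pre_emphasize(frame), 10))`, after which each coefficient is scalar-quantized with the per-order bit counts of Table 1.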
An RMS (i.e., Root Mean Square) calculator 15 calculates an RMS value representing the level information of the high-frequency band emphasized signal c1 and outputs a calculated RMS value f1. A second quantizer 16 quantizes the RMS value f1 to 5 bits, and outputs a quantized result g1 to the error correction coding/bit packing unit 19.
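As a rough illustration of the level path, the sketch below computes the RMS of a frame and maps it to a 5-bit index. The quantizer here is hypothetical: the text only states that 5 bits are used, so the logarithmic spacing and the range limits `lo` and `hi` are assumptions chosen for the example, not the table used by the standard.

```python
import math


def rms(frame):
    """Root mean square of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def quantize_rms_5bit(value, lo=1.0, hi=2048.0):
    """Map an RMS value to one of 32 indices on a logarithmic scale.

    Hypothetical quantizer: lo, hi and the log spacing are assumptions.
    Values outside [lo, hi] are clipped before quantization.
    """
    levels = 32
    clipped = min(max(value, lo), hi)
    step = math.log(hi / lo) / (levels - 1)
    return round(math.log(clipped / lo) / step)
```

A logarithmic scale is a common choice for level parameters because perceived loudness is roughly logarithmic in amplitude, but any 32-level table would satisfy the 5-bit budget.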
A pitch detection/voicing unit 17 receives the output b1 of the framing unit 11 and outputs a pitch period h1 (ranging from 20 to 156 samples, corresponding to 51-400 Hz) and voicing information i1 (i.e., information for discriminating voiced, unvoiced, and transitional periods). A third quantizer 18 quantizes the pitch period h1 and the voicing information i1 to 7 bits, and outputs a quantized result j1 to the error correction coding/bit packing unit 19. The quantization (i.e., allocation of the pitch information and the voicing information to the 7-bit codes, a total of 128 codewords) is performed in the following manner. The codeword having 0 in all of the 7 bits and the seven codewords having 1 in only one of the 7 bits are allocated to the unvoiced state. The codeword having 1 in all of the 7 bits and the seven codewords having 0 in only one of the 7 bits are allocated to the transitional state. The other codewords are used for the voiced state and allocated to the pitch period information.
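The codeword allocation described above amounts to a test on the number of 1 bits in the 7-bit code: zero or one 1 bit means unvoiced, six or seven 1 bits means transitional, and everything else is voiced. A minimal sketch of that classification follows; the function name and the string labels are illustrative, and the mapping from voiced codewords to specific pitch periods is not reproduced because the text does not give it.

```python
def classify_pitch_voicing(codeword):
    """Classify a 7-bit pitch/voicing codeword by its count of 1 bits."""
    if not 0 <= codeword < 128:
        raise ValueError("codeword must fit in 7 bits")
    ones = bin(codeword).count("1")
    if ones <= 1:
        return "unvoiced"       # all zeros, or exactly one 1 bit
    if ones >= 6:
        return "transitional"   # all ones, or exactly one 0 bit
    return "voiced"             # remaining codewords carry the pitch period


def count_codewords():
    """Count how many of the 128 codewords fall in each class."""
    counts = {"unvoiced": 0, "transitional": 0, "voiced": 0}
    for cw in range(128):
        counts[classify_pitch_voicing(cw)] += 1
    return counts
```

Under this reading, 8 codewords are unvoiced, 8 are transitional, and the remaining 112 are available for pitch period values.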
The error correction coding/bit packing unit 19 packs the received information, i.e., the quantization result e1, the quantized result g1, and the quantized result j1, into 54 bits per frame to constitute a speech coding information frame. Thus, the error correction coding/bit packing unit 19 outputs a bit stream k1 consisting of 54 bits per frame. The produced speech information bit stream k1 is transmitted to a receiver via a modulator and a wireless device in the case of radio communications.
Table 1 shows the bit allocation per frame. As understood from this table, the error correction coding/bit packing unit 19 transmits the error correction code (20 bits) when the voicing of the current frame does not indicate the voiced state (i.e., when the voicing of the current frame indicates the unvoiced or transitional period), instead of transmitting the 5th to 10th order reflection coefficients. When the current frame is in the unvoiced or transitional period, the information to be error protected is the upper 4 bits of the RMS information and the 1st to 4th order reflection coefficient information. A sync bit of 1 bit is added to each frame.
TABLE 1
2.4 kbps LPC type Bit Allocation

parameters                             voiced frame   unvoiced frame
reflection coefficient (1st order)           5               5
reflection coefficient (2nd order)           5               5
reflection coefficient (3rd order)           5               5
reflection coefficient (4th order)           5               5
reflection coefficient (5th order)           4               -
reflection coefficient (6th order)           4               -
reflection coefficient (7th order)           4               -
reflection coefficient (8th order)           4               -
reflection coefficient (9th order)           3               -
reflection coefficient (10th order)          2               -
pitch and voicing information                7               7
RMS                                          5               5
error protection                             -              20
sync bit                                     1               1
unused                                       -               1
total bits/22.5 ms frame                    54              54
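The column totals and the resulting bit rate can be checked with a short script. This is only an arithmetic check of the table; the row grouping is for brevity, and the 5-bit RMS entry for the unvoiced column is an assumption inferred from the stated 54-bit total and from the statement that the upper 4 bits of the RMS information are error protected in such frames.

```python
SAMPLE_RATE_HZ = 8000   # from the text
FRAME_SAMPLES = 180     # from the text (22.5 ms per frame)

# (voiced bits, unvoiced bits) per row group of Table 1.
BIT_ALLOCATION = {
    "reflection coefficients, orders 1-4": (20, 20),
    "reflection coefficients, orders 5-8": (16, 0),
    "reflection coefficient, order 9": (3, 0),
    "reflection coefficient, order 10": (2, 0),
    "pitch and voicing information": (7, 7),
    "RMS": (5, 5),   # unvoiced value is an assumption, see lead-in
    "error protection": (0, 20),
    "sync bit": (1, 1),
    "unused": (0, 1),
}


def bits_per_frame():
    """Return (voiced total, unvoiced total) in bits per frame."""
    voiced = sum(v for v, _ in BIT_ALLOCATION.values())
    unvoiced = sum(u for _, u in BIT_ALLOCATION.values())
    return voiced, unvoiced


def bit_rate_bps(bits):
    """Bit rate for a given number of bits per 180-sample frame."""
    return bits * SAMPLE_RATE_HZ / FRAME_SAMPLES
```

Both columns sum to 54 bits, and 54 bits every 22.5 ms gives the 2,400 bit/s rate named in the standard's title. The 21 bits freed by dropping the 5th to 10th order coefficients in non-voiced frames are exactly the 20 error protection bits plus the 1 unused bit.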
Next, a circuit arrangement of an LPC type speech decoder will be explained with reference to FIG. 19.
A bit separating/error correcting decoder 21 receives a speech information bit stream a2 consisting of 54 bits for each frame and separates it into respective parameters. When the current frame is an unvoiced frame or is in a voicing transition, the bit separating/error correcting decoder 21 applies the error correction decoding processing to the corresponding bits. As a result of the above processing, the bit separating/error correcting decoder 21 outputs a pitch/voicing information bit b2, a 10th order reflection coefficient information bit e2 and an RMS information bit g2.
A pitch/voicing information decoder 22 decodes the pitch/voicing information bit b2, and outputs a pitch period c2 and voicing information d2. A reflection coefficient decoder 23 decodes the 10th order reflection coefficient information bit e2, and outputs a 10th order reflection coefficient f2. An RMS decoder 24 decodes the RMS information bit g2 and outputs RMS information h2.
A parameter interpolator 25 interpolates the parameters c2, d2, f2 and h2 to improve the reproduced speech quality, and outputs the interpolated results (i.e., interpolated pitch period i2, interpolated voicing information j2, interpolated 10th order reflection coefficient o2, and interpolated RMS information r2, respectively).
Next, an excitation signal m2 is produced in the following manner
