Phonemic unit dictionary based on shifted portions of source...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Synthesis

Reexamination Certificate

Details

C704S262000, C704S264000, C704S267000

active

06202048

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a speech synthesis apparatus and method for generating a synthesized speech signal by filtering a speech source signal through a synthesis filter in a text-to-speech system.
BACKGROUND OF THE INVENTION
A speech synthesis method is a technique for automatically generating a synthesized speech signal from input prosodic information. According to prosodic information such as phonemic symbols, phonemic time length, pitch pattern and power, the characteristic parameters of a small unit (synthesis unit) such as a syllable, a phoneme or one pitch interval are selected from a unit dictionary memory. After the pitch and the continuous time length are controlled, the characteristic parameters are connected to generate a synthesized speech signal. This synthesis-by-rule technique is used in text-to-speech systems to artificially generate a speech signal from arbitrary text.
In this speech synthesis technique, to improve the quality of the synthesized speech signal, the characteristic parameter of a synthesis unit is either a waveform extracted from speech data, or a pair consisting of a speech source signal obtained by analyzing the speech data and coefficients representing the characteristic of the synthesis filter.
In the latter case, to further improve the quality of the synthesized speech, a large number of synthesis units, each consisting of a speech source signal and coefficients, are stored in the unit dictionary. Suitable synthesis units are selected from the unit dictionary and connected to generate the synthesized speech. In this method, to avoid an increase in the memory capacity of the unit dictionary, the unit dictionary is coded in advance. When the speech signal is synthesized, the coded unit dictionary is decoded by referring to the codebook.
FIG. 1 is a block diagram of the speech synthesis apparatus using the coded unit dictionary information according to the prior art. First, according to the phonemic symbols 100, the phonemic time length 101, the pitch pattern 102 and the power 103, a unit selection section 10 selects a coded representative synthesis unit from the unit dictionary memory 11.
FIG. 2 is a schematic diagram of the coded synthesis unit in the unit dictionary memory 11. As shown in FIG. 2, a linear predictive coefficient used as a filter coefficient in the synthesis filter is stored as a code index 113 in a linear predictive coefficient codebook 22 (hereafter called the linear predictive coefficient index 113). The speech source signal is stored as a code index 111 in a speech source signal codebook 21 (hereafter called the speech source signal index 111). A gain is stored as a code index 110 in a gain codebook 20 (hereafter called the gain index 110).
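The coded synthesis unit of FIG. 2 can be pictured as a small record of three codebook indices. The following is a minimal sketch only: the class name, field names, dictionary contents and the lookup by phonemic symbol alone are hypothetical, and the unit selection described in the text also takes time length, pitch pattern and power into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedSynthesisUnit:
    """One entry of the unit dictionary memory 11 (field names are hypothetical)."""
    gain_index: int    # gain index 110, points into the gain codebook 20
    source_index: int  # speech source signal index 111, points into codebook 21
    lpc_index: int     # linear predictive coefficient index 113, points into codebook 22


# Hypothetical dictionary keyed by phonemic symbol; the values are made up.
unit_dictionary = {
    "a": CodedSynthesisUnit(gain_index=2, source_index=5, lpc_index=1),
    "k": CodedSynthesisUnit(gain_index=0, source_index=3, lpc_index=4),
}


def select_unit(symbol: str) -> CodedSynthesisUnit:
    """Simplified stand-in for the unit selection section 10."""
    return unit_dictionary[symbol]
```

Because each unit holds only indices, the dictionary itself stays small, and the actual waveforms and coefficients live in the shared codebooks.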
The coded synthesis unit selected by the unit selection section 10 is input to a synthesis unit decoder 12. In the synthesis unit decoder 12, a linear predictive coefficient requantizer 25 selects the code vector corresponding to the linear predictive coefficient index 113 from the linear predictive coefficient codebook 22 and outputs a requantized (decoded) linear predictive coefficient 122. A speech source signal requantizer 24 selects the code vector corresponding to the speech source signal index 111 from the speech source signal codebook 21 and outputs a requantized (decoded) speech source signal. A gain requantizer 23 selects the code vector corresponding to the gain index 110 from the gain codebook 20 and outputs a requantized (decoded) gain 120. A gain multiplier 27 multiplies the speech source signal decoded by the speech source signal requantizer 24 by the gain 120. The linear predictive coefficient 122 decoded by the linear predictive coefficient requantizer 25 is supplied to the synthesis filter 13 as filter coefficient information. The synthesis filter 13 filters the speech source signal 121 multiplied by the gain 120 and generates a speech signal 123. A pitch/time length controller 14 controls the pitch and the time length of the speech signal 123. A unit connection section 15 connects a plurality of the speech signals whose pitch and time length have been controlled. In this way, a synthesized speech signal 104 is output.
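The prior-art decoding path (codebook lookup, gain multiplication, synthesis filtering) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function names and codebook contents are hypothetical, and the synthesis filter is assumed to be an all-pole filter with the sign convention y[n] = x[n] - sum_k a[k] * y[n-k]. Pitch/time length control and unit connection are omitted.

```python
def decode_unit(gain_index, source_index, lpc_index, gain_cb, source_cb, lpc_cb):
    """Look up the three code vectors and apply the gain (requantizers 23-25, multiplier 27)."""
    gain = gain_cb[gain_index]
    scaled_source = [gain * s for s in source_cb[source_index]]
    coeffs = lpc_cb[lpc_index]
    return scaled_source, coeffs


def synthesis_filter(excitation, coeffs):
    """All-pole filter (assumed form): y[n] = x[n] - sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                y -= a * out[n - k]
        out.append(y)
    return out


# Hypothetical toy codebooks.
gain_cb = [0.5, 2.0]
source_cb = [[1.0, 0.0, 0.0, 0.0]]
lpc_cb = [[-0.5]]

source, coeffs = decode_unit(1, 0, 0, gain_cb, source_cb, lpc_cb)
speech = synthesis_filter(source, coeffs)
print(speech)  # [2.0, 1.0, 0.5, 0.25]
```

With a single coefficient of -0.5, the scaled impulse decays by half at each sample, which makes the filter's recursive behavior easy to check by hand.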
In this synthesis system by rule, the coded synthesis unit in the unit dictionary memory largely affects the quality of synthesized speech.
To raise the quality of the speech, in other words to suppress the degradation of synthesized speech caused by coding, the number of bits used to code each synthesis unit must be increased. However, if the number of coding bits increases, the memory capacity required by the gain codebook 20, the speech source signal codebook 21 and the linear predictive coefficient codebook 22 increases greatly. In particular, when vector quantization is applied to the coding, the memory capacity requirement increases exponentially with the number of bits used to code the representative synthesis unit. Conversely, if the number of coding bits is reduced to lower the memory capacity requirement, the quality of the synthesized speech falls.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a speech synthesis apparatus and a method for generating high-quality synthetic speech without increasing the capacity requirement of the speech source signal codebook.
According to the present invention, a speech synthesis apparatus for synthesizing a speech signal by filtering a speech source signal through a synthesis filter, comprises: speech source signal codebook means for storing a plurality of speech source signals as a code vector; unit dictionary memory means for storing a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector in said speech source signal codebook means and a shift number for the code vector to decode the speech source signal; unit selection means for selecting a synthesis unit corresponding to phonemic symbols to be synthesized from said unit dictionary memory means; and synthesis unit decode means for selecting the code vector corresponding to the index in the synthesis unit from said speech source signal codebook means, and for shifting the code vector according to the shift number in the synthesis unit.
Further in accordance with the present invention, there is also provided a speech synthesis method for synthesizing a speech signal by filtering a speech source signal through a synthesis filter, comprising the steps of: storing a plurality of speech source signals as a code vector in a speech source signal codebook; storing a plurality of synthesis units corresponding to phonemic symbols, each synthesis unit comprising an index of the code vector and a shift number for the code vector to decode the speech source signal in a unit dictionary memory; selecting a synthesis unit corresponding to phonemic symbols to be synthesized from said unit dictionary memory; selecting the code vector corresponding to the index in the synthesis unit from said speech source signal codebook; and shifting the code vector according to the shift number in the synthesis unit.
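The last two steps of the method (select the code vector by index, then shift it by the shift number) can be sketched as below. This is a hedged illustration only: the text above does not say what kind of shift is used, so a circular shift toward later samples is assumed here, and the function name and codebook contents are hypothetical.

```python
def decode_source(index, shift, source_codebook):
    """Select a code vector by index and shift it (circular shift is an assumption)."""
    vector = list(source_codebook[index])
    k = shift % len(vector)
    if k == 0:
        return vector
    # Move the last k samples to the front, i.e. rotate toward later samples.
    return vector[-k:] + vector[:-k]


# Hypothetical codebook with a single stored code vector.
source_codebook = [[1, 2, 3, 4]]

# Two synthesis units share index 0 but carry different shift numbers.
print(decode_source(0, 0, source_codebook))  # [1, 2, 3, 4]
print(decode_source(0, 1, source_codebook))  # [4, 1, 2, 3]
```

Under this reading, several distinct speech source signals are obtained from one stored code vector, so the number of usable source signals grows without enlarging the speech source signal codebook.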
Further in accordance with the present invention, there is also provided a computer readable memory containing computer-readable instructions to synthesize a speech signal by filtering a speech source signal through a synthesis filter, comprising the steps of: instruction means for causing a computer to store a plurality of speech source signals as a code vector in a speech source signal codebook; instruction means for causing a computer to store a plurality of synthesis units corresponding to each phonemic symbols, each synthesis unit comprising an index of the code vector and a shift number for the code vector to decode the speech source signal in a unit dictionary memory; instruction means for caus
