Clustered patterns for text-to-speech synthesis

Filing date: 1998-09-08
Issue date: 2003-03-04
Examiner: Knepper, David D. (Department: 2648)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Synthesis

US classification: C704S260000, C704S245000
Type: Reexamination Certificate
Status: active
Patent number: 06529874
ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a speech information processing apparatus and a method to generate a natural pitch pattern used for text-to-speech synthesis.
BACKGROUND OF THE INVENTION
Text-to-speech synthesis is the artificial generation of a speech signal from an arbitrary sentence. An ordinary text-to-speech system consists of a language processing section, a control parameter generation section, and a speech signal generation section. The language processing section executes morpheme analysis and syntax analysis for an input text. The control parameter generation section processes accent and intonation, and outputs phoneme symbols, a pitch pattern, and the duration of each phoneme. The speech signal generation section synthesizes the speech signal.
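The three-section structure described above can be sketched as a simple pipeline. This is only an illustrative sketch: the function names, the `ControlParameters` type, the letter-per-phoneme shortcut, the falling 140 Hz to 100 Hz pitch line, and the sine-wave output are all assumptions made for the example and do not come from the patent.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ControlParameters:
    phonemes: List[str]         # phoneme symbols
    pitch_pattern: List[float]  # one pitch value in Hz per phoneme (toy resolution)
    durations: List[float]      # duration of each phoneme in seconds


def language_processing(text: str) -> List[str]:
    """Stand-in for morpheme and syntax analysis: lowercase and split on spaces."""
    return text.lower().split()


def generate_control_parameters(words: List[str]) -> ControlParameters:
    """Stand-in for accent and intonation processing."""
    # Assumption: treat each letter as one phoneme symbol.
    phonemes = [ch for word in words for ch in word if ch.isalpha()]
    n = len(phonemes)
    # Assumption: a "simple model" pitch pattern, a straight line from 140 Hz to 100 Hz.
    pitch = [140.0 - 40.0 * i / max(n - 1, 1) for i in range(n)]
    durations = [0.08] * n
    return ControlParameters(phonemes, pitch, durations)


def generate_speech_signal(params: ControlParameters, sample_rate: int = 8000) -> List[float]:
    """Stand-in for synthesis: one sine segment per phoneme at its pitch value."""
    samples: List[float] = []
    for f0, duration in zip(params.pitch_pattern, params.durations):
        count = int(round(duration * sample_rate))
        samples.extend(math.sin(2.0 * math.pi * f0 * k / sample_rate) for k in range(count))
    return samples


# Usage: run the three sections in order.
params = generate_control_parameters(language_processing("Hello world"))
signal = generate_speech_signal(params)
```

The straight-line pitch pattern stands in for the kind of simple model whose output sounds mechanical, which is the problem the following paragraphs describe.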
In a text-to-speech system, the naturalness of the synthesized speech depends heavily on the prosody processing performed by the control parameter generation section. In particular, the pitch pattern influences the naturalness of synthesized speech. In known text-to-speech systems, the pitch pattern is generated by a simple model. Accordingly, the synthesized speech sounds mechanical and its intonation is unnatural.
Recently, a method of generating the pitch pattern from pitch patterns extracted from natural speech has been considered. For example, in Japanese Patent Disclosure (Kokai) “PH6-236197”, unit patterns extracted from the pitch pattern of natural speech, or vector-quantized unit patterns, are stored in advance. A unit pattern is retrieved from memory according to an input attribute or input language information. The pitch pattern is then generated by locating and transforming the retrieved unit pattern on a time axis.
In the above-mentioned text-to-speech synthesis, it is impossible to store unit patterns suitable for all input attributes or all input language information. Therefore, transformation of the unit pattern is necessary. For example, the unit pattern must be stretched or compressed in proportion to the duration. However, even if the unit pattern is extracted from the pitch pattern of natural speech, the naturalness of the synthesized speech falls because of this transformation processing.
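As a rough illustration of the retrieve-and-transform scheme just described, the sketch below looks up a stored unit pattern by attribute and stretches it to a target length by linear interpolation. The attribute keys, the pitch values, and the names `UNIT_PATTERNS`, `stretch_pattern`, and `retrieve_and_stretch` are hypothetical, and linear interpolation is only one assumed way to stretch a pattern in proportion to duration.

```python
from typing import Dict, List

# Hypothetical unit-pattern memory: attribute -> pitch values in Hz.
UNIT_PATTERNS: Dict[str, List[float]] = {
    "accent_type_1": [100.0, 130.0, 120.0, 95.0],
    "accent_type_2": [110.0, 115.0, 140.0, 105.0],
}


def stretch_pattern(pattern: List[float], target_len: int) -> List[float]:
    """Linearly resample `pattern` to `target_len` points, keeping both endpoints."""
    if not pattern or target_len < 1:
        raise ValueError("pattern must be non-empty and target_len must be >= 1")
    if len(pattern) == 1 or target_len == 1:
        return [pattern[0]] * target_len
    scale = (len(pattern) - 1) / (target_len - 1)
    stretched = []
    for i in range(target_len):
        pos = i * scale
        lo = min(int(pos), len(pattern) - 2)  # index of the left neighbour
        frac = pos - lo                        # distance from the left neighbour
        stretched.append(pattern[lo] * (1.0 - frac) + pattern[lo + 1] * frac)
    return stretched


def retrieve_and_stretch(attribute: str, n_frames: int) -> List[float]:
    """Retrieve the unit pattern for `attribute` and fit it to `n_frames` frames."""
    return stretch_pattern(UNIT_PATTERNS[attribute], n_frames)


# Usage: fit a 4-point unit pattern onto a 7-frame accent phrase.
pitch = retrieve_and_stretch("accent_type_1", 7)
```

The more the target length differs from the stored length, the more the stretched contour departs from the shape observed in natural speech, which is one way to read the loss of naturalness noted above.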
SUMMARY OF THE INVENTION
It is one object of the present invention to provide a speech information processing apparatus and a method to improve the naturalness of synthesized speech in text-to-speech synthesis.
The above and other objects are achieved according to the present invention by providing a novel apparatus, method, and computer program product for generating clustered patterns for text-to-speech synthesis. In the apparatus, a representative pattern memory stores a plurality of initial representative patterns as noise patterns. A different attribute is affixed in advance to each initial representative pattern. A pitch pattern memory stores a large number of natural pitch patterns in units of accent phrases. A clustering unit assigns each natural pitch pattern to an initial representative pattern based on the attribute of the accent phrase. A transformation parameter generation unit evaluates an error between a transformed representative pattern and each natural pitch pattern assigned to the initial representative pattern, and generates a transformation parameter for each natural pitch pattern based on the evaluation result. A representative pattern generation unit calculates an evaluation function of the sum of the errors between the transformed representative pattern and each natural pitch pattern assigned to the initial representative pattern, and updates each initial representative pattern based on a result of the evaluation function. The representative pattern memory stores each updated representative pattern as a clustered pattern of the attribute affixed to the corresponding initial representative pattern.
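The training procedure summarized above can be sketched as an alternating minimization. The sketch makes several simplifying assumptions that are not stated in the text: all patterns in a cluster have the same length, the only transformation is a single additive offset per natural pattern, and the error is a sum of squared differences. The attribute keys and the names `offset_for`, `squared_error`, and `train_clustered_patterns` are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def offset_for(rep: List[float], natural: List[float]) -> float:
    """Transformation parameter: the offset b minimizing sum((rep[k] + b - natural[k])**2)."""
    return sum(n - r for r, n in zip(rep, natural)) / len(rep)


def squared_error(rep: List[float], natural: List[float], offset: float) -> float:
    """Error between the transformed representative pattern and one natural pattern."""
    return sum((r + offset - n) ** 2 for r, n in zip(rep, natural))


def train_clustered_patterns(
    patterns_by_attr: Dict[str, List[List[float]]],
    length: int,
    iterations: int = 5,
    seed: int = 0,
) -> Tuple[Dict[str, List[float]], List[float]]:
    rng = random.Random(seed)
    # Initial representative patterns are noise, one per attribute.
    reps = {a: [rng.gauss(0.0, 1.0) for _ in range(length)] for a in patterns_by_attr}
    history = []  # total error measured at the start of each iteration
    for _ in range(iterations):
        total = 0.0
        for attr, naturals in patterns_by_attr.items():
            rep = reps[attr]
            # Step 1: fit one transformation parameter per natural pattern.
            offsets = [offset_for(rep, nat) for nat in naturals]
            total += sum(squared_error(rep, nat, b) for nat, b in zip(naturals, offsets))
            # Step 2: with offsets fixed, the summed squared error is minimized
            # by averaging the offset-removed natural patterns point by point.
            reps[attr] = [
                sum(nat[k] - b for nat, b in zip(naturals, offsets)) / len(naturals)
                for k in range(length)
            ]
        history.append(total)
    return reps, history


# Usage: clustering here is just grouping by a pre-assigned attribute label.
data = {
    "type_a": [[5.0, 6.0, 7.0, 6.0], [3.0, 4.0, 5.0, 4.0]],
    "type_b": [[9.0, 8.0, 7.0, 6.0]],
}
clustered, errors = train_clustered_patterns(data, length=4)
```

Because each step minimizes the same summed error over one set of variables while the other is held fixed, the recorded error cannot increase from one iteration to the next. A fuller version would presumably also fit time-axis transformations rather than assuming equal lengths.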

REFERENCES:
patent: 4696042 (1987-09-01), Goudie
patent: 5384893 (1995-01-01), Hutchins
patent: 5682501 (1997-10-01), Sharman
patent: 5740320 (1998-04-01), Itoh
patent: 5832434 (1998-11-01), Meredith
patent: 5913193 (1999-06-01), Huang et al.
patent: 5913194 (1999-06-01), Karaali et al.
patent: 5949961 (1999-09-01), Sharman
patent: 5970453 (1999-10-01), Sharman
patent: 6138089 (2000-10-01), Guberman
patent: 6240384 (2001-05-01), Kagoshima et al.
X. Huang, et al., “Recent Improvements on Microsoft's Trainable Text-to-Speech System—Whistler”, Proc. of ICASSP97, Apr. 1997, pp. 959-962.

Affiliated with

Akamine Masami

Inventor

Kagoshima Takehiko

Inventor

Morita Masahiro

Inventor

Nii Takaaki

Inventor

Seto Shigenobu

Inventor

Also associated with

Knepper David D.

Examiner
