Type: Reexamination Certificate
Date filed: 1999-04-29
Date of patent: 2001-01-23
Examiner: Hudspeth, David R. (Department: 2741)
Classification: Data processing: speech signal processing, linguistics, language; Speech signal processing; Synthesis
US classification: C704S258000, C704S260000
Status: active
Patent number: 06178402
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to text-to-speech synthesis, and more particularly, to acoustic parameter generation in neural network based text-to-speech synthesis.
BACKGROUND
During a text-to-speech conversion process, a linguistic representation of text is typically converted into a series of acoustic parameter vectors. These parameters are then converted into parameters used by a vocoder to generate the final speech signal.
Neural networks have been used to compute each vector of acoustic parameters individually, which requires many network evaluations for each second of speech. This computation can account for a significant portion of the time required for neural network based text-to-speech conversion.
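For illustration only (this sketch is not part of the patent disclosure), the following Python fragment shows the kind of frame-by-frame computation described above; the frame rate, layer sizes, and weights are hypothetical, chosen simply to make the per-second cost visible.

    import numpy as np

    # Hypothetical sizes, chosen only to illustrate the per-frame cost.
    FRAME_RATE_HZ = 200                    # one acoustic parameter vector every 5 ms (assumed)
    INPUT_DIM, HIDDEN_DIM, OUTPUT_DIM = 50, 128, 20

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((HIDDEN_DIM, INPUT_DIM)) * 0.1   # toy weights
    W2 = rng.standard_normal((OUTPUT_DIM, HIDDEN_DIM)) * 0.1

    def frame_network(linguistic_features):
        """One forward pass of a toy feed-forward network for a single frame."""
        hidden = np.tanh(W1 @ linguistic_features)
        return W2 @ hidden                 # one acoustic parameter vector

    # One second of speech requires FRAME_RATE_HZ separate network evaluations.
    features = rng.standard_normal(INPUT_DIM)
    acoustic_frames = [frame_network(features) for _ in range(FRAME_RATE_HZ)]
    print(len(acoustic_frames), "network evaluations for one second of speech")

Each acoustic parameter vector costs one full forward pass, so the number of network evaluations grows linearly with the number of frames in the synthesized speech.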
Accordingly, there is a need for a neural network system that reduces the computational requirements for converting a linguistic representation into an acoustic representation.
SUMMARY OF THE INVENTION
A method in accordance with the present invention generates a series of acoustic descriptions in a text-to-speech system based upon a linguistic description of text. The method includes the steps of generating an information vector for each segment description in the linguistic description, wherein the information vector includes a description of a sequence of segments surrounding a described segment, and using a neural network to generate a representation of a trajectory of acoustic parameters, the trajectory being associated with the described segment. The method also includes the step of generating the series of descriptions by computing points on the trajectory at identified instants.
An apparatus in accordance with the present invention generates a series of acoustic descriptions in a text-to-speech system based upon a linguistic description of text. The apparatus includes a linguistic information preprocessor to receive the linguistic description and to generate an information vector for each segment description in the linguistic description, wherein the information vector includes a description of a sequence of segments surrounding a described segment. The apparatus also includes a neural network, operably coupled to the linguistic information preprocessor, to generate a representation of a trajectory of acoustic parameters, with the trajectory being associated with the described segment. The apparatus further includes a trajectory computation unit, operably coupled to the neural network, to generate the series of descriptions by computing points on the trajectory at identified instants.
A text-to-speech synthesizer in accordance with the present invention generates a series of acoustic descriptions based upon a linguistic description of text. The synthesizer includes a linguistic information preprocessor to receive the linguistic description and to generate an information vector for each segment description in the linguistic description, wherein the information vector includes a description of a sequence of segments surrounding a described segment. The synthesizer also includes a neural network, operably coupled to the linguistic information preprocessor, to generate a representation of a trajectory in a space of acoustic parameters, with the trajectory being associated with the described segment. The synthesizer further includes a trajectory computation unit, operably coupled to the neural network, to generate the series of descriptions by computing points on the trajectory at identified instants.
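As a non-authoritative sketch of the trajectory-based generation summarized above, the Python fragment below assumes, purely for illustration, that the network's trajectory representation is a set of polynomial coefficients per acoustic parameter for each segment; the patent does not commit to this particular representation, and all dimensions, weights, and names here are hypothetical. One network evaluation per segment yields a trajectory that can then be sampled at whatever instants are identified within the segment.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical dimensions: per-segment information vector (the described
    # segment plus its surrounding segment context) and the trajectory output.
    INFO_DIM = 60          # assumed size of the information vector
    NUM_PARAMS = 20        # acoustic parameters per frame
    POLY_ORDER = 3         # assumed: each parameter's trajectory is a cubic in time

    W = rng.standard_normal((NUM_PARAMS * (POLY_ORDER + 1), INFO_DIM)) * 0.1

    def trajectory_network(info_vector):
        """Toy stand-in for the neural network: one evaluation per segment,
        returning polynomial coefficients describing each parameter's trajectory."""
        coeffs = W @ info_vector
        return coeffs.reshape(NUM_PARAMS, POLY_ORDER + 1)

    def sample_trajectory(coeffs, instants):
        """Compute points on the trajectory at the identified instants
        (normalized to 0..1 within the segment), one acoustic parameter
        vector per instant."""
        powers = np.vander(instants, POLY_ORDER + 1, increasing=True)  # (T, 4)
        return powers @ coeffs.T                                       # (T, NUM_PARAMS)

    info_vector = rng.standard_normal(INFO_DIM)   # one segment's information vector
    coeffs = trajectory_network(info_vector)      # single network evaluation
    instants = np.linspace(0.0, 1.0, 10)          # e.g. ten frames fall within this segment
    frames = sample_trajectory(coeffs, instants)
    print(frames.shape)                           # (10, 20): ten vectors from one evaluation

Because the trajectory, rather than the network, is evaluated at each identified instant, the number of network evaluations scales with the number of segments instead of the number of frames, which is the computational saving motivated in the background above.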
The foregoing and other features and advantages of the invention will become further apparent from the following detailed description of the presently preferred embodiments, read in conjunction with the accompanying drawings. The detailed description and drawings are merely illustrative of the invention rather than limiting, the scope of the invention being defined by the appended claims and equivalents thereof.
REFERENCES:
patent: 3632887 (1972-01-01), Leipp
patent: 3704345 (1972-11-01), Coker et al.
patent: 5041983 (1991-08-01), Nakahara et al.
patent: 5163111 (1992-11-01), Baji et al.
patent: 5230037 (1993-07-01), Giustiniani
patent: 5327498 (1994-07-01), Hamon
patent: 5463713 (1995-10-01), Hasegawa
patent: 5472796 (1995-12-01), Iwata
patent: 5610812 (1997-03-01), Schabes et al.
patent: 5627942 (1997-05-01), Nightingale et al.
patent: 5642466 (1997-06-01), Narayan
patent: 5652828 (1997-07-01), Silverman
patent: 5668926 (1997-09-01), Karaali et al.
patent: 5751907 (1998-05-01), Moebius et al.
patent: 5884267 (1999-03-01), Goldenthal et al.
patent: 5890117 (1999-03-01), Silverman
patent: 5913194 (1999-06-01), Karaali et al.
patent: 5950162 (1999-09-01), Corrigan et al.
patent: 6052481 (2000-04-01), Grajski et al.
patent: 6052662 (2000-04-01), Hogden
patent: WO 89/02134 (1989-03-01), None
Scordilis et al., “Text Processing for Speech Synthesis Using Parallel Distributed Models”, 1989 IEEE Proceedings, Apr. 9-12, 1989, pp. 765-769, vol. 2.
Tuerk et al., “The Development of a Connectionist Multiple-Voice Text-to-Speech System”, Int'l Conf. on Acoustics, Speech & Signal Processing, May 14-17, 1991, pp. 749-752, vol. 2.
Weijters et al., “Speech Synthesis with Artificial Neural Networks”, 1993 IEEE Int'l Conference on Neural Networks, San Francisco, CA, Mar. 28-Apr. 1, vol. 3, pp. 1264-1269.
Azad Abul K.
Gauger James E.
Hudspeth David R.
Motorola Inc.