Phonetic unit duration adjustment for text-to-speech system


Reexamination Certificate


: 1997-12-11
: 2001-12-11
: Smits, Talivaldis Ivars (Department: 2641)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Synthesis

: C704S267000
: Reexamination Certificate
: active
: 06330538
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is concerned with speech synthesis, and particularly, though not exclusively, with text-to-speech synthesisers which operate by concatenating segments of stored speech waveforms.
2. Related Art
Various prior art systems have been devised for converting text to synthesized speech. While these systems provide associated techniques to determine the timing and duration of synthesized phonetic units, it is believed that there is room for improvement in this regard.
BRIEF SUMMARY OF THE INVENTION
According to the present invention there is provided a speech synthesiser comprising:
means for supplying a sequence of representations of phonetic units;
means for retrieving stored portions of data to generate waveforms corresponding to the phonetic units;
means for determining durations for the phonetic units; and
means for processing the portions of data to adjust the time durations of the waveforms according to the determined durations;
wherein the determining means is operable to define a constant duration corresponding to a regular beat period and to adjust that duration in dependence on the nature of the phonetic unit and/or its context within the sequence.
Preferably the stored data are themselves digitised speech waveforms (though this is not essential and the invention may also be applied to other types of synthesiser such as formant synthesisers). Thus in a preferred arrangement the synthesiser includes a store containing items of data representing waveforms corresponding to phonetic sub-units, the retrieving means being operable to retrieve, for each phonetic unit, one or more portions of data each corresponding to a sub-unit thereof, and a further store containing for each sub-unit statistical duration data including a maximum value and a minimum value, wherein the determining means is operable to compute for each phonetic unit the sum of the minimum duration values and the sum of the maximum duration values for the constituent sub-unit(s) thereof and to adjust the said constant duration such that it neither falls below the sum of the minimum values nor exceeds the sum of the maximum values.
In the preferred embodiment the phonetic units are syllables and the sub-units are phonemes.
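
The duration limiting described above can be illustrated with a minimal sketch. It assumes syllables as the phonetic units and phonemes as the sub-units, as in the preferred embodiment. The names `DurationStats`, `syllable_duration` and `STATS`, the phoneme labels, and all millisecond values are hypothetical and chosen only for illustration; they are not taken from the patent. The sketch covers only the limiting of the beat period between the summed minimum and summed maximum values, and does not model any further adjustment based on the nature or context of the unit.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class DurationStats:
    """Hypothetical stored duration limits for one phoneme, in milliseconds."""
    min_ms: float
    max_ms: float


def syllable_duration(phonemes: Sequence[str],
                      stats: Dict[str, DurationStats],
                      beat_ms: float) -> float:
    """Start from the constant beat period and keep it within the limits.

    The lower limit is the sum of the minimum durations of the syllable's
    phonemes; the upper limit is the sum of their maximum durations.
    """
    lower = sum(stats[p].min_ms for p in phonemes)
    upper = sum(stats[p].max_ms for p in phonemes)
    return min(max(beat_ms, lower), upper)


# Illustrative values only (assumed, not measured data).
STATS = {
    "k": DurationStats(40.0, 90.0),
    "ae": DurationStats(60.0, 180.0),
    "t": DurationStats(35.0, 85.0),
}

if __name__ == "__main__":
    # Beat period fits inside the limits [135, 355], so it is kept.
    print(syllable_duration(["k", "ae", "t"], STATS, 200.0))
    # Beat period exceeds the summed maximum (180), so it is reduced.
    print(syllable_duration(["ae"], STATS, 200.0))
    # Beat period falls below the summed minimum (135), so it is raised.
    print(syllable_duration(["k", "ae", "t"], STATS, 100.0))
```

How the resulting syllable duration would then be shared among the individual phonemes is not stated in this summary, so the sketch stops at the syllable level.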


Affiliated with

Breen Andrew P

Inventor


Also associated with

British Telecommunications public limited company

Corporate Assignee


Nixon & Vanderhye P.C.

Law Firm


Smits Talivaldis Ivars

Examiner

