Data processing: speech signal processing – linguistics – language – Speech signal processing – Synthesis
Reexamination Certificate
1998-11-17
2001-02-13
Dorvil, Richemond (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Synthesis
C704S251000, C704S254000
active
06188984
ABSTRACT:
BACKGROUND
1. Field of the Invention
The present invention generally relates to syllable parsing, and more particularly, it relates to a method and system for converting text into phonetic syllables.
2. Related Art
Many devices currently use computer-generated speech for users' convenience. Speech-generating devices range from large computers to small electronic devices. For example, an automatic telephone answering system, such as voicemail, can interact with a caller through synthesized voice prompts. A computer banking system can report account information via speech. On a smaller scale, a talking clock can announce the time. The use of talking devices is expanding and will continue to expand as innovation and technology progress.
Often, for ease of use, synthesized speech is generated from text input to a speech-generating device. These devices receive text, translate it, and output sound in the form of speech through a speaker. However, when translating and reciting the text, these devices do not always speak as clearly and naturally as a human does; as a result, synthesized speech is recognizably artificial.
Making a computer or electronic device produce natural sounding speech requires a keen understanding of the nuances of the language and can be difficult for programmers. Computer-generated speech often seems unnatural for a variety of reasons. Some systems pre-record verbal responses in audio files, but when the words are played back in a different order than they were recorded, the response can sound extremely unnatural. One key aspect in the production of natural sounding, computer-generated speech is the ability to recognize boundaries between syllables. The recognition of syllable boundaries allows a speech-generating computer to speak in a more natural manner. The production of more natural sounding synthesized speech would further integrate computers into society and make them seem more user-friendly.
Automatic speech recognition (“ASR”) devices perform the reverse function of text-to-speech devices. Computers and other electronic devices are increasingly using ASR as a form of input from a user. ASR applications range from word processing to controlling basic functions of electronic devices, such as automatically dialing a telephone number associated with a spoken name. ASR functions are implemented using computationally intensive programs and algorithms. A thorough understanding of boundaries between syllables in a language also makes the precise recognition of speech easier. Greater understanding of the segmentation of a speech signal improves the recognition of the speech signal.
Accordingly, to improve computer speech production and recognition, it is desirable to provide a system that recognizes syllable boundaries.
SUMMARY
Systems and methods consistent with the present invention satisfy this and other desires by providing a method for parsing text into syllables. In accordance with the present invention, a method and system are provided that parse text into “phonemes,” basic units of pronounceable and audible speech, divided at syllable boundaries. The phonetic syllables can then be used by other computer speech applications, such as text-to-speech devices, to produce smooth, natural-sounding speech.
In accordance with methods consistent with the present invention, a method for parsing syllables is provided in a data processing system. This method receives a text string, converts the text string into a phoneme sequence, and generates a transformed phoneme sequence from the phoneme sequence according to transformation rules. The method further ranks the phonemes of the transformed phoneme sequence, generates a syllable rank meter for the transformed phoneme sequence, and transforms the transformed phoneme sequence into syllables using the syllable rank meter.
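The sequence of steps in this method can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy lexicon, the single transformation rule (collapsing adjacent duplicate phonemes), the sonority-style rank values, and the reading of the "syllable rank meter" as a per-phoneme list of ranks are all assumptions made for illustration, and every function name is hypothetical.

```python
# Illustrative sketch only. The lexicon, rule, rank values, and the
# boundary heuristic are assumptions, not details taken from the patent.

# Hypothetical pronunciation lexicon (ARPAbet-like symbols).
LEXICON = {
    "winter": ["W", "IH", "N", "T", "ER"],
    "banking": ["B", "AE", "NG", "K", "IH", "NG"],
}

VOWELS = {"AE", "IH", "ER"}
VOWEL_RANK = 3
# Assumed consonant ranks: glides/nasals higher, stops lowest.
CONSONANT_RANK = {"W": 2, "N": 2, "NG": 2, "B": 0, "T": 0, "K": 0}


def text_to_phonemes(text):
    """Convert a single word into a phoneme sequence via the toy lexicon."""
    return list(LEXICON[text.lower()])


def apply_transformation_rules(phonemes):
    """Apply one assumed rule: collapse adjacent duplicate phonemes."""
    out = []
    for p in phonemes:
        if not out or out[-1] != p:
            out.append(p)
    return out


def build_rank_meter(phonemes):
    """Rank each phoneme; the resulting list plays the role of the meter."""
    return [VOWEL_RANK if p in VOWELS else CONSONANT_RANK[p] for p in phonemes]


def split_syllables(phonemes, meter):
    """Cut between vowel peaks at the last lowest-ranked consonant."""
    peaks = [i for i, r in enumerate(meter) if r == VOWEL_RANK]
    cuts = []
    for left, right in zip(peaks, peaks[1:]):
        between = range(left + 1, right)
        if len(between) == 0:
            cuts.append(right)  # adjacent vowels: cut before the second
        else:
            low = min(meter[i] for i in between)
            cuts.append(max(i for i in between if meter[i] == low))
    syllables, start = [], 0
    for cut in cuts + [len(phonemes)]:
        syllables.append(phonemes[start:cut])
        start = cut
    return syllables


def parse_syllables(text):
    """Run the full pipeline: text -> phonemes -> rules -> ranks -> syllables."""
    phonemes = apply_transformation_rules(text_to_phonemes(text))
    return split_syllables(phonemes, build_rank_meter(phonemes))


print(parse_syllables("winter"))
print(parse_syllables("banking"))
```

In this sketch each vowel is treated as a syllable peak, and the boundary between two peaks falls just before the lowest-ranked consonant between them. A real system would need a full lexicon or letter-to-sound rules and a richer rule set.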
The advantages accruing to the present invention are numerous. It allows text to be automatically converted into phonetic syllables. These phonetic syllables can then be used by a text-to-speech computer application to produce natural sounding, computer-generated speech. Making automatically-generated speech sound more natural can increase a user's comprehension of the generating device and make the device more pleasing to the ear. Additionally, voice recognition systems can use the information of the syllable boundaries to improve speech recognition.
The above features, other features and advantages of the present invention will be readily appreciated by one of ordinary skill in the art from the following detailed description of the preferred implementations when taken in connection with the accompanying drawings.
REFERENCES:
patent: 4811400 (1989-03-01), Fisher
patent: 4831654 (1989-05-01), Dick
patent: 5528728 (1996-06-01), Matsuura et al.
patent: 5651095 (1997-07-01), Ogden
patent: 5732395 (1998-03-01), Silverman
patent: 5758023 (1998-05-01), Bordeaux
patent: 5852802 (1998-12-01), Breen et al.
Michel Divay and Anthony J. Vitale, “Algorithms for Grapheme-Phoneme Translation for English and French: Applications for Database Searches and Speech Synthesis,” Computational Linguistics, US, Cambridge, MA, vol. 23, No. 4, pp. 495-523, XP002110490, Dec. 1997 (1997-12).
M. Edgington et al., “Overview of Current Text-To-Speech Techniques: Part I - Text and Linguistic Analysis,” BT Technology Journal, vol. 14, No. 1, pp. 68-83, Jan. 1996.
IBM Technical Disclosure Bulletin, “Rule-Based Speech Synthesis Method Using Context-Dependent Syllabic Units,” vol. 38, No. 12, pp. 521-522, Dec. 1995.
Blackburn Starla
Felix Kara
Manwaring Michael E.
McDaniel Steven F.
Wallentine Melissa
Dorvil Richemond
Finnegan, Henderson, Farabow, Garrett & Dunner, L.L.P.
Fonix Corporation
Nolan Daniel A