Reexamination Certificate
2000-08-17
2004-09-21
Dorvil, Richemond (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
C704S248000, C704S269000
active
06795807
ABSTRACT:
REFERENCE TO COMPUTER PROGRAM LISTING ON COMPACT DISC
Included with this application is a compact disc named 09641157, which contains five separate files that together comprise Table 1 referenced in this specification. The file names, dates of creation on the compact disc, and file sizes are as follows: Main program file appl 09641,157 Baraff.txt, created Nov. 15, 2002 of size 29.8 KB; Pitch program file appl 09641,157 Baraff.txt, created Nov. 15, 2002 of size 4.11 KB; Synth program file appl 09641,157 Baraff.txt, created Nov. 15, 2002 of size 5.47 KB; LPC program file appl 09641,157 Baraff.txt, created Nov. 15, 2002 of size 1.87 KB; and Vowel program file appl 09641,157 Baraff.txt, created Nov. 15, 2002 of size 1.48 KB.
AUTHORIZATION UNDER 37 C.F.R. §1.71(d)
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyrights whatsoever.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to the field of artificial speech for laryngectomees (laryngeally impaired individuals). It also relates to the field of voice analysis and synthesis, such as has been used in the field of communications, and to the field of voice instruction and training. It further relates to the field of computer-controlled prosthetics, particularly the correction of speech from a voice-impaired individual so that the individual can produce natural-sounding speech by creating or reproducing prosody and other natural inflections of the human voice.
2. Description of Prior Art
There have been attempts in the past to create means to improve impaired speech, particularly from laryngeally impaired individuals. No speech device to date has been able to capture, in sufficient detail, information about the specific speaker to recreate his or her own voice. Artificial devices that create a simulated glottal pulse, with a manual ability to change frequency, have been known for many years. One of the more recent devices uses a small loudspeaker mounted in the mouth of the laryngectomee, typically on a denture. This was described in U.S. Pat. No. 5,326,349 by Baraff. Some devices which vibrate the neck have been fitted with a control that enables the user to change the pitch of the speech manually, as described in U.S. Pat. No. 5,812,681 by Griffin. All of these devices have the drawback of sounding very mechanical. Even when a user has manually changed the pitch, the sound has not been close to the natural sound of a human voice. In devices without myoelectric control it is still necessary for the user to time the onset and fall of the glottal pulse sound manually. This timing takes practice, and corrective feedback is useful in minimizing the training time.
There are a number of reasons that laryngectomees have not been able to use previous devices to their fullest potential. Firstly, even with devices which have built-in pitch control, it is extremely difficult to coordinate the fingers to imitate natural speech prosody. The speaker requires a “good ear” for speech sound coupled with a very strong desire to spend hours practicing to gain coordination. Many laryngectomees possess neither the desire nor the skill. Secondly, some of the subtleties of creating true prosody may occur on time scales faster than could be manually controlled.
A number of schemes have been developed to create speech from text. One such process is described in the patent by Sharman, U.S. Pat. No. 5,774,854. Conventional speech systems operate in a sequential manner; hence, they do not create prosody until an entire sentence is divided into elements of speech such as words and phonemes. Most of these schemes rely on pre-programmed templates to create prosody. Schemes using a programmed template would not be useful for real-time creation of speech for the laryngectomee because they require the word and its context to be understood before the template can be applied. Although Sharman refers to “real-time” operation, the text is already present in sentence form, so it is not “real-time” with regard to a speech input such as in the present invention. Real-time speech-to-speech operation requires that the analysis be completed within 50 milliseconds or less, that is, well before the entire word has even been spoken. Clearly, techniques which are based on understanding the word before applying prosody will not be useful to solve this problem.
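To make the 50 millisecond constraint concrete, the following is a minimal sketch of the arithmetic. The 8000 Hz sampling rate, the 160-sample frame, and the 80-sample hop are illustrative assumptions, not values taken from this specification, and the function names are hypothetical.

```python
def samples_in_budget(sample_rate_hz, budget_ms=50.0):
    """Number of audio samples that arrive during the latency budget."""
    return int(sample_rate_hz * budget_ms / 1000.0)


def frames_in_budget(sample_rate_hz, frame_len, hop_len, budget_ms=50.0):
    """Number of complete, overlapping analysis frames that fit in the budget."""
    n = samples_in_budget(sample_rate_hz, budget_ms)
    if n < frame_len:
        return 0
    return 1 + (n - frame_len) // hop_len


# Assumed values: 8000 Hz sampling, 20 ms frames (160 samples), 10 ms hop (80 samples).
print(samples_in_budget(8000))          # 400 samples
print(frames_in_budget(8000, 160, 80))  # 4 frames
```

Under these assumed values only a handful of short frames are available before an output must be produced, which is why a method that waits for a whole word or sentence cannot meet the budget.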
A further element of the disclosed invention, the ability to simulate emotions in speech, is perhaps suggested in U.S. Pat. No. 5,860,064, which creates emotion in speech output only in a text to speech system. This system again does not operate in real time with regard to a speech to speech function.
Another feature of the present invention is its use for speech training, insofar as it includes pattern recognition of real-time speech input. A system for recognizing and coding speech is described in U.S. Pat. No. 5,729,694 by Holzrichter et al. That speech system relies on pre-coding parts of speech, including the feature vectors as generated both by classical LPC coefficients and by a physical mapping of the vocal tract elements obtained using electromagnetic radiation. The presently disclosed system does not rely on electromagnetic radiation and includes the ability to pre-program specific lessons as generated by the laryngeally impaired individual in conjunction with his speech pathologist. Other devices found in the prior art have left the control of prosody to the laryngectomee and required a high level of manual dexterity to provide inflection and naturalness. In practice, very few laryngectomees use this capability because the timing and control are too difficult.
SUMMARY OF THE INVENTION
The disclosed invention provides natural prosody in real time to the speech of laryngeally impaired people (laryngectomees). The invention provides prosody by means of a software program running in real time on a digital signal processor, thereby providing more natural speech than is achievable through any manually controlled system.
In addition to providing prosody, the disclosed system has other capabilities that increase naturalness, including: noise cancellation of sound from a neck vibrator excitation source, feedback control to allow use of a microphone distant from the mouth, aspiration noise to mimic real speech, selective amplification of consonants over vowels to assist intelligibility, automatic gain control to allow for movement of the head with respect to the microphone, user selection of mood of speech, volume control, whisper speech, telephone mode, training aids, the ability to interface with myoelectric signals to provide automatic hands-free starting and stopping control as well as user-controlled intonation, and the extraction of voice parameters from a user before laryngeal impairment to recreate the voice.
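As an illustration of the automatic gain control item in the list above, the following is a minimal sketch of one common frame-based approach: measure each frame's RMS level and move a smoothed gain toward the value that would bring the frame to a target level. The specification does not describe its gain control algorithm at this point, so the function names, target level, smoothing constant, and gain limit here are all assumptions.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame)) if frame else 0.0


def agc(frames, target_rms=0.1, smoothing=0.9, max_gain=20.0, floor=1e-6):
    """Scale each frame toward target_rms using a smoothed gain.

    All parameter values are illustrative assumptions.
    Frames whose level is below `floor` leave the gain unchanged,
    so silence does not drive the gain upward.
    """
    gain = 1.0
    out = []
    for frame in frames:
        level = rms(frame)
        if level > floor:
            desired = min(target_rms / level, max_gain)
            gain = smoothing * gain + (1.0 - smoothing) * desired
        out.append([gain * x for x in frame])
    return out


# A quiet input (speaker far from the microphone) is gradually raised to the target.
quiet = [[0.01, -0.01] * 80 for _ in range(200)]
levelled = agc(quiet)
print(round(rms(levelled[-1]), 3))
```

The smoothing constant trades responsiveness against audible pumping: a value close to 1 changes the gain slowly, which suits gradual head movement relative to the microphone.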
An automatic gain control system has been provided to regulate the output. The unit provides “whisper” speech by using a white noise excitation instead of the glottal pulse excitation. The unit can be used to change the excitation frequency of the sound source in real time. This is useful over the telephone or in a stand-alone unit which may be used without the loudspeaker. Training aids using pattern recognition are programmed into the device to allow speech pathologists to provide lessons whereby the user gets feedback as to whether his articulation and timing are being done according to instruction. The unit is capable of being adapted to receive myoelectric signals for hands-free operation. In addition, in the case of laryngeally impaired individuals with the larynx nerve replaced to a neck muscle nerve, the myoelectric signal can automatically turn the unit on and off and include user directed intonat
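The substitution of white noise for the glottal pulse excitation in whisper mode can be sketched with a simple source-filter model. This is a hedged illustration only: the all-pole (LPC-style) filter, the impulse-train stand-in for a glottal pulse, the single coefficient, and the 80-sample pitch period are assumptions chosen for the example, and none of the function names come from the program listing referenced in this specification.

```python
import random


def glottal_pulse_train(n, period):
    """Voiced excitation stand-in: one unit impulse every `period` samples."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def white_noise(n, amplitude=0.1, seed=0):
    """Whisper excitation: uniform white noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [amplitude * rng.uniform(-1.0, 1.0) for _ in range(n)]


def all_pole_filter(excitation, coeffs):
    """Apply y[n] = x[n] - sum_k coeffs[k-1] * y[n-k] (assumed vocal tract model)."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                y -= a * out[n - k]
        out.append(y)
    return out


def synthesize(n, coeffs, whisper=False, period=80):
    """Same filter in both modes; only the excitation source changes."""
    if whisper:
        excitation = white_noise(n)
    else:
        excitation = glottal_pulse_train(n, period)
    return all_pole_filter(excitation, coeffs)


# Hypothetical single-pole filter; a practical model would use more coefficients.
voiced = synthesize(160, [-0.9])
whispered = synthesize(160, [-0.9], whisper=True)
```

Changing `period` in the voiced branch changes the excitation frequency, which corresponds to the real-time pitch change mentioned above; the whisper branch has no pitch period at all.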
Dorvil Richemond
Famiglio Robert B.
Famiglio & Associates
Nolan Daniel A