Electrical audio signal processing systems and devices – Artificial larynx – electrical
Reexamination Certificate
2000-08-29
2002-03-19
Harvey, Minsun Oh (Department: 2644)
Electrical audio signal processing systems and devices
Artificial larynx, electrical
C623S009000
active
06359988
ABSTRACT:
BACKGROUND OF THE INVENTION
A Transcutaneous Artificial Larynx (TAL) such as the Servox Inton provides a means of verbal communication for people who have either undergone a laryngectomy or are otherwise unable to use their larynx (for example, after a tracheotomy). These devices are vibrating impulse sources held against the neck. Although some of these devices give users a choice of two frequencies at which they can vibrate, most users find it cumbersome to switch between frequencies, even when a dial can be used for continuous pitch variation (as in some of the Cooper Rand devices). Thus, the frequency of the excitation signal provided to the vocal tract by TAL devices is usually constant.
In contrast, natural speech has many pitch variations. The fundamental frequency (F0) may change several times in a single phone and may signal stress and syntactic information [1]. While most phones will have a simple rising or falling F0 pattern, some phones may contain a ‘rise+fall+rise’ contour. Thus, the inability to vary the pitch during TAL speech is a real shortcoming that contributes to the monotonous and unnatural quality of TAL speech.
Another source of degradation in TAL speech is the presence of a steady background signal (“noise”) due to the leakage of acoustic energy from the TAL, its interface with the neck, and the surrounding neck tissue. In [2], an adaptive filtering technique was developed to remove this background noise. Perceptual experiments showed a substantial improvement in the speech quality.
A cepstral processing method was used to overcome the problems in TAL speech described above.
SUMMARY OF THE INVENTION
The present invention improves the quality of speech produced by users of Transcutaneous Artificial Larynges (TAL). Two major reasons why TAL speech sounds unnatural are (1) the monotone quality due to the constant rate at which the TAL device vibrates and (2) the signal that radiates from the device and the neck tissue surrounding the placement of the device on the neck. We refer to the radiated signal as “noise.”
Technical Description of Invention
The technology developed processes the TAL speech in real time in several stages. First, landmarks associated with the manner in which speech is produced are detected. These landmarks divide TAL speech into the following regions: Sonorant (vowels, nasals and semivowels), Stops, Fricatives, True Silence and Silence (in this last category no speech signal is present, but the TAL device is turned on, so there is still radiated noise).
Second, the sonorant regions are processed so that the constant source is replaced by a more natural source. To do this, cepstral analysis is used to deconvolve the TAL speech into (a) vocal tract information and (b) excitation information. Cepstral processing is also performed on natural speech. Then the excitation signal from the natural speech is convolved with the vocal tract information from the TAL speech to produce the new TAL signal with varying pitch. The portion of the natural speech signal used depends on the type of pitch contour desired. If the portion of the TAL speech is at the beginning of an utterance, a rising pitch contour is wanted. Subsequent portions of the TAL speech signal are processed with a rising pitch contour until a stressed syllable is reached (as determined by the duration of the sonorant region). Once a stressed syllable is reached, the TAL speech is processed using natural speech that has a falling pitch contour. However, if the sonorant region is very long, a pitch contour that has a rise-fall-rise pattern is used.
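As a rough illustration of the deconvolution step described above, the following Python sketch swaps the excitation of one TAL frame for that of a natural-speech frame using the real cepstrum. It is only a sketch under stated assumptions, not the inventors' implementation: the function names, the lifter length `cutoff`, the reuse of the natural frame's phase, and the requirement that both frames have the same length are all assumptions, since the text gives none of these details. A naive DFT is used only to keep the example dependency-free; in practice an FFT routine would be used.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, used only to keep this sketch dependency-free."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [
        scale * sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, eps=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]
    return [c.real for c in dft(log_mag, inverse=True)]


def split_cepstrum(cep, cutoff):
    """Split into low-quefrency (vocal tract) and high-quefrency (excitation) parts.

    `cutoff` is a hypothetical lifter length in samples; the source gives no value.
    The lifter is symmetric so that both parts stay even sequences.
    """
    n = len(cep)
    low = [c if (i < cutoff or i > n - cutoff) else 0.0 for i, c in enumerate(cep)]
    high = [c - l for c, l in zip(cep, low)]
    return low, high


def swap_excitation(tal_frame, natural_frame, cutoff=8, eps=1e-12):
    """Combine the TAL vocal-tract envelope with the natural-speech excitation.

    Both frames are assumed to have the same length.
    """
    tract, _ = split_cepstrum(real_cepstrum(tal_frame, eps), cutoff)
    _, excitation = split_cepstrum(real_cepstrum(natural_frame, eps), cutoff)
    # Adding cepstra corresponds to convolving the two components in time.
    hybrid = [a + b for a, b in zip(tract, excitation)]
    log_mag = [c.real for c in dft(hybrid)]
    # The real cepstrum carries no phase; borrowing the natural frame's phase
    # is an assumption of this sketch.
    phases = [cmath.phase(c) for c in dft(natural_frame)]
    spectrum = [cmath.rect(math.exp(m), p) for m, p in zip(log_mag, phases)]
    return [c.real for c in dft(spectrum, inverse=True)]
```

A simple sanity check on this design is that swapping a frame's excitation with its own should return the original frame up to rounding error, because the low- and high-quefrency parts sum back to the full cepstrum.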
Third, fricative regions in the TAL speech are processed with an excitation signal extracted from a fricative region in natural speech. Similarly, stop regions in the TAL speech are processed with an excitation signal extracted from a stop region in natural speech. The same processing is used for silent regions.
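The region-by-region choice of natural-speech excitation described in the second and third stages can be sketched as a small rule table. This is a hypothetical illustration only: the names `Region` and `choose_excitations`, the two duration thresholds, the decision to keep a falling contour for sonorants after the first stressed syllable, and the mapping of both silence categories to one "silence" excitation are assumptions not stated in the text.

```python
from enum import Enum


class Region(Enum):
    SONORANT = "sonorant"
    STOP = "stop"
    FRICATIVE = "fricative"
    TRUE_SILENCE = "true_silence"
    SILENCE = "silence"  # no speech, but the TAL is on and radiating noise


# Hypothetical duration thresholds in seconds; the source gives no values.
STRESSED_MIN_S = 0.15
VERY_LONG_MIN_S = 0.40


def choose_excitations(regions):
    """Return one natural-speech excitation label per (Region, duration) pair.

    `regions` lists the landmark-delimited regions of one utterance in order.
    """
    labels = []
    stress_reached = False
    for region, duration in regions:
        if region is Region.STOP:
            labels.append("stop")
        elif region is Region.FRICATIVE:
            labels.append("fricative")
        elif region in (Region.SILENCE, Region.TRUE_SILENCE):
            # Assumption: both silence categories use a silent excerpt.
            labels.append("silence")
        else:
            # Sonorant: duration decides whether a stressed syllable is reached.
            if duration >= STRESSED_MIN_S:
                stress_reached = True
            if duration >= VERY_LONG_MIN_S:
                labels.append("rise-fall-rise")
            elif stress_reached:
                labels.append("falling")
            else:
                labels.append("rising")
    return labels
```

For example, a short utterance-initial sonorant would be labeled "rising", a later sonorant longer than the stress threshold "falling", and a very long sonorant "rise-fall-rise".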
A side benefit of using cepstral analysis to replace the excitation signal of TAL speech with an excitation signal from natural speech is that the radiated noise in the TAL speech is also removed.
Advantages and Improvements Over Existing Methods, Devices or Materials
Presently, some TAL devices allow users to change the rate at which the device vibrates by pushing one of two buttons on the device. That is, the user has a choice of two frequencies. In addition, the Cooper Rand devices allow users to turn a knob on the device to change the rate at which it vibrates. However, from our experience with TAL users and from our conversations with speech pathologists, most users do not use any of these options. This is probably the case for several reasons. First, normal speakers change their pitch naturally, without thinking about it, so it is probably too difficult for TAL users to stay conscious of changing their pitch. In addition, changing the rate at which these devices vibrate by using the thumb or some other finger requires too much dexterity.
As far as we know, no one has previously attempted to develop a separate device to introduce pitch variations in TAL speech.
Possible Variations and Modifications
We will continue to explore how to make the pitch changes even better. Presently, we are looking at different smoothing techniques to concatenate the different portions of the TAL speech to make sure we don't introduce any unwanted discontinuities.
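The text does not say which smoothing techniques are under consideration. Purely as an illustration of the general idea, the sketch below joins two processed segments with a linear crossfade over an overlap region; the function name `crossfade_concat` and the `overlap` parameter are hypothetical, and this is not claimed to be the technique the inventors chose.

```python
def crossfade_concat(left, right, overlap):
    """Join two sample lists, linearly crossfading over `overlap` samples.

    The last `overlap` samples of `left` are blended with the first
    `overlap` samples of `right`, so the result has
    len(left) + len(right) - overlap samples.
    """
    if overlap < 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both segments")
    if overlap == 0:
        return list(left) + list(right)
    head = list(left[:-overlap])
    tail = list(right[overlap:])
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming segment
        blended.append((1.0 - w) * left[len(left) - overlap + i] + w * right[i])
    return head + blended + tail
```

The weights ramp from the outgoing segment to the incoming one, which avoids an abrupt jump in amplitude at the join.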
Features Believed to be New
(1) The use of a landmark detection program to divide the speech signal into regions.
(2) The application of cepstral processing to this type of problem.
(3) The algorithm for generating the F0 contour (pitch variation).
Problem Solved
Improving the quality of TAL speech by (a) introducing realistic pitch variation and (b) removing the radiated background noise.
Possible Uses of Invention
The invention will be incorporated into a device that TAL users can buy to improve their speech when talking over the telephone or in some other electronically mediated situation.
Disadvantages Or Limitations
Presently, the invention can be used only to improve TAL speech in electronically mediated situations.
REFERENCES:
patent: 4338488 (1982-07-01), Lennox
patent: 5326349 (1994-07-01), Baraff
patent: 07 000433 (1995-06-01), None
Arslan, L.M., and Talkin, D., “Speaker Transformation Using Sentence HMM Based Alignments and Detailed Prosody Modification,” NY:IEEE, US, 23:289-292 (1998).
Qi, Yingyong, “Replacing Tracheoesophageal Voicing Sources Using LPC Synthesis,” The Journal of the Acoustical Society of America, 88(3):1228-1235 (1990).
Cole, David, et al., “Application of Noise Reduction Techniques for Alaryngeal Speech Enhancement,” Speech and Image Technologies for Computing and Telecommunications, 491-493 (1997).
Parsa, V. and Jamieson, D.G., “A Comparison of High Precision F0 Extraction Algorithms for Sustained Vowels,” Journal of Speech, Language, and Hearing Research, 42:112-126 (1999).
Shute, B., “Overcoming the Hurdle of Controlling Stoma Noise,” Advance for Speech-Language Pathologists & Audiologists, Feb. 24, 1997.
Shute, B., “Current Trends in Electronic Larynges,” Advance for Speech-Language Pathologists & Audiologists, Jul. 25, 1994.
Eady, S.J., “Differences in the F0 Patterns of Speech: Tone Language Versus Stress Language,” Language & Speech, 25:29-42 (1982).
Espy-Wilson, C.Y., et al., “Enhancement of Electrolaryngeal Speech by Adaptive Filtering,” Journal of Speech, Language, and Hearing Research, 41:1253-1264 (1998).
Oppenheim, A.V., “Speech Analysis-Synthesis Based on Homomorphic Filtering,” J. Acoust. Soc. of Amer., 45(2):458-465 (1969).
Hamilton Brook Smith & Reynolds P.C.
Harvey Minsun Oh
Trustees of Boston University