Patent
1994-06-10
1997-08-12
MacDonald, Allen R.
395 279, G10L 300
active
056574266
ABSTRACT:
A method and apparatus provide a video image of facial features synchronized with synthetic speech. Text input is transformed into a string of phonemes and timing data, which are transmitted to an image generation unit. At the same time, a string of synthetic speech samples is transmitted to an audio server. The audio server produces signals for an audio speaker, causing the audio signals to be continuously audibilized; additionally, the audio server initializes a timer. The image generation unit reads the elapsed time from the timer and, by consulting the phoneme and timing data, determines the position of the phoneme currently being audibilized. The image generation unit then calculates the facial configuration corresponding to that position in the string of phonemes and causes the facial configuration to be displayed on a video device.
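The timer-driven lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the phoneme list, the durations, the mouth-shape labels, and the function names (end_times, phoneme_index_at, face_at) are all hypothetical, and the elapsed time is passed in as a plain number rather than read from an audio server.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical phoneme string with per-phoneme durations in milliseconds.
PHONEMES = [("h", 60), ("eh", 90), ("l", 70), ("ow", 140)]

# Hypothetical mapping from phoneme to a mouth-shape label
# (a stand-in for a full facial configuration).
MOUTH_SHAPES = {
    "h": "open-slight",
    "eh": "open-mid",
    "l": "tongue-up",
    "ow": "rounded",
}


def end_times(phonemes):
    """Return the cumulative end time (ms) of each phoneme."""
    return list(accumulate(duration for _, duration in phonemes))


def phoneme_index_at(elapsed_ms, ends):
    """Return the index of the phoneme playing at elapsed_ms.

    Phoneme i covers the half-open interval [start_i, end_i).
    Returns None before playback starts or after the last phoneme ends.
    """
    if elapsed_ms < 0:
        return None
    index = bisect_right(ends, elapsed_ms)
    return index if index < len(ends) else None


def face_at(elapsed_ms, phonemes=PHONEMES):
    """Return the mouth-shape label to display at elapsed_ms."""
    index = phoneme_index_at(elapsed_ms, end_times(phonemes))
    if index is None:
        return "neutral"
    return MOUTH_SHAPES[phonemes[index][0]]


if __name__ == "__main__":
    for t in (0, 100, 219, 220, 360):
        print(t, face_at(t))
```

Because the face is chosen from the audio clock rather than from a frame counter, a renderer that falls behind simply skips ahead to the phoneme currently being played, which is the synchronization property the abstract describes.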
REFERENCES:
Morishima S, Aizawa K, Harashima H; An Intelligent Facial Image Coding Driven by Speech and Phoneme; ICASSP '89 Feb. 1989.
Waters K; A Muscle Model for Animating Three Dimensional Facial Expression; ACM Computer Graphics vol. 21 No. 4 Apr. 1987.
K. Aizawa, H. Harashima, and T. Saito, "Model-Based Synthesis Image Coding (MBASIC) System for a Person's Face," In Signal Processing Image Communication, vol. 1, pp. 139-152, 1989.
I. Carlbom, W. Hsu, G. Klinker, R. Szeliski, K. Waters, M. Doyle, J. Gettys, K. Harris, T. Levergood, R. Palmer, M. Picart, D. Terzopoulos, D. Tonnesen, M. Vannier, and G. Wallace, "Modeling and Analysis of Empirical Data in Collaborative Environments," Communications of the ACM (CACM), 35(6):74-84, Jun. 1992.
H. Choi, S. Harashima, "Analysis and Synthesis of Facial Expression in Knowledge-Based Coding of Facial Image Sequences," In International Conference on Acoustics and Signal Processing, pp. 2737-2740, 1991.
N. Duffy, "Animation Using Image Samples," Processing Images of Faces, Ablex, New Jersey, pp. 179-201, 1992.
L. Hight, "Lip-Reader Trainer: A Computer Program for the Hearing Impaired," Proc. of the Johns Hopkins First National Search for Applications of Personal Computing to Aid the Handicapped, pp. 4-5, 1981.
J. Lewis and F. Parke, "Automatic Lip-Synch and Speech Synthesis for Character Animation," In CHI+CG '87, pp. 143-147, Toronto, 1987.
J. Moore and V. O'Connor, "Towards an Integrated Computer Package for Speech Therapy Training," Microtech Report, Bradford College of Art, 1986.
M. Oka, K. Tsutsui, A. Ohba, Y. Kurauchi, and T. Tago, "Real-Time Manipulation of Texture-Mapped Surface," Computer Graphics, 21(4):181-188, 1987.
F. Parke, "A Model of the Face that Allows Synchronized Speech," Journal of Computers and Graphics, 1(2):1-4, 1975.
F. Parke, "Parameterized Models for Facial Animation," IEEE Computer Graphics and Applications, 2(9):61-68, 1982.
"Expression control using synthetic speech." --Wyvill, et al, Department of Computer Science, University of Calgary, Calgary, Alberta, Canada, T2N 1N4, ACM Siggraph '89 Course Notes, State Of The Art In Facial Animation, 16th Annual Conf. On Computer Graphics And Interactive Techniques, Boston, Massachusetts 31 Jul.-4 Aug. 1989, pp. 163-175.
"Animating speech: an automated approach using speech synthesised by rules" --Hill, et al, The Visual Computer (1988) ACM Siggraph '89 Course Notes, State Of The Art In Facial Animation, 16th Annual Conf. On Computer Graphics And Interactive Techniques, Boston, Massachusetts 31 Jul.-4 Aug. 1989, pp. 176-188.
Levergood Thomas M.
Waters Keith
Digital Equipment Corporation
Hudgens Ronald C.
MacDonald Allen R.
Satow Clayton L.
Sax Robert
Method and apparatus for producing audio-visual synthetic speech