Image synthesis

Classification: Computer graphics processing and selective visual display system – Computer graphics processing – Animation (C345S474000)
Type: Reexamination Certificate (active)
Patent number: 06208356

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to synthesis of moving images, for example to accompany synthetic speech.
2. Related Art
In prior art such as European Patent Application 0,225,729, visual images of the face of a speaker are processed during a learning sequence to extract a still frame of the image and a set of typical mouth shapes. Encoding of a sequence to be transmitted, recorded, etc. is then achieved by matching the changing mouth shapes to those of the set and generating codewords identifying them. Alternatively, the codewords may be generated to accompany real or synthetic speech using a look-up table relating speech parameters to codewords. In a receiver, the still frame and set of mouth shapes are stored, and received codewords are used to select successive mouth shapes to be incorporated in the still frame. However, such prior art leaves room for improvement in transitioning between sounds from two specific groups of phonemes.
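The receiver side of this codeword scheme can be sketched as follows. The dictionary frame representation and the `compose` helper are illustrative assumptions for the sketch, not details from the cited application.

```python
def compose(still_frame, mouth):
    # Hypothetical compositing step: overwrite the mouth region of
    # the stored still frame with the selected mouth shape.
    frame = dict(still_frame)
    frame["mouth"] = mouth
    return frame

def decode_frames(still_frame, mouth_shapes, codewords):
    """Prior-art receiver: each received codeword indexes the stored
    set of typical mouth shapes, and the selected shape is
    incorporated into the stored still frame."""
    return [compose(still_frame, mouth_shapes[c]) for c in codewords]
```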
SUMMARY OF THE INVENTION
Hitherto, the synthesis of an image of a face to accompany an utterance has relied on the selection of facial images corresponding to the phonemes in the utterance; intervening images are provided by interpolation between those facial images. One example of such an image synthesiser is disclosed in a paper by Shigeo Morishima et al. entitled ‘A Facial Motion Synthesis for Intelligent Man-Machine Interface’, Systems and Computers in Japan, vol. 22, no. 5, 1991, pp. 50-59. Another example is disclosed in U.S. Pat. No. 5,313,522.
According to the present invention there is provided a method of generating signals representing a moving picture of a face having visible articulation matching a spoken utterance, comprising: receiving a sequence of phonetic representations corresponding to successive portions of the utterance; identifying a mouth shape for each phonetic representation of a first type; identifying a mouth shape for each transition from a phonetic representation of the first type to a phonetic representation of a second type, for each transition from a phonetic representation of the second type to a phonetic representation of the first type and for each transition from a phonetic representation of the second type to a phonetic representation of the second type; and generating a sequence of image frames including the identified shapes.
The first and second types may be vowels and consonants respectively; thus, a preferred embodiment of the invention provides a method of generating signals representing a moving picture of a face having visible articulation matching a spoken utterance, comprising: receiving a sequence of phonetic representations corresponding to successive phonemes of the utterance; identifying a mouth shape for each vowel phoneme; identifying a mouth shape for each transition from a vowel phoneme to a consonant phoneme, for each transition from a consonant phoneme to a vowel phoneme and for each transition from a consonant phoneme to a consonant phoneme; and generating a sequence of image frames including the identified shapes.
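As a rough sketch of this selection rule (the phoneme labels and the vowel set are illustrative assumptions; the patent does not prescribe any particular data representation):

```python
VOWELS = {"a", "e", "i", "o", "u"}  # illustrative vowel set

def identify_mouth_shapes(phonemes):
    """Identify one mouth shape per vowel phoneme and one per
    vowel-consonant, consonant-vowel and consonant-consonant
    transition, as in the preferred embodiment."""
    shapes = []
    for i, p in enumerate(phonemes):
        if p in VOWELS:
            shapes.append(("vowel", p))
        if i + 1 < len(phonemes):
            q = phonemes[i + 1]
            # Every transition involving a consonant gets a shape;
            # vowel-to-vowel motion is handled by the vowel shapes
            # themselves and the interpolation between them.
            if p not in VOWELS or q not in VOWELS:
                shapes.append(("transition", p, q))
    return shapes
```

For the word "cat" rendered as the sequence k-a-t, this yields a shape for the k-to-a transition, a shape for the vowel a, and a shape for the a-to-t transition.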
The identification of a mouth shape for each transition between consonant and vowel phonemes may be performed as a function of the vowel phoneme and the consonant phoneme, whilst the identification of a mouth shape for each transition between two consonant phonemes may be performed as a function of the first of the two consonant phonemes and of the vowel phoneme which most closely follows or precedes it. Alternatively, the identification of a mouth shape for each transition between two consonant phonemes may be performed as a function of the first of the two consonant phonemes and of the vowel phoneme which most closely follows it or, in the absence thereof, that which precedes it.
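The choice of context vowel for a consonant-consonant transition under the second alternative above (nearest following vowel, else nearest preceding vowel) might be sketched as follows; the vowel set is again an illustrative assumption.

```python
VOWELS = {"a", "e", "i", "o", "u"}  # illustrative vowel set

def context_vowel(phonemes, i):
    """For the consonant at index i (the first of a
    consonant-consonant pair), return the vowel phoneme that most
    closely follows it or, in the absence of one, the vowel that
    most closely precedes it."""
    for q in phonemes[i + 1:]:          # scan forward first
        if q in VOWELS:
            return q
    for q in reversed(phonemes[:i]):    # then scan backward
        if q in VOWELS:
            return q
    return None  # no vowel anywhere in the sequence
```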
Preferably the identification for each transition is performed as a function of only those phonemes specified above in relation to those transitions. Alternatively, the identification could be performed as a function also of at least one other phoneme within the same word.
In a preferred arrangement, one may generate for each identified mouth shape a command specifying that shape and intermediate commands each of which specifies a shape intermediate the shapes specified by the preceding and following commands.
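The intermediate commands could, for instance, specify linearly interpolated shape parameters; the parameter-vector representation here is an assumption for illustration, as the text only says that each intermediate command specifies a shape between its neighbours.

```python
def intermediate_shapes(shape_a, shape_b, n):
    """Generate n shape-parameter vectors evenly spaced between two
    identified mouth shapes, one per intermediate command."""
    return [
        [a + (b - a) * k / (n + 1) for a, b in zip(shape_a, shape_b)]
        for k in range(1, n + 1)
    ]
```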
In another aspect of the invention there is provided an apparatus for generating signals representing a moving picture of a face having visible articulation matching a spoken utterance, comprising:
means arranged in operation to receive a sequence of phonetic representations corresponding to successive portions of the utterance and in response thereto to
identify a mouth shape for each phonetic representation of a first type and
identify a mouth shape for each transition from a phonetic representation of the first type to a phonetic representation of a second type, for each transition from a phonetic representation of the second type to a phonetic representation of the first type and for each transition from a phonetic representation of the second type to a phonetic representation of the second type;
and means for generating a sequence of image frames including the identified shapes.
One embodiment of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:


REFERENCES:
patent: 4913539 (1990-04-01), Lewis
patent: 5313522 (1994-05-01), Slager
patent: 5546518 (1996-08-01), Blossom et al.
patent: 5548693 (1996-08-01), Shinya
patent: 5568602 (1996-10-01), Callahan et al.
patent: 5577175 (1996-11-01), Naka et al.
patent: 689 362 (1995-12-01), None
patent: 2 231 246 (1990-11-01), None
Gasper, Elon, “Getting a Head with Hyperanimation”, Dr. Dobb's Journal of Software Tools, Jul. 1988, vol. 13, no. 7, pp. 18-34, ISSN 0888-3076.
Lewis, J. P. and Parke, F. I., “Automated Lip-Synch and Speech Synthesis for Character Animation”, SIGCHI Bulletin, special issue, 1989, pp. 143-147, ISSN 0736-6906.
Lobanov, B., “On the Acoustic Theory of Coarticulation and Reduction”, International Conference on Acoustics, Speech, and Signal Processing, 1982, vol. 2, pp. 915-918.
Pelachaud, C., Badler, N. I., and Steedman, M., “Linguistic Issues in Facial Animation”, in N. M. Thalmann and D. Thalmann (eds), Computer Animation '92, Tokyo, Springer-Verlag.
Welsh et al., “Model-Based Image Coding”, BT Technology Journal, vol. 8, no. 3, Jul. 1990, pp. 94-106.
Mahoney, “Facial Animation”, Computer Graphics World, Jan. 1995, pp. 60-62.
Luo et al., “A Novel Approach for Classifying Continuous Speech into Visible Mouth-Shape Related Classes”, 1994 IEEE, pp. I-465-I-468.
Morishima et al., “Facial Animation Synthesis for Human-Machine Communication System”, International Conference on Human-Computer Interaction, vol. 2, pp. 1085-1090.
Cohen et al., “Modeling Coarticulation in Synthetic Visual Speech”, in N. M. Thalmann and D. Thalmann (eds), Models and Techniques in Computer Animation '92, Tokyo, Springer-Verlag, pp. 139-157.
Morishima et al., “A Real-Time Facial Action Image Synthesis System Driven by Speech and Text”, SPIE, vol. 1360, Visual Communications and Image Processing '90, pp. 1151-1158.
Carraro et al., “A Telephonic Lip Reading Device for the Hearing Impaired”, Conference Colloquium “Biomedical Applications of Digital Signal Processing”, Digest No. 144, pp. 10/1-8.
Bothe et al., “Artificial Visual Speech Synchronized with a Speech Synthesis System”, Conference “Computers for Handicapped Persons”, 1994, pp. 32-37.
Montgomery et al., “The Use of Visible Lip Information in Automatic Speech Recognition”, Proceedings of EUSIPCO-86: Third European Signal Processing Conference, vol. 1, pp. 577-580.
Cohen et al., “Development and Experimentation with Synthetic Visible Speech”, Behaviour Research Methods, Instruments & Computers, vol. 26, no. 2, May 1994, pp. 260-265.
Page et al., “The Laureate Text-to-Speech System - Architecture and Applications”, BT Technology Journal, vol. 14, no. 1, Jan. 1996, pp. 57-67.
Morishima et al., “A Facial Motion Synthesis for Intelligent Man-Machine Interface”, Systems and Computers in Japan, vol. 22, no. 5, 1991, pp. 50-59.
