Method of modeling objects to synthesize three-dimensional,...

Computer graphics processing and selective visual display system – Computer graphics processing – Animation

Reexamination Certificate


Status: active

Patent number: 06504546

ABSTRACT:

TECHNICAL FIELD
The present invention relates to a method for modeling three-dimensional objects and, more particularly, to modeling three-dimensional objects using a data-driven approach with separate three-dimensional image planes so as to synthesize three-dimensional, photo-realistic animations.
BACKGROUND OF THE INVENTION
Animated characters, and in particular, talking heads, are playing an increasingly important role in computer interfaces. An animated talking head immediately attracts the attention of a user, can make a task more engaging, and can add entertainment value to an application. Seeing a face makes many people feel more comfortable interacting with a computer. With respect to learning tasks, several researchers have reported that animated characters can increase the attention span of the user, and hence improve learning results. When used as avatars, lively talking heads can make an encounter in a virtual world more engaging. In today's practice, such heads are usually either recorded video clips of real people or cartoon characters lip-synching synthesized text.
Even though a cartoon character or robot-like face may provide an acceptable video image, it has been found that people respond to nothing as strongly as a real face. For an educational program, for example, a real face is preferable. A cartoon face is associated with entertainment, not to be taken too seriously. An animated face of a competent teacher, on the other hand, can create an atmosphere conducive to learning and therefore increase the impact of such educational software.
Generating animated talking heads that look like real people is a very challenging task, and so far all synthesized heads are still far from reaching this goal. To be considered natural, a face has to be not just photo-realistic in appearance, but must also exhibit realistic head movements, emotional expressions, and proper plastic deformations of the lips synchronized with the speech. Humans are trained from birth to recognize faces and facial expressions and are therefore highly sensitive to the slightest imperfections in a talking face.
Many different systems exist in the prior art for modeling the human head, achieving various degrees of photo-realism and flexibility, but relatively few have demonstrated a complete talking head functionality. Most approaches use 3D meshes to model in fine detail the shape of the head. See, for example, an article entitled “Automatic 3D Cloning and Real-Time Animation of a Human Face”, by M. Escher et al., appearing in the Proceedings of Computer Animation, pp. 58-66, 1997. These models are created by using advanced 3D scanning techniques, such as a CyberWare range scanner, or are adapted from generic models using either optical flow constraints or facial feature labeling. Some of the models include information on how to move vertices according to physical properties of the skin and the underlying muscles. To obtain a natural appearance, these systems typically texture-map images of a person onto the 3D model. Yet, when plastic deformations occur, the texture images are distorted, resulting in visible artifacts. Another difficult problem is the modeling of hair and of such surface features as grooves and wrinkles. These are important for the appearance of a face, and yet are only marginally (if at all) modeled by most of the prior art systems. The incredible complexity of plastic deformations in talking faces makes precise modeling extremely difficult, and simplification of the models results in unnatural appearances and synthetic-looking faces.
An alternative approach to 3D modeling is based on morphing between 2D images. These techniques can produce photo-realistic images of new shapes by interpolating between two existing shapes. Morphing a face requires precise specification of the displacements of many points in order to guarantee that the results look like real faces. Most morphing techniques therefore rely on a manual specification of the morph parameters, as discussed in the article “View Morphing”, by S. M. Seitz et al., appearing in Proceedings of SIGGRAPH '96, pp. 21-30, July 1996. Others have proposed image analysis methods in which the morph parameters are determined automatically, based on optical flow. While this approach gives an acceptable solution to generating new views from a set of reference images, the proper reference images must still be found to initialize the process. Moreover, since the techniques are based on 2D images, the range of expressions and movements they can produce is rather limited.
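The interpolation at the heart of 2D morphing can be sketched as follows. This is an illustrative simplification, not the method of any cited system: the feature-point positions (the manually specified "morph parameters") and the pixel intensities are each blended linearly, whereas a real view-morphing system also warps each image toward the interpolated point positions before blending.

```python
import numpy as np

def naive_morph(img_a, img_b, pts_a, pts_b, alpha):
    """Blend two face images at parameter alpha in [0, 1]:
    linearly interpolate the corresponding feature-point positions
    and cross-dissolve the pixel intensities. The per-pixel warp
    used by real morphing systems is omitted for brevity."""
    pts = (1.0 - alpha) * pts_a + alpha * pts_b   # interpolated shape
    img = (1.0 - alpha) * img_a + alpha * img_b   # cross-dissolved appearance
    return img, pts

# Halfway (alpha = 0.5) between two tiny 2x2 "images" with one feature point each
a = np.zeros((2, 2))
b = np.ones((2, 2))
mid_img, mid_pts = naive_morph(a, b,
                               np.array([[0.0, 0.0]]),
                               np.array([[2.0, 4.0]]),
                               0.5)
```

Because the blend is purely linear, intermediate frames look like real faces only when the specified point correspondences are dense and precise, which is exactly the limitation the passage above describes.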
Recently, there has been a surge of interest in sample-based techniques (also referred to as data-driven) for synthesizing photo-realistic scenes. These techniques generally start by observing and collecting samples that are representative of a signal to be modeled. The samples are then parameterized so that they can be recalled at synthesis time. Typically, samples are processed as little as possible to avoid distortions. One of the early successful applications of this concept is QuickTime® VR, as discussed in the article “QuickTime® VR—An Image-Based Approach to Virtual Environment Navigation”, by E. L. Chen et al., appearing in Proceedings of SIGGRAPH '95, pp. 29-38, July 1995. The Chen et al. system allows panoramic viewing of scenes as well as examining objects from all angles. Samples are parameterized by the direction from which they were recorded and stored in a two-dimensional database.
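The "parameterize, store, recall" cycle described above can be sketched as a small direction-indexed store. All names here are illustrative assumptions, not structures from the patent or from the Chen et al. system: each sample is kept nearly unprocessed and keyed by the (pan, tilt) direction from which it was recorded, then recalled by nearest stored direction at synthesis time.

```python
import math

class DirectionIndexedSamples:
    """Minimal sketch of a sample-based store: samples are
    parameterized by recording direction (pan, tilt in degrees)
    and recalled by nearest-neighbour lookup."""

    def __init__(self):
        self._db = {}  # (pan, tilt) -> sample

    def add(self, pan, tilt, sample):
        self._db[(pan, tilt)] = sample

    def recall(self, pan, tilt):
        # return the sample whose recording direction is closest
        key = min(self._db, key=lambda d: math.hypot(d[0] - pan, d[1] - tilt))
        return self._db[key]

db = DirectionIndexedSamples()
db.add(0, 0, "front view")
db.add(90, 0, "side view")
nearest = db.recall(10, 5)  # closest stored direction is (0, 0)
```

Because the samples themselves are stored almost untouched, the recalled images keep their photo-realistic appearance; the modeling effort goes into the parameterization rather than into geometry.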
Recently, other researchers have explored ways of sampling both the texture and 3D geometry of faces, producing realistic animations of facial expressions. One example of such sampling is discussed in an article entitled “Synthesizing Realistic Facial Expressions from Photographs”, by F. Pighin et al., appearing in Proceedings of SIGGRAPH '98, pp. 75-84, July 1998. The Pighin et al. system uses multiple cameras or facial markers to derive the 3D geometry and texture of the face in each frame of a video sequence. However, deriving the exact geometry of such details as grooves, wrinkles, lips and tongue as they undergo plastic deformations proves a difficult task. Extensive manual measuring in the images is required, resulting in a labor-intensive capture process. Textures are processed extensively to match the underlying 3D model and may lose some of their natural appearance. None of these prior art systems has yet been demonstrated for speech reproduction.
A talking-head synthesis technique based on recorded samples that are selected automatically has been proposed in the article “Video Rewrite: Driving Visual Speech with Audio”, by C. Bregler et al., appearing in Proceedings of SIGGRAPH '97, pp. 353-360, July 1997. The Bregler et al. system can produce videos of real people uttering text they never actually said. It uses video snippets of tri-phones (three sequential phonemes) as samples. Since these video snippets are parameterized by the phoneme sequence, the resulting database is very large. Moreover, this parameterization can only be applied to the mouth area, precluding the use of other facial parts, such as the eyes and eyebrows, which are known to carry important conversational cues.
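The size problem noted above follows directly from the tri-phone parameterization. A hypothetical sketch of such an index (the function and variable names are assumptions for illustration, not code from Video Rewrite):

```python
from collections import defaultdict

def triphone_keys(phonemes):
    """Overlapping three-phoneme windows: the keys under which a
    Video Rewrite-style system files its mouth video snippets."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]

# With roughly 40 English phonemes there are on the order of
# 40**3 = 64,000 possible tri-phone keys, and a separate video
# snippet must be stored for each key that occurs -- hence the
# very large database the passage above describes.
index = defaultdict(list)
index[("h", "eh", "l")].append("snippet_0042")

keys = triphone_keys(["h", "eh", "l", "ow"])
```

The keys also say nothing about eyes or eyebrows, which is why the parameterization is confined to the mouth area.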
T. Ezzat et al., in an article entitled “MikeTalk: A Talking Facial Display Based on Morphing Visemes”, appearing in the Proceedings of Computer Animation, pp. 96-102, June 1998, describe a sample-based talking head system that uses morphing to generate intermediate appearances of mouth shapes from a very small set of manually selected mouth samples. While morphing generates smooth transitions between mouth samples, this system does not model the whole head and does not synthesize head movements and facial expressions. Others have presented a sample-based talking head that uses several layers of 2D bit-planes as a model. Neither facial parts nor the whole head are modeled in 3D and, therefore, the system is limited in what new expressions and movements it can synthesize.
Thus, a need remains in the art for a method of modeling three-dimensional objects in general and, more particularly, for a method of modeling talking heads that is capable of synthesizing three-dimensional, photo-realistic animations.
