Computer graphics processing and selective visual display system – Animation processing method – Language driven animation

Type: Reexamination Certificate
Filed: 1997-10-02
Issued: 2001-10-23
Examiner: Luu, Matthew (Department: 2672)
U.S. Classification: C345S951000, C345S955000, C345S473000
Status: Active
Patent Number: 06307576
BACKGROUND OF THE INVENTION
1. Field of Invention
This invention relates generally to animation producing methods and apparatuses, and more particularly is directed to a method for automatically animating lip synchronization and facial expression for three dimensional characters.
2. Description of the Related Art
Various methods have been proposed for animating lip synchronization and facial expressions of animated characters in animated products such as movies, videos, cartoons, CD's, and the like. Prior methods in this area have long suffered from the lack of an economical means of animating lip synchronization and character expression in the production of animated products, due to the extremely laborious and lengthy protocols of such prior traditional and computer animation techniques. These shortcomings have significantly limited all prior lip synchronization and facial expression methods and apparatuses used for the production of animated products. Indeed, the limitations of cost, the time required to produce an adequate lip synchronization or facial expression in an animated product, and the inherent inability of prior methods and apparatuses to satisfactorily provide lip synchronization or express character feelings and emotion, leave a significant gap in the potential of animation methods and apparatuses in the current state of the art.
A time aligned phonetic transcription (TAPT) is a phonetic transcription of a recorded text or soundtrack in which the occurrence in time of each phoneme is also recorded. A “phoneme” is defined as the smallest unit of speech, and corresponds to a single sound. There are several standard phonetic “alphabets,” such as the International Phonetic Alphabet, and TIMIT, created by Texas Instruments, Inc. and MIT. Such transcriptions can be created by hand, as they currently are in the traditional animation industry, where they are called “x” sheets or “gray sheets” in the trade. Alternatively, such transcriptions can be created by automatic speech recognition programs, or the like.
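For illustration only, a time aligned phonetic transcription can be thought of as a list of phoneme labels, each paired with its start and end time in the soundtrack. The following sketch is a hypothetical representation; the field names and times are illustrative and do not come from any particular transcription format:

```python
from dataclasses import dataclass

@dataclass
class TimedPhoneme:
    """One entry of a time aligned phonetic transcription (TAPT)."""
    label: str    # phoneme symbol, e.g. from the TIMIT alphabet
    start: float  # onset time in seconds within the soundtrack
    end: float    # offset time in seconds

# Hypothetical transcription of the word "hello" (times are illustrative only).
transcription = [
    TimedPhoneme("hh", 0.00, 0.06),
    TimedPhoneme("eh", 0.06, 0.14),
    TimedPhoneme("l",  0.14, 0.22),
    TimedPhoneme("ow", 0.22, 0.38),
]
```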
The current practice for three dimensional computer generated speech animation is by manual techniques, commonly using a “morph target” approach. In this practice, a reference model of a neutral mouth position is used, along with several other mouth positions, each corresponding to a different phoneme or set of phonemes. These models are called “morph targets”. Each morph target has the same topology as the neutral model and the same number of vertices, and each vertex on each model logically corresponds to a vertex on each other model. For example, vertex #n on all models represents the left corner of the mouth; although this is the typical case, such rigid correspondence may not be necessary.
The deltas of each vertex on each morph target relative to the neutral are computed as a vector from each vertex n on the reference to each vertex n on each morph target. These are called the delta sets. There is one delta set for each morph target.
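As a rough sketch of this computation (assuming each model is stored as a (V, 3) array of vertex positions with identical vertex ordering, as the morph target convention above requires; the function name and array layout are illustrative):

```python
import numpy as np

def compute_delta_sets(neutral, morph_targets):
    """Compute one delta set per morph target.

    neutral       -- (V, 3) array of vertex positions for the neutral model
    morph_targets -- list of (V, 3) arrays, one per morph target, with the
                     same vertex ordering as the neutral model
    Returns a list of (V, 3) arrays, each holding the per-vertex vectors
    from the neutral model to the corresponding morph target.
    """
    delta_sets = []
    for target in morph_targets:
        assert target.shape == neutral.shape, "morph target must share the neutral topology"
        delta_sets.append(target - neutral)
    return delta_sets
```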
In producing animation products, a value, usually from 0 to 1, is assigned to each delta set by the animator; this value is called the “morph weight”. From these morph weights, the neutral's geometry is modified as follows: each vertex on the neutral has the corresponding delta set's vector, multiplied by the scalar morph weight, added to it. This is repeated for each morph target, and the results are summed. For each vertex v in the neutral model:
|result| = |neutral| + Σ (x = 1 to n) |delta set x| * morph weight x

where the notation |xxx| indicates the corresponding vector in each referenced set. For example, |result| is the resultant vertex corresponding to vertex v in the neutral model |neutral|, and |delta set x| is the corresponding vector for delta set x.
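In code form, this weighted sum might look like the following sketch. It is an illustration of the formula above applied to whole vertex arrays, not an excerpt from any patented implementation; the (V, 3) numpy layout and function name are assumptions:

```python
import numpy as np

def apply_morph_weights(neutral, delta_sets, morph_weights):
    """Deform the neutral geometry by the weighted sum of delta sets.

    Implements: result = neutral + sum over x of (delta_set_x * morph_weight_x)
    """
    result = neutral.astype(float)          # copy so the neutral model is untouched
    for delta_set, weight in zip(delta_sets, morph_weights):
        result += weight * delta_set        # add the scaled delta set per vertex
    return result
```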
If the morph weight of the delta set corresponding to the morph target of the character saying, for example, the “oh” sound is set to 1, and all others are set to 0, the neutral is modified to look like the “oh” target. If the situation were the same, except that the “oh” morph weight was 0.5, the neutral's geometry would be modified half way between the neutral and the “oh” morph target.
Similarly, if the situation were as described above, except that the “oh” weight was 0.3 and the “ee” morph weight was 0.7, the neutral geometry would be modified to have some of the “oh” model characteristics and more of the “ee” model characteristics. There are also prior blending methods, including averaging the delta sets according to their weights.
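As a concrete, purely hypothetical illustration of that 0.3/0.7 blend for a single vertex (the delta values below are invented for the example):

```python
import numpy as np

neutral_vertex = np.array([1.0, 0.0, 0.0])
delta_oh = np.array([0.2, -0.1, 0.0])   # illustrative delta toward the "oh" target
delta_ee = np.array([-0.1, 0.3, 0.0])   # illustrative delta toward the "ee" target

result = neutral_vertex + 0.3 * delta_oh + 0.7 * delta_ee
# result ≈ [0.99, 0.18, 0.0]: mostly "ee"-shaped, with a little "oh"
```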
Accordingly, to animate speech, the artist needs to set all of these weights at each frame to an appropriate value. Usually this is assisted by using a “keyframe” approach, where the artist sets the appropriate weights at certain important times (“keyframes”) and a program interpolates each of the channels at each frame. Such a keyframe approach is very tedious and time consuming, as well as inaccurate due to the large number of keyframes necessary to depict speech.
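The keyframe workflow described above amounts to piecewise interpolation of each morph weight channel between animator-set values. A minimal sketch, assuming linear interpolation and hypothetical keyframe values:

```python
import numpy as np

# Hypothetical keyframes for one morph weight channel: (frame, weight) pairs
# that an animator might set by hand.
keyframes = [(0, 0.0), (6, 1.0), (12, 0.0)]

def channel_value(frame, keyframes):
    """Linearly interpolate one morph weight channel at the given frame."""
    frames, weights = zip(*keyframes)
    return float(np.interp(frame, frames, weights))

# Per-frame weights for frames 0..12 of this channel.
per_frame_weights = [channel_value(f, keyframes) for f in range(13)]
```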
The present invention overcomes many of the deficiencies of the prior art and obtains its objectives by providing an integrated method embodied in computer software for use with a computer for the rapid, efficient lip synchronization and manipulation of character facial expressions, thereby allowing for rapid, creative, and expressive animation products to be produced in a very cost effective manner.
Accordingly, it is the primary object of this invention to provide a method for automatically animating lip synchronization and facial expression of three dimensional characters, which is integrated with computer means for producing accurate and realistic lip synchronization and facial expressions in animated characters. The method of the present invention further provides an extremely rapid and cost effective means to automatically create lip synchronization and facial expression in three dimensional animated characters.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
SUMMARY OF THE INVENTION
To achieve the foregoing objects, and in accordance with the purpose of the invention as embodied and broadly described herein, a method is provided for controlling and automatically animating lip synchronization and facial expressions of three dimensional animated characters using weighted morph targets and time aligned phonetic transcriptions of recorded text, and other time aligned data. The method utilizes a set of rules that determine the system's output, comprising a stream or streams of morph weight sets, when a sequence of timed phonemes or other timed data is encountered. Other timed data, such as pitch, amplitude, noise amounts, or emotional state data or emotemes such as “surprise”, “disgust”, “embarrassment”, “timid smile”, or the like, may be input to affect the output stream of morph weight sets.
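The general idea of rule-driven output can be sketched as a lookup from each timed phoneme to a morph weight set, emitted as a stream. The table and function below are hypothetical and greatly simplified relative to the rule sets the invention contemplates; they ignore transitions, emotional state, and other timed data, and all names are assumptions made for illustration:

```python
# Hypothetical correspondence rules: phoneme label -> morph weight set
# (one weight per morph target; here just "oh" and "ee").
DEFAULT_RULES = {
    "ow":  {"oh": 1.0, "ee": 0.0},
    "iy":  {"oh": 0.0, "ee": 1.0},
    "sil": {"oh": 0.0, "ee": 0.0},   # silence maps back to the neutral pose
}

def morph_weight_stream(timed_phonemes, rules=DEFAULT_RULES):
    """Yield (time, morph weight set) pairs for a timed phoneme sequence.

    timed_phonemes -- iterable of (label, start_time, end_time) tuples
    """
    for label, start, end in timed_phonemes:
        weights = rules.get(label, {"oh": 0.0, "ee": 0.0})
        yield (start, weights)

# Example: stream = list(morph_weight_stream([("ow", 0.22, 0.38)]))
```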
The methodology herein described allows for automatically animating lip synchronization and facial expression of three dimensional characters in the creation of a wide variety of animation products, including but not limited to movies, videos, cartoons, CD's, software, and the like. The method and apparatuses herein described are operably integrated with computer software and hardware.
In accordance with the present invention there also is provided a method for automatically animating lip synchronization and facial expression of three dimensional characters for films, videos, cartoons, and other animation products, comprising configuring a set of default correspondence rules between a plurality of visual phoneme groups and a plurality of morph weight sets …