Method for the automatic computerized audio visual dubbing...

Computer graphics processing and selective visual display system – Computer graphics processing – Animation

Reexamination Certificate


Details

Classification: C345S949000, C345S957000, C345S956000, C707S793000
Type: Reexamination Certificate
Status: active
Patent number: 06492990

ABSTRACT:

Portions of this application are contained on compact discs, the contents of which are incorporated herein by reference in their entirety. The compact discs are labeled Copy 1 and Copy 2, respectively. The discs are identical, and each includes the following ASCII files:
File Name     File Size    Creation Date
dub.c         27937        Nov. 4, 2001, 3:47 pm
general.c     26904        Nov. 4, 2001, 3:47 pm
gimg.h        2518         Nov. 4, 2001, 3:47 pm
io.c          1299         Nov. 4, 2001, 3:47 pm
list.h        1941         Nov. 4, 2001, 3:47 pm
mv.h          4470         Nov. 4, 2001, 3:47 pm
texture.c     2984         Nov. 4, 2001, 3:47 pm
FIELD OF THE INVENTION
The present invention relates to a method for automatic audio visual dubbing. More specifically, the invention relates to an efficient, computerized, automatic method for the audio visual dubbing of movies by computerized image copying of the characteristic features of the dubber's lip movements onto the mouth area of the original speaker. The present invention uses a method of vicinity-searching, three-dimensional head modeling of the original speaker, and texture-mapping techniques to produce the new images that correspond to the dubbed sound track.
The invention overcomes the well-known problem of poor correlation between the lip movements in the original movie and the sound track of the dubbed movie.
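The vicinity-searching step mentioned above can be pictured as a local block search: the mouth region found in the previous frame is looked for only within a small neighborhood of its last known position in the current frame. The C sketch below is purely illustrative and is not taken from the patent's appendix files (dub.c, general.c, texture.c, etc.); the grayscale row-major image layout, the sum-of-absolute-differences score, and all identifiers are assumptions made for this example. A correlation-based score, in the sense defined in the next section, could be used in place of the SAD score.

/* Illustrative vicinity search (not the patent's code): given the mouth
 * location found in the previous frame, look for the best-matching block in
 * a small neighborhood of the current frame.  Images are assumed to be 8-bit
 * grayscale, stored row-major. */
#include <stdlib.h>
#include <limits.h>

typedef struct {
    int width, height;
    const unsigned char *pixels;   /* width * height gray values, row-major */
} Image;

/* Sum of absolute differences between the (bw x bh) block of `cur` at
 * (cx, cy) and the reference block of `prev` at (px, py).  Lower is better. */
static long block_sad(const Image *prev, int px, int py,
                      const Image *cur,  int cx, int cy, int bw, int bh)
{
    long sad = 0;
    for (int y = 0; y < bh; y++)
        for (int x = 0; x < bw; x++) {
            int a = prev->pixels[(py + y) * prev->width + (px + x)];
            int b = cur->pixels[(cy + y) * cur->width + (cx + x)];
            sad += abs(a - b);
        }
    return sad;
}

/* Search a +/-`radius` pixel vicinity of (*x, *y) in the current frame for
 * the block that best matches the previous frame's mouth block, and update
 * (*x, *y) with the winning position. */
void vicinity_search(const Image *prev, const Image *cur,
                     int *x, int *y, int bw, int bh, int radius)
{
    long best = LONG_MAX;
    int best_x = *x, best_y = *y;

    for (int dy = -radius; dy <= radius; dy++)
        for (int dx = -radius; dx <= radius; dx++) {
            int cx = *x + dx, cy = *y + dy;
            if (cx < 0 || cy < 0 ||
                cx + bw > cur->width || cy + bh > cur->height)
                continue;           /* candidate block would leave the frame */
            long score = block_sad(prev, *x, *y, cur, cx, cy, bw, bh);
            if (score < best) {
                best = score;
                best_x = cx;
                best_y = cy;
            }
        }
    *x = best_x;
    *y = best_y;
}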
DEFINITIONS OF TERMS RELATED TO THE INVENTION
First, definitions are provided for important terms employed in this specification.
Actor (the original actor)—An actor, speaker, singer, animated character, animal, or object in a movie, or a subject in a still photograph.
Audio Visual Dubbing—Manipulating, in one or more frames, the mouth area of the actor so that its status is as similar as possible to that of the dubber in the reference frame.
Correlation Function—A function describing the similarity of two image regions; the higher the correlation, the better the match (a minimal illustrative sketch is given after these definitions).
Dubber—The person or persons who speak/narrate/sing/interpret the target text. The dubber can be the same person as the actor.
Dubbing—Replacing part or all of one or more of a movie's original sound tracks, containing the original text or sounds (including the case of the silent track of a still photograph), with another sound track containing the target text and/or sound.
Edge Detector—A known image processing technique used to extract boundaries between image regions which differ in intensity and/or color.
Face Parametrization—A method that numerically describes the structure, location, and expression of the face.
Head Model—A three-dimensional wire-frame model of the face, controlled by numerous parameters that describe the exact expression produced by the model (e.g., smile, mouth width, jaw opening).
Movie (the original movie)—Any motion picture (e.g., cinematic feature film, advertisement, video, animated cartoon, still video picture, etc.): a sequence of consecutive pictures (also called frames) photographed in succession by a camera or created by an animator. In the case of a movie that is a still photograph, all of the consecutive pictures are identical to each other. When shown in rapid succession, an illusion of natural motion is obtained, except in the case of still pictures. A sound track is associated with most movies; it contains speech, music, and/or sounds and is synchronized with the pictures, and in particular the speech is synchronized with the lip movements of the actors in the pictures. Movies are realized using several techniques. Common methods are: (a) recording on film, (b) recording in analog electronic form (“video”), (c) recording in digital electronic form, (d) recording on chips, magnetic tape, magnetic disks, or optical disks, and (e) reading/writing by magnetic and/or optical laser devices. Finally, in this context, an “original movie” also includes an audio-visual movie already altered by the present invention, which serves as a base for further alterations.
Original Text—A text spoken or sung by the actor when the movie is made, and recorded on its sound track. The text may be narrated in the background without showing the speaker, or by showing a still photograph of the speaker.
Pixel—Picture element. A digital picture is composed of an array of points, called pixels. Each pixel encodes the numerical values of the intensity and of the color at the corresponding picture point.
Reference Similarity Frame—A picture (being a frame in the original movie, a frame in any other movie, or a still photograph) in which the original actor has the desired features of the mouth-shape and head posture suitable for the audio visually dubbed movie.
Target Text—A new vocal text to replace the original vocal text of the actor. The target text may also be text assigned to an actor who was silent in the original movie. The new text can be in another language, the case commonly referred to as DUBBING; however, this invention also covers replacement of text without changing the language, with the original actor or with a dubber in that same language. The target text may have the same meaning as the original text, but it may also have a modified, opposite, or completely different meaning. According to one of many applications of the present invention, the latter is employed for the creation of new movies with the same actor, without his/her/its active participation. Also included is new vocal text used to replace the null vocal text attached to one or more still photographs.
Texture Mapping—A well-known computer graphics technique that maps texture onto a three-dimensional wire-frame model (a simplified sketch is given after these definitions).
Two-Dimensional Projection—The result of the rendering of the three-dimensional face model onto a two-dimensional device like a monitor, a screen, or photographic film.
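As an illustration of the Correlation Function defined above, the following minimal C sketch computes the normalized cross-correlation of two equally sized grayscale regions. The function name, the flattened-array representation, and the handling of flat regions are assumptions made for the example and are not taken from the patent.

/* Illustrative correlation function (a minimal sketch, not the patent's
 * implementation): normalized cross-correlation of two equally sized
 * grayscale regions, each flattened to n pixels.  Returns a value in
 * [-1, 1]; higher means a better match. */
#include <math.h>

double region_correlation(const unsigned char *a, const unsigned char *b, int n)
{
    if (n <= 0)
        return 0.0;

    double mean_a = 0.0, mean_b = 0.0;
    for (int i = 0; i < n; i++) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= n;
    mean_b /= n;

    double num = 0.0, var_a = 0.0, var_b = 0.0;
    for (int i = 0; i < n; i++) {
        double da = a[i] - mean_a;
        double db = b[i] - mean_b;
        num   += da * db;
        var_a += da * da;
        var_b += db * db;
    }
    if (var_a == 0.0 || var_b == 0.0)
        return 0.0;                 /* flat region: correlation undefined */
    return num / sqrt(var_a * var_b);
}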
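Similarly, the Texture Mapping and Two-Dimensional Projection terms can be illustrated by filling a single projected triangle of a wire-frame model from a source texture using barycentric coordinates. This simplified sketch assumes 8-bit grayscale, row-major images and nearest-neighbor sampling; it is not the patent's texture.c.

/* Illustrative texture mapping (a simplified sketch, not the patent's code):
 * fill one triangle of the two-dimensional projection of a wire-frame model
 * by sampling a source texture with barycentric coordinates. */
#include <math.h>

typedef struct { double x, y; } Vec2;

typedef struct {
    int width, height;
    unsigned char *pixels;          /* 8-bit gray values, row-major */
} Gray;

/* Twice the signed area of triangle (a, b, p). */
static double edge(Vec2 a, Vec2 b, Vec2 p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

/* dst_tri: triangle vertices in the projected (screen) image.
 * src_tri: the same vertices' positions in the texture image. */
void texture_map_triangle(Gray *dst, const Vec2 dst_tri[3],
                          const Gray *src, const Vec2 src_tri[3])
{
    double area = edge(dst_tri[0], dst_tri[1], dst_tri[2]);
    if (area == 0.0)
        return;                     /* degenerate triangle */

    /* bounding box of the destination triangle, clipped to the image */
    int x0 = (int)floor(fmin(fmin(dst_tri[0].x, dst_tri[1].x), dst_tri[2].x));
    int x1 = (int)ceil (fmax(fmax(dst_tri[0].x, dst_tri[1].x), dst_tri[2].x));
    int y0 = (int)floor(fmin(fmin(dst_tri[0].y, dst_tri[1].y), dst_tri[2].y));
    int y1 = (int)ceil (fmax(fmax(dst_tri[0].y, dst_tri[1].y), dst_tri[2].y));
    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > dst->width  - 1) x1 = dst->width  - 1;
    if (y1 > dst->height - 1) y1 = dst->height - 1;

    for (int y = y0; y <= y1; y++)
        for (int x = x0; x <= x1; x++) {
            Vec2 p = { x + 0.5, y + 0.5 };
            /* barycentric coordinates of p in the destination triangle */
            double w0 = edge(dst_tri[1], dst_tri[2], p) / area;
            double w1 = edge(dst_tri[2], dst_tri[0], p) / area;
            double w2 = 1.0 - w0 - w1;
            if (w0 < 0 || w1 < 0 || w2 < 0)
                continue;           /* pixel lies outside the triangle */

            /* the same barycentric weights pick the source texel */
            double sx = w0 * src_tri[0].x + w1 * src_tri[1].x + w2 * src_tri[2].x;
            double sy = w0 * src_tri[0].y + w1 * src_tri[1].y + w2 * src_tri[2].y;
            int tx = (int)sx, ty = (int)sy;
            if (tx < 0 || ty < 0 || tx >= src->width || ty >= src->height)
                continue;
            dst->pixels[y * dst->width + x] = src->pixels[ty * src->width + tx];
        }
}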
BACKGROUND OF THE INVENTION
Movies are often played to audiences that are not familiar with the original language and thus cannot understand the sound track. Two common approaches exist to solve this problem. In one approach, sub-titles with typed text in the desired language are added to the pictures, and the viewers are expected to hear the text in a foreign language while simultaneously reading its translation on the picture itself. Such reading distracts the viewers from the pictures and from the movie in general. The other approach is dubbing, where the original sound track with the original text is replaced by another sound track in the desired language. In this case there is a disturbing mismatch between the sound track and the movements of the mouth.
There have been some earlier attempts to overcome these disadvantages, none of which has been commercialized because of inherent difficulties that made practical execution unrealistic. Thus, U.S. Pat. No. 4,600,281 describes a method in which the shape of the mouth is measured manually with a ruler or a cursor, and the mouth shape is corrected by moving pixels within each frame. As will be seen in the description of the invention, the method according to the present invention is inherently different and much superior in the following respects: In the present invention the tracking of the shape of the mouth is done automatically, not manually. In the present invention changing the shape of the mouth is done by using a three-dimensional head model, for example like those described by P. Ekman and W. V. Friesen (Manual for the Facial Action Unit System, Consulting Psychologists Press, Palo Alto, 1977). In the present invention the mouth area of the actor is replaced using the mouth area of a reference similarity frame. In the present invention mouth status parameters of the dubber are substituted for mouth status parameters of the actor.
U.S. Pat. No. 4,260,229 relates to a method of graphically creating lip images. That patent is entirely different from the present invention: in it, speech sounds are analyzed and digitally encoded, whereas in the present invention no sound analysis is done, nor is any required.
To make for better viewing of the audio visually dubbed movie, the present invention p
