Reexamination Certificate
2002-04-23
2004-08-03
Ramakrishnaiah, Melur (Department: 2643)
Television
Two-way video and voice communication
User positioning
C348S014080, C382S118000
active
06771303
ABSTRACT:
TECHNICAL FIELD
This invention relates to video conferencing, and more particularly, to correcting for eye-gaze between each viewer and the corresponding image or images of persons being viewed.
BACKGROUND
A primary concern in video teleconferencing is the lack of eye contact between conferees. Eye contact is not possible with common terminal configurations because the camera is placed at the perimeter of the display that shows the distant conferee, so that the camera does not interfere with the local conferee's view of the display. In a typical desktop video-teleconferencing setup, the camera and the display screen cannot be physically aligned. In other words, to make eye contact with the camera, the participant must shift his eyes away from the display and look upward toward the camera. But to see the person he is talking to, he must look straight at the display and not directly into the camera.
As a result, when the participant looks directly at the display, the images captured by the camera show the participant looking down with a peculiar eye-gaze. Because the conferees do not look directly into the camera, they appear to be looking away or down and seem uninterested in the conversation. Accordingly, there is no direct eye-to-eye contact between participants in a typical desktop video-teleconferencing system.
One solution for this eye-gaze phenomenon is for the participants to sit further away from their display screens. Research has shown that if the camera is on top of a 21-inch monitor and the normal viewing position is approximately 20 inches away from the screen, the divergence angle will be 17 degrees, well above the threshold (5 degrees) at which eye contact can be maintained. Sitting far enough away from the screen (several feet) to meet the threshold, however, ruins much of the communication value of a video communication system, which becomes almost as ineffective as speaking to someone on a telephone.
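The figures above can be roughly reproduced with basic trigonometry. The sketch below is only an illustration under assumptions that the text does not state: a 4:3 screen, a camera exactly on the top edge, and a viewer looking at the center of the screen. The function names are hypothetical.

```python
import math


def divergence_angle_deg(diagonal_in, viewing_distance_in, aspect=(4, 3)):
    """Angle between the line of sight to the screen center and the line
    to a camera on the top edge (assumed geometry, see lead-in)."""
    w, h = aspect
    # Screen height derived from the diagonal and the assumed aspect ratio.
    height = diagonal_in * h / math.hypot(w, h)
    # Vertical offset from the screen center to the camera.
    offset = height / 2.0
    return math.degrees(math.atan2(offset, viewing_distance_in))


def distance_for_angle_in(diagonal_in, max_angle_deg, aspect=(4, 3)):
    """Viewing distance (inches) at which the divergence angle equals max_angle_deg."""
    w, h = aspect
    offset = diagonal_in * h / math.hypot(w, h) / 2.0
    return offset / math.tan(math.radians(max_angle_deg))


angle = divergence_angle_deg(21, 20)
distance = distance_for_angle_in(21, 5)
print(f"{angle:.1f} degrees at 20 inches")
print(f"{distance / 12:.1f} feet needed for 5 degrees")
```

Under these assumptions the angle comes out near 17.5 degrees, and the viewer would have to sit about 6 feet away to reach the 5-degree threshold, in line with the "several feet" mentioned above.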
Several systems have been proposed to reduce or eliminate the angular deviation using special hardware. One commonly used hardware component for correcting eye-gaze in video conferencing is a beam-splitter: a semi-reflective transparent panel, sometimes called a one-way mirror, half-silvered mirror, or semi-silvered mirror. The problem with this and similar hardware solutions is that they are very expensive and require a bulky setup.
Numerous other solutions have attempted to create eye contact through computer vision and computer graphics algorithms. Most of these proposed solutions suffer from poor image capture quality, poor image display quality, and excessive cost in computation and memory resources.
SUMMARY
A system and method for correcting eye-gaze in video teleconferencing systems are described. In one implementation, first and second video images representative of a first conferee, taken from different views, are concurrently captured. A head position of the first conferee is tracked from the first and second video images. Matching features and contours from the first and second video images are ascertained. The head position and the matching features and contours from the first and second video images are synthesized to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
The following implementations, therefore, introduce the broad concept of correcting for eye-gaze by blending information captured from a stereoscopic view of the conferee and generating a virtual image video stream of the conferee. A personalized face model of the conferee is used to track head position of the conferee.
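The summary does not spell out how the two views are blended. As a rough illustration only, the sketch below assumes that matched feature points are already available in both images and that a virtual view can be approximated by linearly interpolating their positions. This is a generic view-interpolation idea, not necessarily the method of this patent, and the function name and sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_matches(first: List[Point], second: List[Point],
                        t: float = 0.5) -> List[Point]:
    """Place each matched feature at a position between its location in the
    first view (t = 0) and its location in the second view (t = 1)."""
    if len(first) != len(second):
        raise ValueError("matched feature lists must have equal length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [((1 - t) * x1 + t * x2, (1 - t) * y1 + t * y2)
            for (x1, y1), (x2, y2) in zip(first, second)]


# Hypothetical pixel coordinates of two features seen by both cameras.
first_view = [(100.0, 120.0), (140.0, 118.0)]
second_view = [(90.0, 120.0), (128.0, 118.0)]

# t = 0.5 corresponds to a virtual camera midway between the two real ones.
virtual_view = interpolate_matches(first_view, second_view)
print(virtual_view)
```

A full system would also need the tracked head position and the matched contours described above to fill in the regions between features; this sketch covers only the point-interpolation step.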
Yang Ruigang
Zhang Zhengyou
Lee & Hayes PLLC
Microsoft Corporation
Ramakrishnaiah Melur