Classification: Computer graphics processing and selective visual display system – Computer graphics processing – Three-dimension
Type: Reexamination Certificate
Filed: 1998-04-06
Issued: 2001-06-19
Examiner: Vo, Cliff N. (Department: 2671)
Status: active
Patent Number: 06249285
BACKGROUND
Techniques for automated recovery of estimated three-dimensional scene structure from multiple two-dimensional images of a visual scene, and the availability of general and special purpose computing engines to support the required calculations, are advancing to a level that makes them practical in a range of design visualization applications. By recovering the estimated scene structure, it is possible to treat the elements of a visual scene as abstract three-dimensional geometric and/or volumetric objects that can be processed, manipulated and combined with other objects within a computer-based system.
One application of these techniques is in media production—the process of creating media content for use in films, videos, broadcast television, television commercials, interactive games, CD-ROM titles, DVD titles, Internet or intranet web sites, and related or derived formats. These techniques can be applied in the pre-production, production and post-production phases of the overall media production process. Other areas include industrial design, architecture, and other design visualization applications.
In creating such design visualization content, it is common to combine various elements and images from multiple sources and arrange them to appear to interact as if they were in the same physical or synthetic space. It is also common to re-touch or otherwise manipulate images to enhance their aesthetic, artistic or commercial value, requiring the artist to estimate and/or simulate the three-dimensional characteristics of elements in the original visual scene. The recovery of three-dimensional scene structure, including camera path data, greatly expands the creative possibilities while reducing the labor-intensive burdens associated with either modeling the elements of the scene by hand or manipulating the two-dimensional images to simulate three-dimensional interactions.
In these image-based scene analysis techniques, the computer accepts a visual image stream such as that produced by a motion picture, film or video camera. The image stream is first converted into digital information in the form of pixels. The computer then operates on the pixels in certain ways: grouping them together, comparing them with stored patterns, and applying other more sophisticated processes which use, when available, information such as camera position, orientation, and focal length to determine information about the scene structure. So-called “machine vision” or “image understanding” techniques are then used to automatically extract and interpret the structure of the actual physical scene as represented by the captured images. Computerized abstract models of elements of the scene may then be created and manipulated using this information about the actual physical scene. A computer-generated image stream of a synthetic scene can be analyzed in the same way.
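For concreteness, here is a minimal sketch of two of the pixel-level operations named above: comparing pixel regions against a stored pattern, and using a known focal length to back-project a pixel into a viewing ray. The function names, the numpy representation, and the brute-force search are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def normalized_cross_correlation(patch, template):
    """Score how well a pixel region matches a stored pattern (1.0 = identical)."""
    p = (patch - patch.mean()) / (patch.std() + 1e-8)
    t = (template - template.mean()) / (template.std() + 1e-8)
    return float((p * t).mean())

def match_template(frame, template):
    """Slide a stored pattern over a grayscale frame; return the best (row, col, score)."""
    th, tw = template.shape
    best = (0, 0, -1.0)
    for r in range(frame.shape[0] - th + 1):
        for c in range(frame.shape[1] - tw + 1):
            score = normalized_cross_correlation(frame[r:r + th, c:c + tw], template)
            if score > best[2]:
                best = (r, c, score)
    return best

def pixel_to_ray(u, v, focal_length, cx, cy):
    """Back-project pixel (u, v) into a unit viewing ray, using the camera
    focal length and principal point (cx, cy) -- the "when available"
    camera information mentioned above."""
    d = np.array([(u - cx) / focal_length, (v - cy) / focal_length, 1.0])
    return d / np.linalg.norm(d)
```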
For example, Becker, S. and Bove, V. M., in “Semiautomatic 3D Model Extraction from Uncalibrated 2-D Camera Views,” Proceedings SPIE Visual Data Exploration and Analysis II, vol. 2410, pp. 447-461 (1995), describe a technique for extracting a three-dimensional (3-D) scene model from two-dimensional (2-D) pixel-based image representations as a set of 3-D mathematical abstract representations of visual objects in the scene, as well as camera parameters and depth maps.
Horn, B. K. P. and Schunck, B. G., in “Determining Optical Flow,” Artificial Intelligence, vol. 17, pp. 185-203 (1981), describe how so-called optical flow techniques may be used to detect velocities of brightness patterns in an image stream, segmenting the image frames into pixel regions corresponding to particular visual objects.
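The Horn-Schunck iteration itself is compact. The sketch below assumes two grayscale frames as numpy arrays, simple finite-difference derivatives, and 4-neighbour flow averaging, which simplifies the published formulation.

```python
import numpy as np

def horn_schunck(I1, I2, alpha=1.0, iters=100):
    """Estimate per-pixel velocities (u, v) of brightness patterns between
    two grayscale frames, after Horn & Schunck (1981)."""
    I1 = I1.astype(np.float64)
    I2 = I2.astype(np.float64)
    Ix = np.gradient(I1, axis=1)          # spatial brightness derivatives
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1                          # temporal brightness derivative
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(iters):
        # Local flow averages implement the smoothness term.
        u_avg = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                 np.roll(u, 1, 1) + np.roll(u, -1, 1)) / 4.0
        v_avg = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                 np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        # Update derived from the linearized brightness-constancy constraint.
        t = (Ix * u_avg + Iy * v_avg + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_avg - Ix * t
        v = v_avg - Iy * t
    return u, v
```

Segmenting the frames into visual objects then amounts to grouping pixels whose (u, v) vectors move coherently.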
Sawhney, H. S., in “3D Geometry from Planar Parallax,” IEEE 1063-6919/94 (1994), pp. 929-934, discusses a technique for deriving 3-D structure through perspective projection using motion parallax defined with respect to an arbitrary dominant plane.
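The core of the planar-parallax idea can be sketched as follows: once the dominant plane is registered between two views by a homography, points on the plane align exactly, and the residual motion of each remaining point encodes its 3-D structure relative to that plane. In this hedged numpy sketch, the homography H and the matched point arrays are assumed inputs, and recovering actual relative-depth values from the residuals is omitted.

```python
import numpy as np

def planar_parallax_residuals(x1, x2, H):
    """Warp image-1 points (N x 2) by the dominant-plane homography H and
    measure the residual displacement against their matches in image 2.
    Points lying on the plane give (near-)zero residuals; off-plane points
    leave parallax vectors whose magnitudes grow with their distance from
    the plane."""
    ones = np.ones((x1.shape[0], 1))
    xh = np.hstack([x1, ones]) @ H.T       # apply the homography in homogeneous form
    warped = xh[:, :2] / xh[:, 2:3]        # back to pixel coordinates
    return x2 - warped                     # residual parallax vectors
```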
Poelman, C. J. et al., in “A Paraperspective Factorization Method for Shape and Motion Recovery,” Carnegie Mellon University Report CMU-CS-93-219 (Dec. 11, 1993), elaborate on a factorization method for recovering both the shape of an object and its motion from a sequence of images, using many images and tracking many feature points.
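The paraperspective method builds on the rank-3 factorization of Tomasi and Kanade; a sketch of the simpler orthographic variant conveys the idea. The metric upgrade that resolves the remaining affine ambiguity is omitted here.

```python
import numpy as np

def factor_shape_and_motion(W):
    """Recover camera motion and 3-D shape from tracked feature points by
    rank-3 factorization. W is a 2F x P measurement matrix: the x then y
    image coordinates of P tracked points over F frames."""
    W0 = W - W.mean(axis=1, keepdims=True)            # register to the point centroid
    U, s, Vt = np.linalg.svd(W0, full_matrices=False)
    r = np.sqrt(s[:3])
    M = U[:, :3] * r                                  # 2F x 3 camera motion
    S = r[:, None] * Vt[:3]                           # 3 x P shape, up to an affine transform
    return M, S
```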
The goal for the creator of multimedia content in using such a scene model is to create as accurate a representation of the scene as possible. For example, consider a motion picture environment where computer-generated special effects are to appear in a scene with real world objects and actors. The content creator may choose to start by creating a model from digitized motion picture film using automatic image-interpretation techniques and then proceed to combine computer-generated abstract elements with the elements derived from image-interpretation in a visually and aesthetically pleasing way.
Problems can occur with this approach, however, since automatic image-interpretation processes are statistical in nature, and the input image pixels are themselves the results of a sampling and filtering process. Consider that images are sampled from two-dimensional (2-D) projections (onto a camera's imaging plane) of three-dimensional (3-D) physical scenes. Not only does this sampling process introduce errors, but the projection onto the camera's 2-D image plane also limits the amount of 3-D information that can be recovered from the images. The 3-D characteristics of objects in the scene, 3-D movement of objects, and 3-D camera movements can typically only be partially quantified from sequences of camera images.
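The loss of information under projection is easy to demonstrate: a pinhole camera divides depth out, so a point moved twice as far away along its viewing ray projects to exactly the same pixel. A small numpy illustration:

```python
import numpy as np

def project(points, f=1.0):
    """Pinhole projection of 3-D points (N x 3) onto the image plane:
    depth is divided out and cannot be recovered from a single image."""
    return f * points[:, :2] / points[:, 2:3]

p = np.array([[1.0, 2.0, 4.0]])
print(project(p))        # [[0.25 0.5 ]]
print(project(2 * p))    # [[0.25 0.5 ]] -- identical image coordinates
```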
As a result, image-interpretation processes do not always automatically converge to the correct solution. For example, even though one might think it relatively straightforward to derive a 3-D mathematical representation of a simple object such as a soda can from sequences of images of that soda can, a process for determining the location and size of the 3-D wire frame mesh needed to represent the soda can may not properly converge, depending upon the lighting, camera angles, and so on used in the original image capture. Because of the probabilistic nature of this type of model, the end result cannot be reliably predicted.
SUMMARY OF THE INVENTION
A key barrier to widespread adoption of these scene structure estimation techniques in design visualization applications such as multimedia production has been the inherent uncertainty and inaccuracy of estimation techniques, and the resulting inability to generate aesthetically acceptable imagery. The information encoded in the images about the actual elements in the visual scene and their relationships is incomplete, and typically distorted by artifacts of the automated imaging process. The utility of automated scene structure recovery drops dramatically as the estimates become unreliable and unpredictable wherever the information encoded in the images is noisy, incomplete or simply missing.
The quality and completeness of scene structure recovery can be greatly improved if a human operator provides additional information, corrects estimation errors, and controls parameters of the estimation process. This is done most efficiently as adjustments, annotations and mark-ups on a display that includes visual representations of parametric values of the images and of the algorithmic results. The mark-up and information provision can be applied to visual representations of the images, to visual representations of the algorithmic results, to fields of parameter values, or to any combination of these.
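As a hypothetical illustration of this interaction pattern (not the patented method itself), operator mark-ups might be treated as hard constraints that the surrounding estimate relaxes toward; every name below is assumed for illustration.

```python
import numpy as np

def apply_user_constraints(depth_map, constraints, smooth_iters=50):
    """Blend operator-supplied corrections into an estimated depth map.
    `constraints` maps (row, col) -> depth, as taken from on-screen mark-up.
    Pinned pixels are held fixed while their neighbours relax toward them."""
    d = depth_map.astype(np.float64).copy()
    for _ in range(smooth_iters):
        # Diffuse toward the 4-neighbour average (a simple smoothness prior).
        avg = (np.roll(d, 1, 0) + np.roll(d, -1, 0) +
               np.roll(d, 1, 1) + np.roll(d, -1, 1)) / 4.0
        d = 0.5 * d + 0.5 * avg
        for (r, c), z in constraints.items():
            d[r, c] = z                  # re-pin the operator's mark-ups each pass
    return d
```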
The present invention is a method and apparatus for developing an estimation of the structure of a three-dimensional scene and camera path from multiple two-dimensional images of the scene. The technique involves displaying a visual representation of an estimated three-dimensional scene structure and the values of various parameters associated with the scene, together with a visual representation of at least one two-dimensional image used in the scene structure estimation algorithm. A user inputs
Inventors: Askey, David; Cacciatore, Mary; Henry, Joseph; Kurtze, Jeffrey D.; Madden, Paul B.
Law Firm: Hamilton, Brook, Smith & Reynolds, P.C.
Assignee: SynaPix, Inc.
Examiner: Vo, Cliff N.