Computer graphics processing and selective visual display system – Computer graphics processing – Graph generating
Reexamination Certificate
1998-03-13
2001-07-24
Zimmerman, Mark (Department: 2671)
C345S422000, C345S426000, C382S284000
active
06266068
ABSTRACT:
FIELD OF THE INVENTION
This invention relates generally to video synthesis, and more particularly to a method and apparatus for synthesizing video from multiple input layers or channels of image data.
BACKGROUND
Many applications, such as video editing, computer games, computer graphics for entertainment, and multimedia authoring, are based on the synthesis of video from a wide variety of sources of input data. In computer games, for example, the rendering of texture-mapped 3-D models at video rates is key to the realism of the game. In video editing applications, what differentiates on-line systems from off-line systems is the ability to composite multiple video streams in real-time. Video synthesis from still images can be found in multimedia CDs based on Apple Computer's QuickTime VR™, where virtual camera motion is synthesized using cylindrical panoramic mosaics constructed from sets of still images. Currently, such applications use highly specialized video synthesis techniques that are usually restricted to a single input data type and, generally, are based either on the use of 3-D models or on 2-D video representations that cannot support a full range of geometrically correct 3-D effects.
The effort involved in constructing 3-D graphical models presents a significant barrier to their widespread use. Many modern computer graphics systems, for example, are based on rendering texture-mapped polygons. Often, however, large numbers of polygons are required for visual realism, making 3-D model creation difficult and time-consuming. The technical challenge in producing 3-D models from images is even greater when integrating information from multiple views into a single 3-D representation.
Video editing systems, on the other hand, can combine multiple video streams using 2-D techniques, such as alpha blending, to generate video output without specifying complex 3-D models. An example of such a technique can be found in Kurtze et al., U.S. Pat. No. 5,644,364. Video editing can typically support image operations like translation, zooming, and planar warps. However, such systems lack a complete representation of the geometry of the scenes described by the video sequences. As a result, the types of 3-D effects that they can provide are extremely limited. For example, video editing systems cannot simulate a virtual change in the camera position in a manner that is guaranteed to be geometrically correct. Further, although these systems can handle occlusions by organizing several video streams into layers, they cannot handle self-occlusions within a layer, or more complex occlusion relations between layers. Consequently most 3-D effects are rendered off-line and then mixed in. Mosaic-based systems, such as Apple Computer's QuickTime VR (i.e., video synthesis that uses still images of a scene taken at different camera positions), can accurately simulate camera rotation and zooming, but cannot simulate virtual camera views with arbitrary translations because of limitations in the mosaic representation of scene geometry.
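As a rough illustration of the 2-D alpha blending mentioned above, the following Python sketch blends a foreground layer over a background layer one pixel at a time. The function names, the pixel representation (RGB tuples in the range 0 to 1), and the per-pixel alpha mask are assumptions made for this example; they are not taken from the cited Kurtze et al. patent or from any particular editing system.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # assumed representation: RGB components in [0, 1]


def alpha_blend(fg: Pixel, bg: Pixel, alpha: float) -> Pixel:
    """Blend one foreground pixel over one background pixel.

    alpha = 1.0 shows only the foreground, alpha = 0.0 only the background.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def blend_layers(
    fg: List[List[Pixel]],
    bg: List[List[Pixel]],
    alpha: List[List[float]],
) -> List[List[Pixel]]:
    """Blend two equally sized image layers using a per-pixel alpha mask."""
    return [
        [alpha_blend(f, b, a) for f, b, a in zip(fg_row, bg_row, a_row)]
        for fg_row, bg_row, a_row in zip(fg, bg, alpha)
    ]
```

Note that this operation works purely in image space: it has no notion of depth or scene geometry, which is why layering of this kind cannot resolve self-occlusions within a layer.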
Image-based rendering (IBR) presents a compelling approach to image synthesis. IBR provides an alternative to the difficult process of building 3-D models from images, allowing the synthesis of new images of a static scene directly from a set of images. The 3-D geometric information is computed as needed while rendering a particular virtual view. This computation can operate at any desired level of detail and can therefore be adapted to the needs of the application. Moreover, IBR can produce high quality images even when the number of available sample images for a scene is small. While this dearth of image samples could frustrate the construction of a 3-D model, IBR can still produce new viewpoints in the vicinity of the image samples.
Although image-based rendering is a compelling approach to image synthesis, limitations in the current state of the art prevent its widespread application to video synthesis. The standard approach to image-based rendering, as described, for example, in “Novel View Synthesis in Tensor Space,” by Avidan et al. in Conference on Computer Vision and Pattern Recognition, pp. 1034-1040, San Juan, Puerto Rico, June 1997, assumes that the motion in a set of input images results solely from the motion of the camera with respect to a static scene. In practice, however, there may be multiple rigid objects in a scene, each moving independently with respect to the camera. Moreover, some of these objects may even be articulated with non-rigid, kinematically-controlled motion. Thus, the standard IBR methods would be unable to synthesize such scenes.
There remains a need, therefore, for a method and apparatus that provide the advantages of IBR over current video synthesis techniques, such as 3-D modeling, video editing, and mosaic-based rendering, but are not limited to scenes with only a single rigid body in motion.
SUMMARY OF THE INVENTION
The present invention relates to a computerized method and a computer system for synthesizing video from a plurality of sources of image data. Each source provides image data associated with an object. In terms of the computerized method, image data associated with a first object is provided from a first source, and image data associated with a second object is provided from a second source. The image data of the first and second objects are combined to generate composite images of the first and second objects. From the composite images, an output image of the first and second objects as viewed from an arbitrary viewpoint is generated.
In one aspect, the method finds a pixel in the output image with an unspecified pixel value, and determines which one of the sources of image data should provide a pixel value for the unspecified pixel value. Generally, the system 100 can combine layers of a single model type or mixed model types. For example, the system 100 can combine image layers with image layers, video layers with video layers, and 3-D model-based layers with 3-D model-based layers.
In other aspects, the method can combine layers of different types, such as image layers with video layers, image layers with 3-D model-based layers, video layers with 3-D model-based layers, and image layers with video layers and 3-D model-based layers.
In terms of the computer system, a composite image generator combines the image data associated with the objects to generate composite images of the objects, and a view generator generates from the composite images an output image of the objects as viewed from an arbitrary viewpoint. In one aspect of the computer system, the view generator finds a pixel in the output image with an unspecified pixel value and determines which one of the sources of image data should provide a pixel value for the unspecified pixel value.
In other aspects of the computer system, layers of different types can be combined, such as image layers with video layers, image layers with 3-D model-based layers, video layers with 3-D model-based layers, and image layers with video layers and 3-D model-based layers.
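The pixel-filling aspect described in this summary can be pictured with a minimal Python sketch. The summary does not state how the source for an unspecified pixel is selected, so the rule used here (take the layer whose sample lies nearest the camera) is purely an assumption for illustration, as are the function names, the use of `None` to mark an unspecified pixel, and the `(value, depth)` sample format.

```python
from typing import List, Optional, Tuple

# Assumed sample format: (pixel value, depth), or None where a layer has no data.
Sample = Optional[Tuple[int, float]]
Layer = List[List[Sample]]
Image = List[List[Optional[int]]]


def choose_source(samples: List[Sample]) -> Optional[int]:
    """Return the index of the layer with the smallest depth, or None if no layer has data.

    Nearest-depth selection is an assumed rule, not one stated in the source text.
    """
    best_index = None
    best_depth = float("inf")
    for index, sample in enumerate(samples):
        if sample is not None and sample[1] < best_depth:
            best_index, best_depth = index, sample[1]
    return best_index


def fill_unspecified(output: Image, layers: List[Layer]) -> Image:
    """Return a copy of the output image with unspecified (None) pixels filled from the layers."""
    filled = [row[:] for row in output]
    for y, row in enumerate(filled):
        for x, value in enumerate(row):
            if value is None:
                source = choose_source([layer[y][x] for layer in layers])
                if source is not None:
                    filled[y][x] = layers[source][y][x][0]
    return filled
```

Pixels for which no layer offers a sample remain unspecified in this sketch; a real system would need some further policy for them.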
REFERENCES:
patent: 5175805 (1992-12-01), Carrie
patent: 5295234 (1994-03-01), Ishida et al.
patent: 5488674 (1996-01-01), Burt et al.
patent: 5557684 (1996-09-01), Wang et al.
patent: 5644364 (1997-07-01), Kurtze et al.
patent: 5656737 (1997-08-01), Wistow
patent: 5657402 (1997-08-01), Bender et al.
patent: 5706417 (1998-01-01), Adelson
patent: 5850352 (1998-12-01), Moezzi et al.
Avidan et al., “Novel View Synthesis in Tensor Space,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Jun. 1997, Puerto Rico.
Debevec et al., “Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach,” Computer Graphics Proceedings, Annual Conference Series, 1996.
Faugeras, O., Three-Dimensional Computer Vision: A Geometric Viewpoint (Artificial Intelligence), Chapter 6: “Stereo Vision,” pp. 165-176.
Fuchs, H., “On Visible Surface Generation By a Priori Tree Structures,” ACM SIGGRAPH 1980.
Greene et al., “Creating Raster Omnimax Images fro
Kang Sing Bing
Rehg James M.
Compaq Computer Corporation
Hamilton, Brook, Smith and Reynolds, P.C.
Nguyen Kimbinh T.
Zimmerman Mark