Television – Image signal processing circuitry specific to television – Motion dependent key signal generation or scene change...
Reexamination Certificate
1994-06-02
2001-08-07
Le, Vu (Department: 2613)
C348S589000
active
06271892
ABSTRACT:
TECHNICAL FIELD
This invention relates generally to a method of compressing a sequence of information-bearing frames, and more particularly to a method of compressing a sequence of information-bearing frames having at least two media components such as a video program.
BACKGROUND
Video programs are one form of multimedia data, that is, data composed of at least two distinct media components. For example, a video program is composed of a full motion video component and an audio component. A number of methods are known for reducing the large storage and transmission requirements of the video component of video programs. For example, certain compression methods (such as JPEG) take advantage of spatial redundancies that exist within an individual video frame to reduce the number of bytes required to represent the frame. Additional compression may be achieved by taking advantage of the temporal redundancy that exists between consecutive frames, which is the basis for known compression methods such as MPEG. These known compression methods generate a fixed number of frames per unit time to preserve the motion information contained in the video program.
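The notion of temporal redundancy can be illustrated with a toy sketch. The code below is not MPEG and is not part of the disclosed method; it only assumes, for illustration, that a frame is a flat list of integer pixel values, and the function names `delta_encode` and `delta_decode` are hypothetical. It keeps the first frame and stores each later frame as its difference from the preceding frame, so that unchanged pixels become zeros.

```python
from typing import List


def delta_encode(frames: List[List[int]]) -> List[List[int]]:
    """Keep the first frame; store each later frame as a difference
    from the frame immediately before it (toy illustration only)."""
    if not frames:
        return []
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded


def delta_decode(encoded: List[List[int]]) -> List[List[int]]:
    """Rebuild the original frames by accumulating the differences."""
    if not encoded:
        return []
    frames = [list(encoded[0])]
    for delta in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], delta)])
    return frames


# Three nearly identical 4-pixel frames: only one pixel changes once.
frames = [[5, 5, 5, 5], [5, 5, 6, 5], [5, 5, 6, 5]]
encoded = delta_encode(frames)
print(encoded)  # [[5, 5, 5, 5], [0, 0, 1, 0], [0, 0, 0, 0]]
```

Because consecutive frames are similar, the difference frames consist mostly of zeros, which is what a subsequent coding stage could exploit. Note that one encoded frame is still produced per input frame, consistent with the fixed frame rate of the known methods.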
In contrast to the compression methods mentioned above, other methods compress video programs by selecting certain frames from the entire sequence of frames to serve as representative frames. For example, a single frame may be used to represent the visual information contained in any given scene of the video program. A scene may be defined as a segment of the video program over which the visual contents do not change significantly. Thus, a frame selected from the scene may be used to represent the entire scene without losing a substantially large amount of information. A series of such representative frames from all the scenes in the video program provides a reasonably accurate representation of the entire video program with an acceptable degree of information loss. These compression methods in effect perform a content-based sampling of the video program. Unlike the temporal or spatial compression methods discussed above in which the frames are uniformly spaced in time, a content-based sampling method performs a temporally non-uniform sampling of the video program to generate a set of representative frames. For example, a single representative frame may represent a long segment of the video program (e.g., a long scene in which a person makes a speech without substantially changing position for an extended period) or a very short segment of the video program (e.g., a scene displayed in the video program for only a few seconds).
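Content-based sampling can be sketched as follows. This is a minimal illustration under stated assumptions, not the scene-change detection methods referred to in this document: frames are assumed to be flat lists of pixel intensities, a new scene is assumed to begin when a frame differs from the current representative frame by more than a hypothetical `threshold`, and the function names are invented for the example.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_representative_frames(
    frames: List[Sequence[float]], threshold: float
) -> List[int]:
    """Return the indices of frames chosen to represent each scene.

    Assumption for this sketch: the first frame of a scene serves as its
    representative, and a new scene starts when a frame differs from the
    current representative by more than `threshold`.
    """
    if not frames:
        return []
    representatives = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[representatives[-1]]) > threshold:
            representatives.append(i)
    return representatives


# A long static scene (6 frames) followed by a short, different scene (2 frames).
frames = [[10, 10, 10, 10]] * 6 + [[200, 200, 200, 200]] * 2
print(select_representative_frames(frames, threshold=50.0))  # [0, 6]
```

The two selected indices are not evenly spaced in time: one representative frame stands for six frames and the other for two, which is the temporally non-uniform sampling described above.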
Methods for automatically generating representative images from video programs are known. These methods may detect the boundaries between consecutive shots and may additionally detect scene changes that occur within the individual shots. An example of a method for locating abrupt and gradual transitions between shots is disclosed in patent application Ser. No. 08/171,136, filed Dec. 21, 1993, and entitled “Method and Apparatus for Detecting Abrupt and Gradual Scene Changes In Image Sequences,” the contents of which are hereby incorporated by reference. A method for detecting scene changes that occur within individual shots has been disclosed in patent application Ser. No. 08/191,234, filed Feb. 4, 1994, entitled “Camera-Motion Induced Scene Change Detection Method and System,” the contents of which are also hereby incorporated by reference.
Content-based sampling methods are typically employed for indexing purposes because the representative frames generated by such methods can efficiently convey the visual information contained in a video program. However, these methods fail to convey all the useful information contained in a multimedia format such as video because they compress only one media component (for a video program, the video component) and exclude the remaining media component or components (e.g., audio).
SUMMARY
The present invention provides an apparatus and method for compressing a sequence of frames having at least first and second information-bearing media components. The sequence of frames may constitute, for example, a video program in which the first information-bearing component is a video component and the second information-bearing component is a closed-caption component. In accordance with the invention, a plurality of representative frames are selected from among the sequence of frames. The representative frames represent information contained in the first information-bearing media component. A correspondence is then formed between each of the representative frames and one of a plurality of segments of the second information-bearing media component. The representative frames, the plurality of segments of the second information-bearing media component and the correspondence between them are recorded for subsequent retrieval.
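One way to picture the correspondence between representative frames and segments of the second media component is sketched below. The data layout is an assumption made for illustration only: each representative frame is identified by the start time of its scene, each closed-caption entry carries a timestamp, and a caption is assigned to the scene in which its timestamp falls. The function name `build_correspondence` is hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


def build_correspondence(
    scene_starts: List[float], captions: List[Tuple[float, str]]
) -> Dict[int, str]:
    """Map each representative frame (by position) to its caption segment.

    scene_starts: sorted start times, one per representative frame.
    captions: (timestamp, text) pairs from the closed-caption component.
    """
    parts: Dict[int, List[str]] = {i: [] for i in range(len(scene_starts))}
    for timestamp, text in captions:
        # Index of the last scene that starts at or before the caption.
        scene = bisect_right(scene_starts, timestamp) - 1
        if scene >= 0:
            parts[scene].append(text)
    return {i: " ".join(texts) for i, texts in parts.items()}


scene_starts = [0.0, 12.5, 30.0]
captions = [
    (1.0, "Good evening."),
    (8.0, "Tonight's top story."),
    (13.0, "We go live to the scene."),
    (31.0, "Back to the studio."),
]
correspondence = build_correspondence(scene_starts, captions)
for index, segment in correspondence.items():
    print(index, segment)
```

The resulting mapping, together with the representative frames and the caption segments, is the kind of record that could be stored for subsequent retrieval.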
In one embodiment of the invention, the representative frames are selected by sampling the sequence of frames in a content-based manner. For example, if the first information-bearing media component is a video component composed of a plurality of scenes, a representative frame may be selected from each scene. Additionally, if the second information-bearing media component is a closed-caption component, a printed rendition of the representative frames and the closed-caption component may be provided. The printed rendition constitutes a pictorial transcript in which each representative frame is printed with a caption containing the closed-caption text associated therewith. One advantage provided by this embodiment of the invention is that while the information embodied in the original format (e.g., a video program) typically requires additional equipment (e.g., a video cassette recorder and monitor) to be understood, the information embodied in the printed pictorial transcript is self-contained and can be understood directly without requiring additional processing or equipment.
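The layout of a pictorial transcript can be suggested with a plain-text sketch. The format below is an assumption chosen for illustration: each representative frame is stood in for by a label (for example a hypothetical image file name), and its caption text is wrapped beneath it. The function name `render_transcript` and the `width` parameter are not taken from this document.

```python
import textwrap
from typing import List, Tuple


def render_transcript(entries: List[Tuple[str, str]], width: int = 40) -> str:
    """Lay out (frame label, caption text) pairs as a text transcript.

    Each entry becomes a bracketed frame label followed by its caption,
    wrapped to `width` columns; entries are separated by a blank line.
    """
    blocks = []
    for label, caption in entries:
        wrapped = textwrap.fill(caption, width=width)
        blocks.append(f"[{label}]\n{wrapped}")
    return "\n\n".join(blocks)


entries = [
    ("frame_0000.png", "Good evening."),
    ("frame_0006.png", "Back to the studio."),
]
print(render_transcript(entries))
```

In a printed rendition the labels would be replaced by the frame images themselves; the pairing of each image with its caption is the point of the sketch.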
In an alternative embodiment of the invention, a method is provided for displaying a compressed rendition of a sequence of frames having at least first and second information-bearing media components. In accordance with the method, a plurality of representative frames are received which represent information contained in the first information-bearing media component. Additionally, a signal is received that has information that forms a correspondence between each of the representative frames and a segment of the second information-bearing media component. Finally, the representative frames and the segment of the second information-bearing media component are displayed in a manner determined by the correspondence therebetween.
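The display method can be sketched on the receiving side as follows. The representation of the received signal is an assumption: the correspondence is taken to be an ordered list of (frame identifier, segment identifier) pairs, and the frames and segments are looked up by those identifiers. All names here are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple


def pairs_for_display(
    frames: Dict[str, bytes],
    segments: Dict[str, str],
    correspondence: List[Tuple[str, str]],
) -> Iterator[Tuple[bytes, str]]:
    """Yield (frame data, segment text) pairs in the order given by the
    received correspondence, ready to be shown together."""
    for frame_id, segment_id in correspondence:
        yield frames[frame_id], segments[segment_id]


# Placeholder byte strings stand in for received image data.
frames = {"f0": b"<image 0>", "f1": b"<image 1>"}
segments = {"s0": "Good evening.", "s1": "Back to the studio."}
correspondence = [("f0", "s0"), ("f1", "s1")]

for image, text in pairs_for_display(frames, segments, correspondence):
    print(len(image), "bytes:", text)
```

A generator is used so that a viewer could step through the pairs one at a time rather than loading the whole rendition at once; this is a design choice of the sketch, not a requirement stated here.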
REFERENCES:
patent: 4823184 (1989-04-01), Belmares-Sarabia et al.
patent: 5020890 (1991-06-01), Oshima et al.
patent: 5027205 (1991-06-01), Avis et al.
patent: 5032905 (1991-07-01), Koga
patent: 5034816 (1991-07-01), Morita et al.
patent: 5099322 (1992-03-01), Gove
patent: 5134472 (1992-07-01), Abe
patent: 5172281 (1992-12-01), Ardis et al.
patent: 5179449 (1993-01-01), Doi
patent: 5192964 (1993-03-01), Shinohara et al.
patent: 5210559 (1993-05-01), Ohki
patent: 5231492 (1993-07-01), Dangi et al.
patent: 5235419 (1993-08-01), Krause
patent: 5265180 (1993-11-01), Golin
patent: 5267034 (1993-11-01), Miyatake et al.
patent: 5428774 (1995-06-01), Takahashi et al.
patent: 5440336 (1995-08-01), Buhro et al.
patent: 5467288 (1995-11-01), Fasciano et al.
patent: 5471576 (1995-11-01), Yee
patent: 5481296 (1996-01-01), Cragun et al.
PBS Engineering Report No. E-7709-C, “Television Captioning For The Deaf: Signal and Display Specifications”, John Lentz et al., Revised May 1980, pp. 1-17.
“Knowledge Guided Parsing in Video Databases”, Proc. SPIE Storage and Retrieval for Image and Video Databases (SPIE vol. 1908), D. Swanberg et al. pp. 13-24, San Jose, Feb. 1993.
“Automatic Video Indexing and Full Video Search for Object Appearances” Proc. 2nd Working Conference on Visual Database Systems, (Visual Database Systems II) A Nagasaka et al., Ed. 64,
Gibbon David Crawford
Shahraray Behzad
Freedman Barry H.
Le Vu
Lucent Technologies Inc.