Method and a system for generating summarized video

Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal

Reexamination Certificate

Details

C348S700000, C348S701000

active

06690725

ABSTRACT:

TECHNICAL FIELD
The present invention relates to a method and a system for video summarization, and in particular to a method and a system for key frame extraction and shot boundary detection.
BACKGROUND OF THE INVENTION AND PRIOR ART
Recent developments in personal computing and communications have created new classes of devices such as hand-held computers, personal digital assistants (PDAs), smart phones, automotive computing devices, and computers that allow users more access to information.
Many of the device manufacturers, including cell phone, PDA, and hand-held computer manufacturers, are working to grow the functionalities of their devices. The devices are being given capabilities of serving as calendar tools, address books, paging devices, global positioning devices, travel and mapping tools, email clients, and Web browsers. As a result, many new businesses are forming around applications related to bringing all kinds of information to these devices. However, due to the limited capabilities of many of these devices, in terms of the display size, storage, processing power, and network access, there are new challenges for designing the applications that allow these devices to access, store and process information.
Concurrent with these developments, recent advances in storage, acquisition, and networking technologies have resulted in large amounts of rich multimedia content. As a result, there is a growing mismatch between the rich content that is available and the capabilities of the client devices to access and process it.
In this respect, so-called key-frame based video summarization is an efficient way to manage and transmit video information. This representation can be used within the MPEG-7 application Universal Multimedia Access, as described in C. Christopoulos et al., “MPEG-7 application: Universal access through content repurposing and media conversion”, Seoul, Korea, March 1999, ISO/IEC/JTC1/SC29/WG11 M4433, in order to adapt video data to the client devices.
For audio-visual (AV) material, key frame extraction can be used to adapt to the bandwidth and computational capabilities of the clients. For example, low-bandwidth or low-capability clients might request only the audio information, or only the audio combined with some key frames. High-bandwidth, computationally capable clients can request the whole AV material. Another application is fast browsing of digital video. Skipping video frames at fixed intervals reduces the video viewing time. However, this merely gives a random sample of the overall video.
Below the following definitions will be used:
Shot
A shot is defined as a sequence of frames captured by one camera in a single continuous action in time and space, see also J. Monaco, “How to read a film,” Oxford press, 1981.
Shot Boundary
There are a number of different types of boundaries between shots. A cut is an abrupt shot change that occurs in a single frame. A fade is a gradual change in brightness, either resulting in a black frame (fade-out) or starting with one (fade-in). A dissolve occurs when the images of the first shot become dimmer and the images of the second shot become brighter, with frames within the transition showing one image superimposed on the other. A wipe occurs when pixels from the second shot replace those of the first shot in a regular pattern, such as a line moving from the left edge of the frames.
Key Frame
Key frames are defined inside every shot. They represent, with a small number of frames, the most relevant information content of the shot according to some subjective or objective measurement.
Conventional video summarization consists of two steps:
1. Shot boundary detection.
2. Key-frame extraction.
Many attributes of the frames, such as colour, motion and shape, have been used for video summarization. The standard algorithm for shot boundary detection in video summarization is based on histograms. Histogram-based techniques are shown to be robust and effective, as described in A. Smeulders and R. Jain, “Image databases and Multi-Media search”, Singapore, 1988, and in J. S. Boreczky and L. A. Rowe, “Comparison of Video Shot Boundary Detection Techniques”, Storage and Retrieval for Image and Video Databases IV, Proc. of IS&T/SPIE 1996 Int'l Symp. on Elec. Imaging: Science and Technology, San Jose, Calif., February 1996.
In this technique, the colour histograms of two images are computed. If the Euclidean distance between the two histograms is above a certain threshold, a shot boundary is assumed. However, no information about motion is used. Therefore, this technique has drawbacks in scenes with camera and object motion.
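The histogram comparison described above can be illustrated with a minimal sketch. It is not taken from the cited works: the function names, the single-channel 4-bin histogram, and the threshold value of 0.5 are all assumptions chosen for brevity, and each frame is assumed to be a non-empty flat list of 8-bit intensity values.

```python
import math


def colour_histogram(pixels, bins=4):
    """Normalised histogram of 8-bit values (one channel only, for brevity)."""
    counts = [0] * bins
    for value in pixels:
        # Map a value in 0..255 to a bin index in 0..bins-1.
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def euclidean_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def detect_shot_boundaries(frames, threshold=0.5):
    """Return each index i where a boundary is assumed between frame i-1 and i."""
    histograms = [colour_histogram(f) for f in frames]
    return [
        i
        for i in range(1, len(histograms))
        if euclidean_distance(histograms[i - 1], histograms[i]) > threshold
    ]


# Hypothetical toy data: two dark frames followed by two bright frames.
dark = [10] * 16
bright = [240] * 16
boundaries = detect_shot_boundaries([dark, dark, bright, bright])
print(boundaries)  # [2]
```

Because only the distribution of pixel values is compared, two frames of the same scene can still produce a large distance when the camera or an object moves, which is the drawback noted above.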
Furthermore, key frames must be extracted from the different shots in order to provide a video summary. Conventional key frame extraction algorithms are described, for example, in Wayne Wolf, “Key frame selection by motion analysis”, in Proceedings, ICASSP 96, wherein the optical flow is used to identify local minima of motion in a shot. These local minima of motion are then taken to correspond to key frames. In W. Xiong, J. C. M. Lee, and R. H. Ma, “Automatic video data structuring through shot partitioning and key-frame selection”, Machine Vision and Applications, vol. 10, no. 2, pp. 51-65, 1997, a seek-and-spread algorithm is used in which the previous key-frame serves as a reference for the extraction of the next key-frame. Also, in R. L. Lagendijk, A. Hanjalic, M. Ceccarelli, M. Soletic, and E. Persoon, “Visual search in a SMASH system”, Proceedings of IEEE ICIP 97, pp. 671-674, 1997, a cumulative action measure of shots is used to compute the number and the position of key-frames allocated to each shot. The action between two frames is computed via a histogram difference. One advantage of this method is that the number of key-frames can be pre-specified.
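As a rough illustration of the first of these approaches, the sketch below picks key frames at local minima of a per-frame motion measure. It is not the cited algorithm: the function name is hypothetical, the motion values are made-up numbers, and the measure is assumed to have been computed beforehand (for instance as a summed optical-flow magnitude per frame).

```python
def local_motion_minima(motion):
    """Indices of frames whose motion is strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(motion) - 1)
        if motion[i] < motion[i - 1] and motion[i] < motion[i + 1]
    ]


# Hypothetical per-frame motion magnitudes within one shot.
motion = [5.0, 3.0, 1.0, 2.5, 4.0, 0.5, 3.0]
key_frames = local_motion_minima(motion)
print(key_frames)  # [2, 5]
```

The first and last frames of the shot are never selected here because they have only one neighbour; a fuller implementation would need a rule for those frames and for flat regions of equal motion.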
SUMMARY
It is an object of the present invention to provide a method and a system for shot boundary detection and key frame extraction, which can be used for video summarization and which is robust against camera and object motion.
This object and others are obtained by a method and a system for key frame extraction, where a list of feature points is created. The list keeps track of individual feature points between consecutive frames of a video sequence.
When many new feature points are entered on the list, or when many feature points are removed from the list, between two consecutive frames, a shot boundary is determined to have occurred. A key frame is then selected between two shot boundaries as a frame in the list of feature points where no or few feature points are entered or lost in the list.
By using such a method for extracting key frames from a video sequence, motion in the picture and/or camera motion can be taken into account. The key frame extraction algorithm will therefore be more robust against camera motion.
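The feature-point list idea can be sketched as follows. This is only one possible reading of the summary, under stated assumptions: each frame is represented by the set of identifiers of the feature points tracked in it, the tracking itself is assumed to be done elsewhere, and the names `feature_changes` and `summarize` and the threshold value are hypothetical.

```python
def feature_changes(tracked):
    """For each consecutive frame pair, count feature points entered plus lost."""
    return [
        len(cur - prev) + len(prev - cur)
        for prev, cur in zip(tracked, tracked[1:])
    ]


def summarize(tracked, boundary_threshold):
    """Return (shot boundaries, one key frame index per shot)."""
    changes = feature_changes(tracked)  # changes[i - 1] describes frame i-1 -> i
    # A boundary at index i means a new shot starts at frame i.
    boundaries = [
        i for i, c in enumerate(changes, start=1) if c > boundary_threshold
    ]
    starts = [0] + boundaries
    ends = boundaries + [len(tracked)]
    key_frames = []
    for start, end in zip(starts, ends):
        # Skip the first frame of the shot: its change count belongs to the boundary.
        candidates = range(start + 1, end)
        if len(candidates) == 0:
            key_frames.append(start)  # single-frame shot
        else:
            # Frame where the fewest feature points were entered or lost.
            key_frames.append(min(candidates, key=lambda i: changes[i - 1]))
    return boundaries, key_frames


# Hypothetical feature point identifiers per frame.
tracked = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 5},
    {10, 11, 12, 13},
    {10, 11, 12, 13},
    {10, 11, 12, 14},
]
print(summarize(tracked, boundary_threshold=4))  # ([3], [2, 4])
```

In this toy input, every feature point changes between frames 2 and 3, so a boundary is reported there, while frames 2 and 4 are chosen as key frames because no points enter or leave the list at those frames.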


REFERENCES:
patent: 5635982 (1997-06-01), Zhang et al.
patent: 5767922 (1998-06-01), Zabih et al.
patent: 5995095 (1999-11-01), Ratakonda
patent: 6366699 (2002-04-01), Kuwano et al.
patent: 6404925 (2002-06-01), Foote et al.
International Conference on Acoustics, Speech, and Signal Proc. (Princeton University), 1996, vol. 2, pp. 1228-1231, Wayne Wolf, “Key Frame Selection by Motion Analysis”.
International Workshop on Multi-Media Database Management. . . , 1998, pp. 80-87, Suchendra M. Bhandarkar et al., “Motion-based Parsing of Compressed Video”.
IEEE International Conference on Multimedia Computing and Systems, 1999, vol. 2, pp. 710-714, Alan Hanjalic et al., “Optimal Shot Boundary Detection based on Robust Statistical Models”.
International Conference on Acoustics, Speech and Signal Processing, vol. 2, 1996, Princeton University, Wayne Wolf, Key Frame Selection by Motion Analysis, pp. 1228-1231.
International Workshop on Multi-Media Database Management. . . , vol., 1998, Suchendra M. Bhandarkar, Aparna A. Khombhadia, “Motion Based Parsing of Compressed Video,” pp. 80-87.
IEEE International Conference on Multimedia Computing and Systems, vol. 2, 1999, Delft University, The Netherlands, Alan Hanjalic, Hong
