Media browser using multimodal analysis

Computer graphics processing and selective visual display system – Display driving control circuitry – Controlling the condition of display elements

Reexamination Certificate

Details

C345S215000, C345S215000, C345S960000, C707S793000

active

06366296

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of Invention
This invention relates to a method and apparatus for reviewing an aural and/or visual and/or other representation of a media file. Specifically, the invention relates to using media content features to allow a user to more easily review a media file.
2. Description of Related Art
Text documents often have many cues, such as headings, paragraphs, punctuation, etc., that allow a reader to quickly determine the beginning and end of different sections of the document and to aid the reader in finding areas of interest. However, video and audio browsing systems typically do not provide information to the user regarding simple features, like section beginning and end points, much less more complicated information, like the name of the speaker on a video clip. Such browsing systems typically offer only standard “VCR-type” playback control options, like play, stop, rewind, and fast forward. As anyone who has tried to find a specific video clip on a conventional video tape using a standard VCR will understand, it is often difficult to locate portions of interest in a video using the standard playback controls.
Many techniques exist for extracting information that represents the feature content of a media file. In this application, the term media or media file is used to represent any data stream that contains information regarding video or other image information, audio information, text information and/or other information. A feature of a media file is a property of the video, audio and/or text information in the media file, such as video or audio format, or information relating to the content of the media file, such as the identity of a speaker depicted in a video sequence, occurrences of applause, video shot boundaries, or motion depicted in a video sequence. For example, Pfeiffer et al., “Automatic Audio Content Analysis,” ACM MULTIMEDIA 96, Boston, MA, 1996, pp. 21-30; Wilcox et al., “Segmentation of Speech Using Speaker Identification,” Proc. ICASSP 94, vol. S1, Apr. 1994, pp. 161-164; and Foote, “Rapid Speaker ID Using Discrete MMI Feature Quantisation,” Expert Systems with Applications, vol. 13, no. 4, 1997, pp. 283-289, describe various methods for identifying audio features, such as music, human speech, and speaker identity. Regarding video data, Boreczky et al., “Comparison of Video Shot Boundary Detection Techniques,” Proc. SPIE Conf. on Storage and Retrieval for Still Image and Video Databases IV, San Jose, CA, vol. 2670, Feb. 1996, pp. 170-179, and Zhang et al., “Automatic Partitioning of Full-motion Video,” Multimedia Systems, vol. 1, no. 1, 1993, pp. 10-28, disclose methods for identifying shot boundaries (radical changes in video content) and motion. As described in these and other similar references, features in a media file can be identified automatically using any of a number of different techniques.
SUMMARY OF THE INVENTION
Providing feature information in a media browsing system can be very useful for a user when identifying areas of interest in a media file, controlling media playback, editing a media file, or performing other operations with a media file. For example, graphically identifying areas in a media file where a particular speaker is shown on a video clip can allow a user to quickly find and play back those portions that contain the speaker.
Providing feature information to the user based on automatically identified features also eliminates the need for a user to manually index or otherwise mark significant portions of the media file for later retrieval. Thus, the invention can use existing methods for automatically identifying features in a media file to generate and provide feature information to a user to aid the user in browsing the media file.
The invention provides a media browser that uses media feature information as an aid in navigating, selecting, editing, and/or annotating a media file.
In one aspect of the invention, media features are selected by a user.
In one aspect of the invention, media browsing functions, such as play, rewind, stop, fast-forward, index, automatic slide show, and automatic preview, are controlled based on feature information.
In one aspect of the invention, feature information for a selected feature is mapped to a corresponding confidence score.
In one aspect of the invention, the media browser includes a feature indicator that provides information related to a corresponding selected feature based on a corresponding confidence score.
In one aspect of the invention, a feature indicator combines at least two confidence scores and provides information based on the combination.
In one aspect of the invention, a feature indicator provides information related to a confidence score based on a value of another confidence score.
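The two preceding aspects, combining confidence scores and reporting one score based on the value of another, might be sketched as follows. This is only an illustration: the function names `combine_scores` and `gate_score`, the choice of `min` as the default combining rule, and the 0.5 gate threshold are all assumptions, since the text does not specify how scores are combined.

```python
from typing import Callable, List, Sequence


def combine_scores(
    a: Sequence[float],
    b: Sequence[float],
    rule: Callable[[float, float], float] = min,
) -> List[float]:
    """Combine two equal-length confidence tracks element by element.

    `rule` is a hypothetical combining function: `min` behaves like a
    fuzzy AND of the two features, `max` like a fuzzy OR.
    """
    if len(a) != len(b):
        raise ValueError("confidence tracks must have the same length")
    return [rule(x, y) for x, y in zip(a, b)]


def gate_score(
    score: Sequence[float],
    gate: Sequence[float],
    gate_threshold: float = 0.5,
) -> List[float]:
    """Report `score` only where `gate` reaches the threshold, else 0.0."""
    if len(score) != len(gate):
        raise ValueError("confidence tracks must have the same length")
    return [s if g >= gate_threshold else 0.0 for s, g in zip(score, gate)]


# Hypothetical per-second confidence tracks for two features.
speaker = [0.9, 0.2, 0.8]
applause = [0.1, 0.7, 0.6]

both = combine_scores(speaker, applause)          # [0.1, 0.2, 0.6]
either = combine_scores(speaker, applause, max)   # [0.9, 0.7, 0.8]
gated = gate_score(speaker, applause)             # [0.0, 0.2, 0.8]
```

A feature indicator built on `both` would highlight only the times when both features are likely present, while `gated` shows the speaker score only during likely applause.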
The invention also provides a method for browsing a media file. A feature of the media file being browsed is selected and information related to a confidence score for at least one selected feature is provided. The confidence score relates to the existence of a corresponding selected feature in the media file. Based on the information related to the confidence score, a portion of the media file is selected.
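One way to picture the last step of this method, selecting a portion of the media file from confidence information, is a simple threshold over a time-wise confidence track. The sketch below is an assumption-laden illustration and not the patented procedure: `select_segments`, the fixed `threshold`, and the uniform `step_seconds` sampling interval are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def select_segments(
    confidence: Sequence[float],
    threshold: float = 0.5,
    step_seconds: float = 1.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of runs where confidence >= threshold.

    Assumes one confidence value per `step_seconds` of media.
    """
    segments: List[Tuple[float, float]] = []
    start: Optional[int] = None
    for i, c in enumerate(confidence):
        if c >= threshold and start is None:
            start = i  # a run of high confidence begins here
        elif c < threshold and start is not None:
            segments.append((start * step_seconds, i * step_seconds))
            start = None
    if start is not None:
        # the track ended while still inside a high-confidence run
        segments.append((start * step_seconds, len(confidence) * step_seconds))
    return segments


# Hypothetical confidence track for a selected feature, one value per second.
track = [0.1, 0.8, 0.9, 0.2, 0.7]
portions = select_segments(track)  # [(1.0, 3.0), (4.0, 5.0)]
```

A browser could use the returned intervals as index points, for example to jump playback to the start of the next portion that likely contains the selected feature.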
In one aspect of the invention, a metadata value representing a time-wise evaluation of a feature in the media file is mapped to a corresponding confidence score.
In one aspect of the invention, mapping of a metadata value to a corresponding confidence score is non-linear.
In one aspect of the invention, mapping of a metadata value to a confidence score is dependent on a user-defined control value or values.
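A non-linear mapping that depends on user-defined control values could, for instance, take the form of a logistic curve. The sketch below assumes this form purely for illustration; the text does not name a specific function, and `metadata_to_confidence`, `threshold`, and `gain` are hypothetical names for the mapping and its two control values.

```python
import math
from typing import List, Sequence


def metadata_to_confidence(
    value: float, threshold: float = 0.5, gain: float = 10.0
) -> float:
    """Map a raw metadata value to a confidence score in (0, 1).

    `threshold` is the metadata value that maps to a confidence of 0.5;
    `gain` sets how sharply confidence rises around that point. Both are
    assumed to be user-adjustable control values.
    """
    # Clamp the exponent so math.exp cannot overflow for extreme inputs.
    z = max(-60.0, min(60.0, gain * (value - threshold)))
    return 1.0 / (1.0 + math.exp(-z))


def map_track(
    values: Sequence[float], threshold: float = 0.5, gain: float = 10.0
) -> List[float]:
    """Apply the mapping to a time-wise sequence of metadata values."""
    return [metadata_to_confidence(v, threshold, gain) for v in values]


# Hypothetical metadata values, e.g. a per-second speaker-match measure.
scores = map_track([0.1, 0.5, 0.9])
```

Raising `gain` makes the mapping behave more like a hard threshold, while lowering it approaches a gentle, nearly linear ramp; moving `threshold` shifts which metadata values count as likely occurrences of the feature.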


REFERENCES:
patent: 5136655 (1992-08-01), Bronson
patent: 5493677 (1996-02-01), Balogh et al.
patent: 5579471 (1996-11-01), Barber et al.
patent: 5655058 (1997-08-01), Balasubramanian et al.
patent: 5664227 (1997-09-01), Mauldin et al.
patent: 5708767 (1998-01-01), Yeo et al.
patent: 5729471 (1998-03-01), Jain et al.
patent: 5893110 (1999-04-01), Weber et al.
patent: 5987459 (1999-11-01), Swanson et al.
patent: 6055543 (2000-04-01), Christensen et al.
“Fully-Digital GML-Based Authoring and Delivery System for Hypermedia,” IBM Technical Disclosure Bulletin, vol. 35, No. 2, p. 458-463, Jul. 1992.*
Ulrich Thiel, “Multimedia management and query processing issues in distributed digital libraries: a HERMES perspective,” IEEE, p. 84-89, Jul. 1992.*
M.G. Christal, M.A. Smith, C.R. Taylor, and D.B. Winkler, “Evolving Video Skims into Useful Multimedia Abstraction,” in Human Factors in Computing Systems, CHI 94 Conference Proceedings (Los Angeles, CA), New York: ACM, pp. 171-178, 1998.
A. Hampapur, A. Gupta, B. Horowitz, C.-F. Shu, C. Fuller, J. Bach, M. Gorkani, R. Jain, “Virage Video Engine,” in Storage and Retrieval for Still Image and Video Databases V, Proc. SPIE 3022 (San Jose, CA), pp. 188-197, 1997.
A.G. Hauptmann, M.J. Witbrock, A.I. Rudnicky, S. Reed, “Speech for Multimedia Information Retrieval,” in Proceedings of the ACM Symposium on User Interface Software and Technology, UIST'95 (Pittsburgh, PA), New York: ACM, pp. 79-80, 1995.
T. Shimizu, S.W. Smoliar, J. Boreczky, “AESOP: An Outline-Oriented Authoring System,” in 31st Annual HICSS Conference, vol. 2, pp. 207-215, Jan. 1998.
Y. Tonomura, A. Akutsu, K. Otsuji, T. Sadakata, “VideoMAP and VideoSpaceIcon: Tools for Anatomizing Video Content,” in Proc. ACM Interchi '93, pp. 131-141, 1993.
M.M. Yeung, B.L. Yeo, W. Wolf and B. Liu, “Video Browsing using Clustering and Scene Transitions on Compressed Sequences,” in SPIE vol. 2417 Multimedia Computing and Networking 1995, pp. 399-413, 1995.
F. Arman, R. Depommier, A. Hsu, M.-Y. Chiu, “Content-based Browsing of Video Sequences,” Proc. ACM Multimedia 94, San Francisco, Oct. 1994, pp. 97-103.
M.A. Hearst, “TileBars: Visualization of Term Distribution Information in Full Text Information Access,” in Proc. ACM SIGCHI, pp. 59-66, May 1995.
M. G. Brown, J. T. Foote, G. J. F. Jones, K. Spärck Jones and S. J. Young, “Automatic Content-Based Retrieval of Broadcast News,” in Proc. ACM Multimedia 95, pp. 35-43, San Francisco, Nov. 1995.
J. T. Foote, “An O
