System and method for measuring similarity between a set of...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06675174

ABSTRACT:

FIELD OF THE INVENTION
This invention relates to the field of multimedia temporal information streams, such as video and audio, and databases of such streams. More specifically, the invention relates to the field of video and audio processing for detecting known reference streams in target streams and for retrieving streams from databases based upon measures of correlation or similarity between two streams. The invention also relates to similarity or correlation measurements of temporal media sequences.
BACKGROUND OF THE INVENTION
Traditional databases handle structured parametric data. Such databases can contain, for example, lists of a company's employees, their salaries, home addresses, years of service, etc. Information is easily retrieved from such databases by formulating parametric queries, e.g., ‘retrieve the salary of employee X’ or ‘how many employees with x<salary<y live in city Y.’
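The two parametric queries quoted above can be illustrated with a small sketch. It assumes an in-memory SQLite table; the table name, column names, sample rows, and the helper names `build_demo_db`, `salary_of` and `count_in_salary_range` are hypothetical and serve only as an illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a few made-up employee records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees "
        "(name TEXT, salary REAL, city TEXT, years_of_service INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?, ?)",
        [
            ("X", 52000.0, "Y", 4),
            ("A", 61000.0, "Y", 9),
            ("B", 75000.0, "Z", 12),
            ("C", 48000.0, "Y", 2),
        ],
    )
    return conn


def salary_of(conn, name):
    """'Retrieve the salary of employee X'."""
    row = conn.execute(
        "SELECT salary FROM employees WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


def count_in_salary_range(conn, low, high, city):
    """'How many employees with x < salary < y live in city Y'."""
    row = conn.execute(
        "SELECT COUNT(*) FROM employees "
        "WHERE salary > ? AND salary < ? AND city = ?",
        (low, high, city),
    ).fetchone()
    return row[0]
```

Both queries are answered from structured fields alone, which is what makes them easy compared with the content-based queries on media discussed in the rest of this section.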
Beyond data that can be represented in machine readable tabular form and, of course, machine readable text documents, many other forms of media are transitioning to machine readable digital form. For example, audio data such as speech and music and visual data such as images and video are increasingly produced in digital form or converted to digital form. Large collections and catalogues of these media objects need to be organized similarly to structured traditional parametric data, using database technology enhanced with new technologies that allow for convenient search and retrieval based on the visual or audio content of the media. Such collections of media are managed using multimedia databases, where the data that are stored are combinations of numerical, textual, auditory and visual data.
Audio and video are a special type of data object in the sense that there is a notion of time associated with the data. This type of data is referred to as streamed information, streamed multimedia data or temporal media. When transporting this data from one location to another for viewing and/or listening purposes, it is important that the data arrive in the right order and at the right time. In other words, if frame n of a video is displayed at time t, frame n+1 has to be at the viewing location at time t plus 1/30th of a second. Of course, if the media are moved or transported for other purposes, there is no such requirement.
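The timing requirement can be made concrete with a short sketch. It assumes a fixed frame period of 1/30 second, as in the example above, and uses exact fractions to avoid rounding; the names `display_time` and `late_frames` are hypothetical.

```python
from fractions import Fraction

# Assumed frame period: 1/30 of a second, as in the example in the text.
FRAME_PERIOD = Fraction(1, 30)


def display_time(n, t0=Fraction(0), period=FRAME_PERIOD):
    """Time at which frame n must be shown if frame 0 is shown at t0."""
    return t0 + n * period


def late_frames(arrival_times, t0=Fraction(0), period=FRAME_PERIOD):
    """Indices of frames that arrive after their display deadline."""
    return [
        n
        for n, arrived in enumerate(arrival_times)
        if arrived > display_time(n, t0, period)
    ]
```

For example, with arrival times of 0, 1/30 and 1/10 second, only frame 2 misses its deadline of 2/30 second.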
Similarly to text documents, which can be segmented into sections, paragraphs and sentences, temporal media data can be divided up into smaller, more or less meaningful time-continuous chunks. For video data, these chunks are often referred to as scenes, segments and shots, where a shot is the continuous depiction of space by a single camera between the time the camera is switched on and switched off, i.e., it is an image of continuous space-time. In this disclosure, we refer to these temporal, time-continuous (but not necessarily space-continuous) chunks of media as media segments or temporal media segments. These media segments include video and audio segments and, in general, information stream segments. Examples of media segments are commercial segments (or groups) broadcast at regular time intervals on almost every TV channel; a single commercial is another example of a media segment or video segment.
Multimedia databases may contain collections of such temporal media segments in addition to non-streamed media objects such as images and text documents. Associated with the media segments may be global textual or parametric data, such as the director of the video/music (audio) or the date of recording. Searching these global keys of multimedia databases to retrieve temporal media segments can be accomplished through traditional parametric and text search techniques. Multimedia databases may also be searched on data content, such as the amount of green or red in images or video and the sound frequency components of audio segments. The databases then have to be preprocessed and the results indexed so that these data computations do not have to be performed by a linear search through the database at query time. Searching audio and video databases on semantic content, the actual meaning (subjects and objects) of the audio and video segments, on the other hand, is a difficult issue. For audio, a speech transcript may be computed using speech recognition technology; for video, speech may be recognized, but beyond that, the situation is much more complicated because of the rudimentary state of the art in machine interpretation of visual data.
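As a rough illustration of searching on data content with a precomputed index, the sketch below computes the share of red in each image once and stores the results in sorted order, so that a threshold query becomes a binary search instead of a pass over every image. The pixel representation (lists of (r, g, b) tuples), the feature definition, and the names `red_fraction`, `build_index` and `query_min_red` are all assumptions made for this example, not a method described in the source.

```python
import bisect


def red_fraction(pixels):
    """Share of total intensity carried by the red channel (0.0 to 1.0)."""
    total = sum(r + g + b for r, g, b in pixels)
    if total == 0:
        return 0.0
    return sum(r for r, _, _ in pixels) / total


def build_index(images):
    """Preprocessing step: sorted list of (red_fraction, name) pairs.

    `images` maps a name to a list of (r, g, b) pixel tuples.
    """
    return sorted((red_fraction(p), name) for name, p in images.items())


def query_min_red(index, threshold):
    """Names of images whose red fraction is at least `threshold`.

    The one-element tuple (threshold,) sorts before any (threshold, name)
    pair, so bisect_left lands on the first qualifying entry.
    """
    pos = bisect.bisect_left(index, (threshold,))
    return [name for _, name in index[pos:]]
```

The feature computation happens once, when the index is built; at query time only the sorted feature values are consulted.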
Determining whether a given temporal media segment is a member or segment, or is similar to a member or segment, of a plurality of temporal media streams, or determining whether it is equal or similar to a media segment or to a subsegment in a multimedia database, is another important multimedia database search or query. A variant here is the issue of determining whether a given temporal input media stream contains a segment which is equal or similar to one of a plurality of temporal media stream segments, or whether the input stream contains a segment which is equal or similar to a media segment in a multimedia database. To achieve this, one needs to compare a temporal media segment to a plurality of temporal media stream segments or to databases of such segments. This problem arises when certain media segments need to be selected or deselected in a given temporal media input stream or in a plurality of temporal media input streams. An example here is the problem of deselecting or suppressing repetitive media segments in a television broadcast program. Such repetitive media segments can be commercials or commercial segments or groups, which are suppressed either by muting the sound channel or by both muting the sound channel and blanking the visual channel.
Much of the prior art is concerned with the issue of commercial detection in temporal video streams. The techniques for detecting commercials or other specific program material can be broadly characterized by the method of commercial representation. These representations are: 1) global representations, 2) static frame-based representations, and 3) dynamic sequence-based representations. Three examples that use global representations or properties of commercials are:
An example of a method and apparatus for detection and identification of portions of temporal video streams containing commercials is described in U.S. Pat. No. 5,151,788 to Blum. Here, a blank frame is detected in the video stream, a timer is set for a given period after detection of a blank frame, and the video stream is tested for “activity” (properties such as sound level, brightness level and average shot length) during the period representative of a commercial advertisement. Here, two properties of commercials are used as global properties: that they start with a blank frame, and that the activity of a commercial is different from that of the surrounding video material. U.S. Pat. No. 5,696,866 to Iggulden et al. extends the idea of detecting a blank frame to what they call a “flat” frame, which has a constant signal throughout the frame or within a window within the frame. In addition to a frame being flat at the beginning and end of a commercial, Iggulden et al. require that the frame be silent, i.e., there should be no audio signal during the flat frame. Further, a commercial event is analyzed with respect to surrounding events to determine whether this segment of the video stream is part of a commercial message or part of the regular program material.
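The general idea of a flat, silent frame marking a commercial boundary can be sketched as follows. This is a simplified illustration only, not the method claimed in either cited patent: frames are assumed to be non-empty lists of luminance values, audio is assumed to be one level per frame, and every threshold, duration bound and function name is hypothetical.

```python
def is_flat(frame, tolerance=2):
    """A frame is 'flat' if its signal is (nearly) constant throughout.

    `frame` is assumed to be a non-empty list of luminance values.
    """
    return max(frame) - min(frame) <= tolerance


def is_silent(audio_level, threshold=0.01):
    """Treat an audio level at or below the threshold as no audio signal."""
    return audio_level <= threshold


def boundary_frames(frames, audio_levels, tolerance=2, silence_threshold=0.01):
    """Indices of frames that are both flat and silent."""
    return [
        i
        for i, (frame, level) in enumerate(zip(frames, audio_levels))
        if is_flat(frame, tolerance) and is_silent(level, silence_threshold)
    ]


def candidate_segments(boundaries, fps=30, min_seconds=10, max_seconds=60):
    """Pairs of consecutive boundaries whose spacing looks commercial-sized.

    The duration bounds are illustrative assumptions, standing in for the
    timer and surrounding-event analysis described in the text.
    """
    segments = []
    for start, end in zip(boundaries, boundaries[1:]):
        duration = (end - start) / fps
        if min_seconds <= duration <= max_seconds:
            segments.append((start, end))
    return segments
```

A fuller detector would also test the "activity" between boundaries (sound level, brightness, average shot length), which this sketch omits.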
U.S. Pat. No. 5,151,788 to Blum and U.S. Pat. No. 5,696,866 to Iggulden et al. detect commercials and commercial groups based on representations of commercials which are coarse and determined by examining global properties of commercials and, most importantly, the fact that a commercial is surrounded by two blank frames. Additional features of the video signal are used in U.S. Pat. No. 5,343,251 to Nafeh. Here features such as changes in the audio power or amplitude and changes in
