Interactive video distribution systems – Use surveying or monitoring – By passively monitoring receiver operation
Reexamination Certificate
2002-11-27
2004-07-20
Srivastava, Vivek (Department: 2611)
C725S018000, C725S022000
active
06766523
ABSTRACT:
BACKGROUND
1. Technical Field
The invention is related to media stream identification and segmentation, and in particular, to a system and method for identifying and extracting repeating audio and/or video objects from one or more streams of media such as, for example, a media stream broadcast by a radio or television station.
2. Related Art
There are many existing schemes for identifying audio and/or video objects such as particular advertisements, station jingles, or songs embedded in an audio stream, or advertisements or other videos embedded in a video stream. For example, with respect to audio identification, many such schemes are referred to as “audio fingerprinting” schemes. Typically, audio fingerprinting schemes take a known object, and reduce that object to a set of parameters, such as, for example, frequency content, energy level, etc. These parameters are then stored in a database of known objects. Sampled portions of the streaming media are then compared to the fingerprints in the database for identification purposes.
Thus, in general, such schemes typically rely on a comparison of the media stream to a large database of previously identified media objects. In operation, such schemes often sample the media stream over a desired period using some sort of sliding window arrangement, and compare the sampled data to the database in order to identify potential matches. In this manner, individual objects in the media stream can be identified. This identification information is typically used for any of a number of purposes, including segmentation of the media stream into discrete objects, or generation of play lists or the like for cataloging the media stream.
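As an illustration only, the conventional database-driven approach described above might be sketched as follows. Everything here is an assumption made for the example rather than a detail from any particular fingerprinting scheme: the names `fingerprint`, `build_database`, and `scan_stream` are hypothetical, the "parameters" are a coarse quantized energy value per frame, every known object is assumed to be exactly one window long, and matching is exact, whereas practical schemes tolerate noise.

```python
from typing import Dict, List, Sequence, Tuple

Fingerprint = Tuple[int, ...]


def fingerprint(samples: Sequence[float], frame: int = 4) -> Fingerprint:
    """Reduce samples to a small parameter set: one quantized energy per frame.

    Assumption for this sketch: mean squared amplitude per frame, scaled by 10
    and rounded, stands in for the "frequency content, energy level, etc."
    that a real scheme would compute.
    """
    energies = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        energy = sum(x * x for x in chunk) / frame
        energies.append(int(round(energy * 10)))
    return tuple(energies)


def build_database(known: Dict[str, Sequence[float]]) -> Dict[Fingerprint, str]:
    """Store the fingerprint of each previously identified object."""
    return {fingerprint(samples): name for name, samples in known.items()}


def scan_stream(
    stream: Sequence[float],
    database: Dict[Fingerprint, str],
    window: int,
    hop: int = 1,
) -> List[Tuple[int, str]]:
    """Slide a window over the stream and look each window up in the database."""
    hits = []
    for start in range(0, len(stream) - window + 1, hop):
        name = database.get(fingerprint(stream[start:start + window]))
        if name is not None:
            hits.append((start, name))
    return hits


# Usage: a known 8-sample "jingle" embedded in an otherwise silent stream.
jingle = [0.1, 0.9, 0.1, 0.9, 0.6, 0.6, 0.6, 0.6]
database = build_database({"jingle": jingle})
stream = [0.0] * 5 + jingle + [0.0] * 5
hits = scan_stream(stream, database, window=len(jingle))
```

The sketch makes the dependency explicit: `scan_stream` can only report objects whose fingerprints were placed in `database` beforehand.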
However, as noted above, such schemes require the use of a preexisting database of pre-identified media objects for operation. Without such a preexisting database, identification and/or segmentation of the media stream is not possible when using the aforementioned conventional schemes.
Therefore, what is needed is a system and method for efficiently identifying and extracting or segmenting repeating media objects from a media stream such as a broadcast radio or television signal without the need to use a preexisting database of pre-identified media objects.
SUMMARY
An “object extractor” as described herein automatically identifies and segments repeating objects in a media stream comprised of repeating and non-repeating objects. An “object” is defined to be any section of non-negligible duration that would be considered to be a logical unit, when identified as such by a human listener or viewer. For example, a human listener can listen to a radio station, or listen to or watch a television station or other media broadcast stream and easily distinguish between non-repeating programs, and advertisements, jingles, and other frequently repeated objects. However, automatically distinguishing the same, e.g., repeating, content in a media stream is generally a difficult problem.
For example, an audio stream derived from a typical pop radio station will contain, over time, many repetitions of the same objects, including, for example, songs, jingles, advertisements, and station identifiers. Similarly, an audio/video media stream derived from a typical television station will contain, over time, many repetitions of the same objects, including, for example, commercials, advertisements, station identifiers, program “signature tunes”, or emergency broadcast signals. However, these objects will typically occur at unpredictable times within the media stream, and are frequently corrupted by noise caused by any acquisition process used to capture or record the media stream.
Further, objects in a typical media stream, such as a radio broadcast, are often corrupted by voice-overs at the beginning and/or end point of each object. Further, such objects are frequently foreshortened, i.e., they are not played completely from the beginning or all the way to the end. Additionally, such objects are often intentionally distorted. For example, audio broadcast via a radio station is often processed using compressors, equalizers, or any of a number of other time/frequency effects. Further, audio objects, such as music or a song, broadcast on a typical radio station are often cross-faded with the preceding and following music or songs, thereby obscuring the audio object start and end points, and adding distortion or noise to the object. Such manipulation of the media stream is well known to those skilled in the art. Finally, it should be noted that any or all of such corruptions or distortions can occur either individually or in combination, and are generally referred to as “noise” in this description, except where they are explicitly referred to individually. Consequently, identification of such objects and locating the endpoints for such objects in such a noisy environment is a challenging problem.
The object extractor described herein successfully addresses these and other issues while providing many advantages. For example, in addition to providing a useful technique for gathering statistical information regarding media objects within a media stream, automatic identification and segmentation of the media stream allows a user to automatically access desired content within the stream, or, conversely, to automatically bypass unwanted content in the media stream. Further advantages include the ability to identify and store only desirable content from a media stream; the ability to identify targeted content for special processing; the ability to de-noise, or clear up, any multiply detected objects; and the ability to archive the stream more efficiently by storing only a single copy of multiply detected objects.
As noted above, a system and method for automatically identifying and segmenting repeating media objects in a media stream identifies such objects by examining the stream to determine whether previously encountered objects have occurred. For example, in the audio case, this would mean identifying songs as being objects that have appeared in the stream before. Similarly, in the case of video derived from a television stream, it can involve identifying specific advertisements, as well as station “jingles” and other frequently repeated objects. Further, such objects often convey important synchronization information about the stream. For example, the theme music of a news station conveys time and the fact that the news report is about to begin or has just ended.
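The idea of examining the stream for previously encountered content, without any preexisting database, can be illustrated with a minimal sketch. This is not the tested embodiment, whose details are not given in this text. It is a simplified stand-in under stated assumptions: the hypothetical helpers `window_signature` and `find_repeats` quantize each fixed-length window and remember where each signature was first seen, so that a later, non-overlapping window with the same signature is reported as a repeat. Exact matching is used for brevity. A noisy stream would require a tolerant comparison, and constant regions such as silence would need to be filtered out.

```python
from typing import Dict, List, Sequence, Tuple

Signature = Tuple[int, ...]


def window_signature(window: Sequence[float], levels: int = 10) -> Signature:
    """Quantize each sample so that identical content yields an identical key."""
    return tuple(int(round(x * levels)) for x in window)


def find_repeats(
    stream: Sequence[float], window: int, hop: int = 1
) -> List[Tuple[int, int]]:
    """Return (earlier_start, later_start) pairs where window content recurs.

    The only "database" is built from the stream itself as it is scanned.
    """
    first_seen: Dict[Signature, int] = {}
    repeats = []
    for start in range(0, len(stream) - window + 1, hop):
        sig = window_signature(stream[start:start + window])
        earlier = first_seen.get(sig)
        if earlier is None:
            first_seen[sig] = start
        elif start - earlier >= window:  # ignore overlapping self-matches
            repeats.append((earlier, start))
    return repeats


# Usage: the same 4-sample "jingle" occurs twice among non-repeating content.
jingle = [0.2, 0.8, 0.4, 0.6]
stream = [0.1, 0.3] + jingle + [0.5, 0.7, 0.9] + jingle + [0.0]
repeats = find_repeats(stream, window=len(jingle))
```

Here the jingle starting at sample 2 is matched against its second occurrence at sample 9, and nothing else in the stream is flagged.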
For example, given an audio stream which contains objects that repeat and objects that do not repeat, the system and method described herein automatically identifies and segments repeating media objects in the media stream, while identifying object endpoints by a comparison of matching portions of the media stream or matching repeating objects. Using broadcast audio, i.e., radio, as an example, “objects” that repeat may include, for example, songs on a radio music station, call signals, jingles, and advertisements.
Examples of objects that do not repeat may include, for example, live chat from disc jockeys, news and traffic bulletins, and programs or songs that are played only once. These different types of objects have different characteristics that allow for identification and segmentation from the media stream. For example, radio advertisements on a popular radio station are generally less than 30 seconds in length, and consist of a jingle accompanied by voice. Station jingles are generally 2 to 10 seconds in length, are mostly music and voice, and repeat very often throughout the day. Songs on a “popular” music station, as opposed to classical, jazz or alternative, for example, are generally 2 to 7 minutes in length and most often contain voice as well as music.
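The typical durations given above suggest a simple length-based screen, sketched below for illustration. The function name `candidate_types` is hypothetical, and treating the "generally" stated ranges as hard thresholds is an assumption of this sketch. Because the ranges overlap (a 5-second object could be either a station jingle or a short advertisement), the function returns every label consistent with the duration rather than a single answer.

```python
from typing import List


def candidate_types(duration_s: float) -> List[str]:
    """Return the object types whose typical length range includes duration_s.

    Thresholds are taken from the ranges described above and are heuristics,
    not strict limits.
    """
    labels = []
    if 2 <= duration_s <= 10:
        labels.append("station jingle")   # generally 2 to 10 seconds
    if duration_s < 30:
        labels.append("advertisement")    # generally less than 30 seconds
    if 120 <= duration_s <= 420:
        labels.append("song")             # generally 2 to 7 minutes
    return labels


# Usage: screen a few detected object lengths, in seconds.
for seconds in (5, 20, 200, 60):
    print(seconds, candidate_types(seconds))
```

Duration alone cannot separate a jingle from a short advertisement, so other characteristics noted above, such as how often an object repeats or whether it is mostly music or voice, would be needed to choose between overlapping labels.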
In general, automatic identification and segmentation of repeating media objects is achieved by comparing portions of the media stream to locate regions or portions within the media stream where media content is being repeated. In a tested embodiment, identification and segmentation of repeating objects is achi
Brown Reuben M.
Lyon & Harr LLP
Microsoft Corporation
Srivastava Vivek
Watson Mark A.