Signal processing method and video/voice processing device

Image analysis – Pattern recognition – Feature extraction

Reexamination Certificate


Details

C382S173000, C382S305000

active

06744922

ABSTRACT:

TECHNICAL FIELD
The present invention relates to a signal processing method for detecting and analyzing a pattern that reflects the semantics underlying a signal, and to a video signal processor for detecting and analyzing a visual and/or audio pattern that reflects the semantics underlying a video signal.
BACKGROUND ART
It is often desirable to search a video application composed of a large amount of varied video data, such as a television program recorded on a video recorder, for a desired part to play back.
As a typical image extraction technique for extracting desired visual content, a story board has been proposed: a panel formed from a sequence of images that define the main scenes in a video application. Namely, a story board is prepared by decomposing video data into so-called shots and displaying a representative image of each shot. Most image extraction techniques automatically detect and extract shots from video data, as disclosed in “G. Ahanger and T. D. C. Little: A Survey of Technologies for Parsing and Indexing Digital Video, Journal of Visual Communication and Image Representation 7: 28-4, 1996”, for example.
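The story-board idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each frame is a flat list of 8-bit intensity values, that a shot boundary is declared where the histogram difference between consecutive frames exceeds a threshold, and that the middle frame of each shot serves as its representative image. The names `histogram`, `split_into_shots`, and `story_board`, the bin count, and the threshold are hypothetical and are not taken from the cited survey.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame (flat list of pixel values)."""
    counts = [0] * bins
    for p in frame:
        counts[min(p * bins // max_value, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def split_into_shots(frames, threshold=0.5):
    """Group frame indices into shots.

    Assumption: a new shot starts whenever the L1 distance between the
    histograms of consecutive frames exceeds `threshold`.
    """
    if not frames:
        return []
    shots = [[0]]
    for i in range(1, len(frames)):
        h_prev, h_cur = histogram(frames[i - 1]), histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(h_prev, h_cur))
        if diff > threshold:
            shots.append([i])      # abrupt change: open a new shot
        else:
            shots[-1].append(i)    # same shot continues
    return shots


def story_board(frames, threshold=0.5):
    """Return one representative frame index (the middle one) per shot."""
    return [shot[len(shot) // 2] for shot in split_into_shots(frames, threshold)]


# Toy input: two dark frames, three bright frames, one dark frame.
frames = [
    [10, 20, 30, 40],
    [12, 22, 28, 41],
    [200, 210, 220, 230],
    [205, 215, 225, 235],
    [198, 208, 222, 231],
    [15, 25, 35, 45],
]
shots = split_into_shots(frames)
board = story_board(frames)
print(shots, board)
```

Even in this toy input the first and last shots show similar content yet appear as separate story-board entries, which is the redundancy discussed next.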
It should be noted that a typical half-hour TV program, for example, contains hundreds of shots. Therefore, with the above conventional image extraction technique of G. Ahanger and T. D. C. Little, the user has to examine a story board listing an enormous number of extracted shots. Understanding such a story board is a great burden on the user. Consider also a dialogue scene in which, for example, two persons are talking. In the dialogue, the camera shoots the two persons alternately, cutting to each of them whenever he or she speaks. Therefore, many of the shots extracted by the conventional image extraction technique are redundant. The shots contain much useless information because they are at too low a level to serve as the objects from which a video structure is extracted. Thus, the conventional image extraction technique cannot be said to be convenient for the user who extracts such shots.
In addition to the above, further image extraction techniques have been proposed, as disclosed in “A. Merlino, D. Morey and M. Maybury: Broadcast News Navigation Using Story Segmentation, Proceeding of ACM Multimedia 97, 1997” and the Japanese Unexamined Patent Publication No. 10-136297, for example. However, these techniques can be used only with highly specialized knowledge of limited genres of content, such as news and football games. They can assure a good result when applied to such limited genres but are of no use for other genres. This limitation to special genres makes it difficult for the techniques to come into wide use.
Further, still another image extraction technique has been proposed, as disclosed in the U.S. Pat. No. 5,708,767, for example. It extracts a so-called story unit. However, this conventional image extraction technique is not completely automated, and the user's intervention is required to determine which shots have the same content. This technique also needs complicated computation for signal processing and is applicable only to video information.
Furthermore, still another image extraction technique has been proposed, as in the Japanese Unexamined Patent Publication No. 9-214879, for example, in which shots are identified by a combination of shot detection and silent period detection. However, this conventional technique can be used only when a silent period coincides with a boundary between shots.
Moreover, yet another image extraction technique has been proposed, as disclosed in “H. Aoki, S. Shimotsuji and O. Hori: A Shot Classification Method to Select Effective Key-Frames for Video Browsing, IPSJ Human Interface SIG Notes, 7:43-50, 1996” and the Japanese Unexamined Patent Publication No. 9-93588, for example, in which repeated similar shots are detected to reduce the redundancy of the depiction in a story board. However, this conventional image extraction technique is applicable only to visual information, not to audio information.
Further, the conventional image extraction techniques can detect only a so-called local video structure, or a general video structure that depends on special knowledge.
DISCLOSURE OF THE INVENTION
Accordingly, an object of the present invention is to overcome the above-mentioned drawbacks of the prior art by providing a signal processing method and a video signal processor that can extract a high-level video structure from a variety of video data.
The above object can be attained by providing a signal processing method for detecting and analyzing a pattern reflecting the semantics of the content of a signal, the method including, according to the present invention, the steps of: extracting, from a segment consisting of a sequence of consecutive frames that together form the signal, at least one feature which characterizes the properties of the segment; calculating, using the extracted feature, a criterion for measuring the similarity between a pair of segments for every extracted feature, and measuring the similarity between a pair of segments according to the similarity measurement criterion; and detecting, using the feature and the similarity measurement criterion, a similarity chain consisting of two or more segments that are similar to each other.
In the above signal processing method according to the present invention, a basic structure pattern of similar segments in the signal is detected.
The above object can also be attained by providing a video signal processor for detecting and analyzing a visual and/or audio pattern reflecting the semantics of the content of a supplied video signal, the apparatus including, according to the present invention: means for extracting, from a visual and/or audio segment consisting of a sequence of consecutive visual and/or audio frames that together form the video signal, at least one feature which characterizes the properties of the visual and/or audio segment; means for calculating, using the extracted feature, a criterion for measuring the similarity between a pair of visual segments and/or audio segments for every extracted feature, and measuring the similarity between a pair of visual segments and/or audio segments according to the similarity measurement criterion; and means for detecting, using the feature and the similarity measurement criterion, a similarity chain consisting of two or more visual and/or audio segments that are similar to each other.
In the above video signal processor according to the present invention, a basic structure pattern of similar visual and/or audio segments in the video signal is detected.


REFERENCES:
patent: 5664227 (1997-09-01), Mauldin et al.
patent: 5751377 (1998-05-01), Kadono
patent: 5821945 (1998-10-01), Yeo et al.
patent: 6278446 (2001-08-01), Liou et al.
patent: 0 711 078 (1996-05-01), None
patent: 2-59976 (1990-02-01), None
patent: 7-193748 (1995-07-01), None
patent: 8-181995 (1996-07-01), None
patent: 10-257436 (1998-09-01), None
Yeung, et al. “Extracting story units from long programs for video browsing and navigation”, IEEE, pp. 296-305, 1996.*
Taskiran, et al. “A compressed video database structured for active browsing and search”, IEEE, pp. 133-137, 1998.*
Yeung, et al. “Time-constrained clustering for segmentation of video into story units”, IEEE, pp. 375-380, 1996.*
Yoshitaka, et al. “Content-based retrieval of video data by the grammar of film”, IEEE, pp. 310-317, 1997.*
Rui, et al. “Exploring video structure beyond the shots”, IEEE, pp. 1-4, 1998.*
Aoki, et al. “A shot classification method of selecting effective key-frames for video browsing”, ACM, pp. 1-10, 1996.*
Merlino, et al. “Broadcast news navigation using story segmentation”, ACM, pp. 381-391, 1997.*
Vasconcelos, et al. “Bayesian modeling of video editing and structure: semantic features for video summarization and browsing”, IEEE, pp. 153-157, 1998.*
Nakamura, et al. “Semantic analysi

