Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-08-24
2001-06-05
Feild, Joseph H (Department: 2176)
C707S793000, C707S793000, C345S215000, C704S231000, C704S235000
active
06243713
ABSTRACT:
BACKGROUND
1. Field of the Invention
The present invention relates generally to information retrieval systems, and more particularly, to information retrieval systems for retrieval of multimedia information.
2. Background of the Invention
Current computer systems enable users to create complex documents that combine both text and images in an integrated whole. In addition, computer users can now insert digitally recorded audio or video directly into such documents, to create rich multimedia documents. In these documents the image, audio, or video components are either directly embedded into the data of the document at specific positions, or are stored external to the document and referenced with referencing data. An example of the former construction is Rich Text Format (RTF) documents, which embed image data directly into the document. An example of the latter construction is HyperText Markup Language (HTML) documents, which use references to external image, audio, or video files to construct the document, where the references have specific locations in the text data. Generally, documents in which two or more different types of multimedia components are embedded or referenced are herein termed “compound documents.”
Separately, both text and image retrieval databases are known, and generally operate independently of each other. Text retrieval systems are designed to index documents based on their text data, and to retrieve text documents in response to text-only queries. Image retrieval systems are designed to index images based either on image characteristics directly (e.g. color, texture, and the like), or on textual keywords provided by users which describe or categorize the image (e.g. “sunset”, “blue sky”, “red”), and to retrieve images in response to a query containing one or more of these items of data. In particular, the images exist in the database independently of any compound document, and the keyword labels typically form merely another column or data attribute of the image, but do not come from the text of a compound document itself. Further, the index of images also exists in the database independently of the text or column indexes. There is no single index that considers the whole of the compound document and all of its multimedia components. For example, a conventional relational multimedia database might use an image table with columns for image ID, descriptive text string, image data, and category label(s). A user would then request an image by specifying some text keywords or category labels, which are processed into a query such as:
SELECT ID
FROM image_table
WHERE TEXT LIKE “sunrise”
AND IMAGE LIKE “IMAGE ID FOO”
AND CATEGORY = “HISTORY”
Matching on the “image like” operator would then use some type of image data comparison (e.g. matching of color histograms) which is already indexed into the database, along with conventional text matching. However, the result is still merely the retrieval of matching images, not compound documents containing images (let alone other types of multimedia data). An example of an image retrieval system that merely retrieves images in response to image characteristics or text labels is U.S. Pat. No. 5,579,471 issued to IBM for their “QBIC” image retrieval system.
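The color histogram comparison mentioned above can be sketched as follows. This is only an illustration in Python, assuming a coarse RGB quantization and the histogram intersection measure; the function names, the bin count, and the sample pixel lists are hypothetical and are not taken from the patent or from the QBIC system.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize (r, g, b) pixels (0-255 per channel) into a normalized histogram."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of the per-bin minimum of two normalized histograms."""
    return sum(min(v, h2.get(bin_, 0.0)) for bin_, v in h1.items())


# Hypothetical pixel data standing in for decoded images.
sunset = [(250, 120, 30)] * 6 + [(40, 20, 90)] * 2
sunrise = [(245, 110, 40)] * 5 + [(60, 30, 100)] * 3
forest = [(20, 140, 30)] * 8

similar = histogram_intersection(color_histogram(sunset), color_histogram(sunrise))
different = histogram_intersection(color_histogram(sunset), color_histogram(forest))
```

Under these assumptions the two warm-toned images score high and the green image scores zero, but in a conventional system such a comparison still only ranks images against images.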
Another limitation of conventional systems is that they do not expand a user's query with multiple different types of multimedia data, which is then subsequently used to retrieve matching documents. For example, current systems do not take a user's text query, add image data (or portions thereof, e.g. a color histogram) to the query, and then search for documents, including text and images, that satisfy the expanded query.
Accordingly, it is desirable to provide a system, method, and software product that retrieves compound documents in response to queries that include various multimedia elements in a structured form, including text, image features, audio, or video.
SUMMARY OF THE INVENTION
The present invention overcomes the limitations of conventional text or image retrieval systems by providing a completely integrated approach to multimedia document retrieval. The present invention indexes compound documents, including multimedia components such as text, images, audio, or video components, into a unified, common index, and then receives and processes compound queries that contain any such multimedia components against the index to retrieve compound documents that satisfy the query.
More particularly, the present invention takes advantage of the fact that compound documents are structured in that the multimedia components have specific positions within the document. For example, FIG. 1a illustrates a sample compound document 100, having text components 101a, an image component 101b, an audio component 101c, and a video component 101d. FIG. 1b illustrates that these components have specifically defined positions within the actual data of the document, such as character or byte offsets from the beginning of the document. The image 101b of the sunset, for example, is at the 244th character, the audio recording 101c at the 329th character, and the video recording 101d at the 436th character. The words of the document also have character offsets. Using this position information, the present invention constructs and maintains a unified multimedia index, which indexes the compound documents and their multimedia components. In particular, each multimedia component is indexed by associating it with each document that contains the component, specifying its position within the document, along with data descriptive of, or derived from, its component content. For example, assume a document contains an image at the 90th character position and recorded audio data at the 100th character position. The image would be indexed to reflect its position within the document at the 90th character, along with, for example, color histogram data descriptive of color, texture map data descriptive of texture, edge data descriptive of edges or lines, and luminance data descriptive of image intensity. Alternatively, each of these elements of the image may be separately stored in the multimedia index, each with data identifying the document and the position of the image in the document. The audio data would be indexed by speech recognition of words or phonemes, each of which is indexed to reflect the audio's position at the 100th character, and further optionally indexed to reflect its relative time offset in the recorded audio. Thus, a single compound document can be indexed with respect to any number of multimedia components (or portions thereof), with the multimedia index reflecting the position of the multimedia component or its portions within the document.
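The indexing scheme described above can be sketched as a single posting structure shared by all component types. The following Python sketch is illustrative only: the names `Posting`, `MultimediaIndex`, and `index_text`, the `(kind, token)` key layout, and the sample data are assumptions, not the patent's actual data structures. Speech recognition is assumed to happen elsewhere and is represented here by a hand-written list of recognized words with time offsets.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Posting:
    doc_id: str
    position: int                         # character offset of the component in the document
    data: Tuple[float, ...] = ()          # descriptive data, e.g. a color histogram
    time_offset: Optional[float] = None   # seconds into an audio recording, if any


class MultimediaIndex:
    """One index for all component types, keyed by (kind, token)."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, kind, token, doc_id, position, data=(), time_offset=None):
        self._postings[(kind, token)].append(
            Posting(doc_id, position, tuple(data), time_offset)
        )

    def lookup(self, kind, token):
        return list(self._postings.get((kind, token), []))


def index_text(index, doc_id, text):
    """Index each word of the text at its (0-based) character offset."""
    for match in re.finditer(r"\w+", text):
        index.add("text", match.group().lower(), doc_id, match.start())


index = MultimediaIndex()
index_text(index, "doc1", "The beach at dusk")

# Image at character position 90: each image element is stored separately.
index.add("image", "color_histogram", "doc1", 90, data=(0.75, 0.25))
index.add("image", "luminance", "doc1", 90, data=(0.4,))

# Audio at character position 100: recognized words share that position
# and additionally carry their time offset within the recording.
for word, seconds in [("waves", 0.5), ("crashing", 1.1)]:
    index.add("audio", word, "doc1", 100, time_offset=seconds)
```

Keeping every posting in the same shape (document, position, optional data) is what would let a later query stage compare the positions of text, image, and audio components directly.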
With this multimedia index, the present invention can process true compound queries that include various types of multimedia components, and thus retrieve compound documents that best satisfy the query overall, and not merely satisfy a text query or an image query, as in conventional systems. More particularly, a compound query can include text, image, audio, or video components, and search operators that define logical relationships (e.g. AND, OR, NOT), or proximity relationships (e.g. “NEAR”, “within n”) between the components. For example, a compound query may require that the word “beach” appear within 10 words of an image of a sunset. The present invention uses the position information in the multimedia index, along with the indexed data (e.g. color data) to find compound documents that have the desired text and have an image with image characteristics (e.g. color, texture, luminance) which match the input query image within the defined proximity relationship.
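The “beach” within 10 words of a sunset image example can be sketched as a proximity test over postings. This Python sketch makes several simplifying assumptions that are not in the patent: positions are word ordinals rather than character offsets, color histograms are fixed-length tuples, similarity is histogram intersection, and the postings, threshold, and function names are hypothetical.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalized, equal-length histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical text postings: term -> [(doc_id, word_position)]
TEXT_POSTINGS = {"beach": [("doc1", 12), ("doc2", 3), ("doc3", 40)]}

# Hypothetical image postings: (doc_id, word_position, color_histogram)
IMAGE_POSTINGS = [
    ("doc1", 18, (0.7, 0.2, 0.1)),  # sunset-like colors, close to "beach"
    ("doc2", 95, (0.7, 0.2, 0.1)),  # sunset-like colors, far from "beach"
    ("doc3", 42, (0.1, 0.1, 0.8)),  # close to "beach", but colors do not match
]


def text_near_image(term, query_hist, within=10, min_similarity=0.8):
    """Return documents where `term` occurs within `within` positions of an
    image whose histogram is similar enough to the query image's histogram."""
    hits = set()
    for doc_id, word_pos in TEXT_POSTINGS.get(term, []):
        for img_doc, img_pos, hist in IMAGE_POSTINGS:
            if (
                img_doc == doc_id
                and abs(img_pos - word_pos) <= within
                and histogram_intersection(query_hist, hist) >= min_similarity
            ):
                hits.add(doc_id)
    return sorted(hits)


query_sunset = (0.65, 0.25, 0.10)  # histogram of the query image
result = text_near_image("beach", query_sunset)
```

With this sample data only `doc1` satisfies both the proximity condition and the image similarity condition, which is the kind of combined test that separate text and image indexes cannot express.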
In one embodiment, the present invention provides a software product which performs the indexing and retrieval tasks, and which stores the multimedia index. The indexing modules of the software product preferably provide a multistage process in which the various multimedia components of a compound document are identified as to their type
Anderson Christopher H.
David Mark R.
Gardner Paul C.
Nelson Paul E.
Whitman Ronald M.
Excalibur Technologies Corp.
Feild Joseph H
Fenwick & West LLP