Classification of retrievable documents according to types...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

06505195

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method of and an apparatus for retrieving documents matching an indicated condition from a large number of documents.
2. Description of the Related Art
According to one conventional document retrieval process, documents that contain all or some of the entered keywords are retrieved from a large number of documents. This process is offered as a service for retrieving documents available on the Internet or through personal computer communication services, and also as software for retrieving documents stored on a hard disk. However, entering a keyword or keywords as the retrieval condition is not effective enough to narrow a large number of documents down to only those the user wants, and the retrieved documents tend to include many that match the condition but do not meet the user's needs. Although some Internet document retrieval services allow the user to add a keyword or keywords to further narrow down the retrieved documents, they fail to completely eliminate unwanted documents.
To solve the above problems, there have been proposed processes for classifying retrieved documents according to factors other than keywords and presenting the classified documents to the user. For example, Japanese laid-open patent publications Nos. 8-235160 and 9-231238 disclose processes for classifying retrieved documents.
Specifically, Japanese laid-open patent publication No. 8-235160 discloses a method of and an apparatus for retrieving documents. According to the disclosed method and apparatus, if the number of retrieved documents is greater than a preset value, the retrieved documents are classified according to attribute data such as document names, document registration dates, etc. assigned to the documents, and the classified documents are presented to the user.
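The behavior described for this related-art method can be illustrated with a minimal sketch. It is not taken from the cited publication: the function name `present_results`, the `preset_value` parameter, and the dictionary layout of a document are all assumptions made for illustration. It shows only the stated idea, namely that results are grouped by a pre-assigned attribute once their number exceeds a preset value.

```python
from collections import defaultdict

def present_results(results, preset_value, attribute="registration_date"):
    """Group results by a pre-assigned attribute when there are too many.

    `results` is a list of dicts holding attribute data assigned in advance
    (hypothetical layout). At or below `preset_value`, results are returned
    in a single group.
    """
    if len(results) <= preset_value:
        return {"all": list(results)}
    groups = defaultdict(list)
    for doc in results:
        groups[doc[attribute]].append(doc["name"])
    return dict(groups)

results = [
    {"name": "report-a", "registration_date": "1998-05"},
    {"name": "report-b", "registration_date": "1998-06"},
    {"name": "report-c", "registration_date": "1998-05"},
]
grouped = present_results(results, preset_value=2)
```

Note that the grouping key must already exist on every document, which is the limitation the specification goes on to discuss.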
Japanese laid-open patent publication No. 9-231238 discloses a method of and an apparatus for displaying retrieved texts. According to the disclosed method and apparatus, the subjects of retrieved texts are analyzed and divided into a plurality of groups, so that the texts are classified and displayed.
A process for classifying a plurality of documents, disclosed in Japanese laid-open patent publication No. 10-320411, extracts keywords with 5W1H attributes from documents, and classifies the documents into a two-dimensional matrix with the extracted keywords with 5W1H attributes.
However, the above document retrieving processes often fail to narrow documents down to suitable documents for the user or to provide suitably classified documents.
For example, assume that a user who wishes to stay in “X hotel” tries to retrieve documents containing the keyword “X hotel” in order to obtain the information necessary to stay there. The information required by the user includes the contact information and the address of “X hotel”, and the documents required by the user are those containing that information. However, the condition that the keyword “X hotel” be included in documents is not, by itself, specific enough to narrow a large number of documents down to only those which contain the contact information and the address of “X hotel”. For example, documents retrieved under the above condition may include a news report that a new product has been presented in the X hotel, or a diary-like Web document stating that someone enjoyed a dinner at a restaurant in the X hotel, though these documents are not required by the user. Since the condition that the contact information and the address be included in documents cannot be expressed by keywords, it is impossible to limit the retrieved documents and exclude unwanted documents by adding a keyword or keywords.
With the method of and the apparatus for retrieving documents disclosed in Japanese laid-open patent publication No. 8-235160, retrieved documents can be classified according to attributes assigned to the documents. Therefore, the attributes necessary to classify documents need to be assigned to the documents in advance. Unless the presence of the contact information and the address is recorded as an attribute of each document, the retrieved documents cannot be classified into documents with the contact information and the address and documents without them. In particular, it is difficult for the disclosed system to deal with Web documents available on the Internet.
According to the disclosed method and apparatus of Japanese laid-open patent publication No. 9-231238, the retrieved texts are classified according to their subjects into texts whose subjects relate to the contact information and the address and texts whose subjects do not. However, some texts whose subjects do not relate to the contact information and the address may nevertheless contain the contact information and the address in their bodies. For example, a news item whose subject reports that the X hotel has added a new annex may contain the contact information and the address in its body. Therefore, the disclosed classification principle may not necessarily be effective in classifying retrieved documents into those required by the user and those not required by the user.
An apparatus for and a method of classifying documents and a recording medium which stores a program for classifying documents, as disclosed in Japanese laid-open patent publication No. 10-320411, are capable of classifying documents with keywords with 5W1H attributes extracted from the documents. However, the type of 5W1H as a key for classification needs to be indicated by the user each time documents are to be classified. Furthermore, since documents are classified according to the unit of 5W1H, they cannot be classified according to smaller units including address, nearby station, telephone number, and e-mail address.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a method of and an apparatus for easily retrieving documents that are required by the user.
A document retrieval apparatus according to a first aspect of the present invention classifies retrieved documents based on whether documents contain attribute elements representing specific contents related to certain attributes (concepts), and classifies documents containing attribute elements related to the certain attributes according to types of the certain attributes. The attribute elements represent elements which specifically indicate the contents of certain attributes, such as address, telephone number, nearby station, price, date, time, e-mail address, URL, company name, product name, type number, in the documents. For example, an attribute element representing an attribute of address is “Chiyoda ward, Tokyo metropolis”, and an attribute element representing an attribute of price is “12,000 yen”.
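As a rough illustration of what detecting an attribute element could look like, the sketch below scans text for a few of the attribute types named above. The specification, as quoted here, does not say how attribute elements are recognized; the regular expressions, the name `ATTRIBUTE_PATTERNS`, and the function `find_attribute_elements` are assumptions introduced only for this example and cover far fewer formats than a real system would need.

```python
import re

# Hypothetical recognizers, one per attribute type. These patterns are
# illustrative assumptions, not the recognition method of the invention.
ATTRIBUTE_PATTERNS = {
    "price": re.compile(r"\b\d{1,3}(?:,\d{3})*\s*yen\b"),
    "telephone number": re.compile(r"\b\d{2,4}-\d{2,4}-\d{4}\b"),
    "e-mail address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "URL": re.compile(r"https?://\S+"),
}

def find_attribute_elements(text):
    """Return {attribute type: [attribute elements]} for types found in text."""
    found = {}
    for attribute, pattern in ATTRIBUTE_PATTERNS.items():
        elements = pattern.findall(text)
        if elements:
            found[attribute] = elements
    return found

rates = find_attribute_elements("A single room at X hotel costs 12,000 yen per night.")
contact = find_attribute_elements("For reservations at X hotel, call 03-1234-5678.")
diary = find_attribute_elements("We enjoyed a dinner at a restaurant in the X hotel.")
```

Here the keyword “X hotel” appears in all three texts, but only the first two contain an attribute element, which is the distinction that keyword matching alone cannot express.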
Specifically, the document retrieval apparatus has a classification attribute storage storing only types of indicated attributes, among a plurality of types of attributes that can be used to classify documents, an attribute analyzing means for analyzing each of the retrieved documents to determine whether an attribute element belonging to the types of attributes stored in the classification attribute storage is contained in the document or not, and an attribute classifying means for classifying each of the retrieved documents such that documents containing the same type of attribute elements fall in the same category and documents containing no attribute elements fall in an independent category.
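The three components named above can be sketched as follows, under stated assumptions. The list `classification_attributes` stands in for the classification attribute storage, `analyze` for the attribute analyzing means, and `classify` for the attribute classifying means; these names, the regular expressions, and the category label for documents without attribute elements are hypothetical. The text does not say how a document containing several attribute types is handled, so this sketch assumes it is placed in every matching category.

```python
import re
from collections import defaultdict

# Hypothetical recognizers for all attribute types the system could use.
ALL_PATTERNS = {
    "price": re.compile(r"\b\d{1,3}(?:,\d{3})*\s*yen\b"),
    "telephone number": re.compile(r"\b\d{2,4}-\d{2,4}-\d{4}\b"),
}

NO_ELEMENTS = "no attribute elements"  # label for the independent category

def analyze(text, classification_attributes):
    """Return the stored attribute types whose elements appear in text."""
    return [a for a in classification_attributes if ALL_PATTERNS[a].search(text)]

def classify(documents, classification_attributes):
    """Group document ids by attribute type; unmatched ids go to NO_ELEMENTS."""
    categories = defaultdict(list)
    for doc_id, text in documents.items():
        matched = analyze(text, classification_attributes)
        # Assumption: a document with several types joins each category.
        for attribute in matched or [NO_ELEMENTS]:
            categories[attribute].append(doc_id)
    return dict(categories)

documents = {
    "guide": "X hotel, reservations: 03-1234-5678.",
    "news": "A new product was presented at X hotel.",
    "rates": "X hotel single room: 12,000 yen. Call 03-1234-5678.",
}
by_both = classify(documents, ["telephone number", "price"])
by_price = classify(documents, ["price"])
```

Because only the indicated types are consulted, narrowing the stored list to price alone moves the "guide" document into the category of documents without attribute elements, even though it contains a telephone number.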
The attribute classifying means analyzes each of the retrieved documents, and sends information indicating which one of the types of attributes stored in the classification attribute sto

Profile ID: LFUS-PAI-O-3040165
