Document processing system and recording medium

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

active

06523025

ABSTRACT:

BACKGROUND OF THE INVENTION
(1) Field of the Invention
The present invention relates to a document processing system for storing input documents after subjecting the documents to a predetermined process and for retrieving or clipping documents matching a given query from the stored documents, and to a recording medium recording a program for causing a computer to perform such processes.
(2) Description of the Related Art
With recent popularization of the Internet and an increasing number of full-text databases, information available to individuals is drastically expanding.
To acquire desired information from such a vast amount of information, a retrieval process, clipping process or the like is generally performed using, as a key, search terms (a query) describing features of the data to be obtained.
With conventional large-scale commercial on-line databases or full-text retrieval systems, however, if the condition of search terms is loosened, noise (unneeded data) included in the search results increases; conversely, if the search condition is narrowed, search omission may result, giving rise to a problem that it is difficult for the user to acquire desired data.
Specifically, the document culling (narrowing) process or document retrieval process adopted in conventional document filtering performs, at best, ranked retrieval based on the degree of coincidence or relevancy between the query and the document contents. It is therefore difficult to carry out document culling that fully reflects the importance of the information included in documents or the user's purpose in performing the search.
Consequently, even when the user wishes to search for an organization named “Hashimoto”, for example, documents that include “Hashimoto” as a place name are very often retrieved.
Also, when new products priced in the 200,000 to 299,999 yen range are to be searched for, it is necessary to use a query that takes account of every possible expression, such as “two hundred thousand yen”, “200,000 yen”, “two hundred ten thousand yen” and “two hundred fifty thousand yen”.
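The burden described here comes from matching on surface strings. A minimal sketch of the alternative, assuming price expressions are first normalized to a numeric value so that a range condition becomes a single comparison, is given below. The names `words_to_int`, `parse_yen` and `in_range`, and the small word tables, are hypothetical and cover only the kinds of expressions quoted above; they are not taken from the patent.

```python
import re

# Hypothetical word tables: only enough vocabulary for simple examples.
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_int(words):
    """Convert a list of English number words to an integer."""
    total = 0    # value of completed thousand/million groups
    current = 0  # value of the group currently being built
    for w in words:
        if w in UNITS:
            current += UNITS[w]
        elif w in TENS:
            current += TENS[w]
        elif w == "hundred":
            current *= 100
        elif w in SCALES:
            total += current * SCALES[w]
            current = 0
        else:
            raise ValueError(f"unknown number word: {w}")
    return total + current


def parse_yen(text):
    """Return the yen amount expressed by text, or None if it is not a price."""
    m = re.fullmatch(r"\s*(\d[\d,]*)\s*yen\s*", text)
    if m:  # digit form such as "200,000 yen"
        return int(m.group(1).replace(",", ""))
    m = re.fullmatch(r"\s*([a-z\s-]+?)\s+yen\s*", text.lower())
    if m:  # word form such as "two hundred ten thousand yen"
        try:
            return words_to_int(m.group(1).replace("-", " ").split())
        except ValueError:
            return None
    return None


def in_range(text, low=200_000, high=299_999):
    """True if text is a yen price within [low, high]."""
    value = parse_yen(text)
    return value is not None and low <= value <= high
```

With such a normalization step, the four expressions quoted above all satisfy the single condition `in_range(...)`, whereas “three hundred thousand yen” does not.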
Further, although it is possible to search for documents by specifying a document creation date, date information included in documents cannot be utilized for search.
In the following sentences, for example, “the 1st” means different days, though the words used are the same.
(a) On the 1st, Corporation A will release Product B.
(b) On the 1st, Corporation A released Product B.
If the sentences were created on Feb. 15, 1997, “1st” means Mar. 1, 1997 in the case of (a), and means Feb. 1, 1997 in the case of (b).
The conventional method thus has the problem that it is difficult to recognize the attributes of date information in documents and to use such information for search.
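The reasoning in sentences (a) and (b) can be sketched as follows, assuming that the tense of the sentence is already known and that a bare day of the month refers to the nearest such day on or after the creation date (future tense) or on or before it (past tense). The function names `detect_tense` and `resolve_day` are hypothetical, and `detect_tense` is a deliberately naive stand-in for real linguistic analysis.

```python
import re
from datetime import date


def detect_tense(sentence):
    """Naive stand-in: treat sentences containing 'will' as future tense."""
    return "future" if re.search(r"\bwill\b", sentence) else "past"


def _shift_month(year, month, step):
    """Move one month forward (step=1) or backward (step=-1)."""
    month += step
    if month > 12:
        return year + 1, 1
    if month < 1:
        return year - 1, 12
    return year, month


def resolve_day(day, created, tense):
    """Resolve a bare day-of-month against the document creation date."""
    if tense not in ("future", "past"):
        raise ValueError("tense must be 'future' or 'past'")
    step = 1 if tense == "future" else -1
    year, month = created.year, created.month
    # The current month plus at most two neighbours always contains the day.
    for _ in range(4):
        try:
            candidate = date(year, month, day)
        except ValueError:
            candidate = None  # this month has no such day, e.g. Feb. 30
        if candidate is not None:
            if tense == "future" and candidate >= created:
                return candidate
            if tense == "past" and candidate <= created:
                return candidate
        year, month = _shift_month(year, month, step)
    raise ValueError(f"cannot resolve day {day} near {created}")


created = date(1997, 2, 15)
a = "On the 1st, Corporation A will release Product B."
b = "On the 1st, Corporation A released Product B."
date_a = resolve_day(1, created, detect_tense(a))  # Mar. 1, 1997
date_b = resolve_day(1, created, detect_tense(b))  # Feb. 1, 1997
```

Once each mention is resolved to an absolute date in this way, the date can be stored as an attribute value and compared directly with a date given in a query.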
SUMMARY OF THE INVENTION
The present invention was created in view of the above circumstances, and an object thereof is to provide a document processing system capable of performing document retrieval or document culling that fully reflects the user's purpose of performing search.
It is another object of the present invention to provide a recording medium recording a document processing program for performing a document retrieval process or clipping process that fully reflects the user's purpose of performing search.
FIG. 1 illustrates the principles of the present invention for achieving the above objects. The present invention provides a document processing system for storing input documents after subjecting the documents to a predetermined process and for retrieving or clipping documents matching a given query from the stored documents, the system comprising knowledge information storing means 3, event specifying means 4, attribute value extracting means 5, correlating means 10, document storing means 11, and document extracting means 12.
The knowledge information storing means 3 stores knowledge information necessary for processing an input document. The event specifying means 4 specifies the type of an event described in the input document by looking up the knowledge information stored in the knowledge information storing means 3. The attribute value extracting means 5 extracts, from the input document, attribute values of attributes relating to the event specified by the event specifying means 4 by looking up the knowledge information stored in the knowledge information storing means 3. The correlating means 10 correlates the attribute values extracted by the attribute value extracting means 5 with entities in the real world by looking up the knowledge information stored in the knowledge information storing means 3. The document storing means 11 stores the attribute values correlated by the correlating means 10 and the input document or information specifying a storage location thereof in a manner associated with each other. The document extracting means 12 looks up the attribute values and a query to retrieve or clip target documents.
The knowledge information storing means 3 stores events, attributes relating thereto, and information for extracting attribute values constituting the attributes, in a manner associated with one another. The event specifying means 4 collates an input document with the knowledge information stored in the knowledge information storing means 3, to thereby specify an event described in the document. The attribute value extracting means 5 refers to the knowledge information storing means 3 and extracts attribute values of attributes relating to the specified event from the document. The correlating means 10 correlates the extracted attribute values with entities in the real world into one-to-one correspondence by looking up the knowledge information stored in the knowledge information storing means 3. The document storing means 11 stores the thus-correlated attribute values and the document or information specifying a storage location thereof in a manner associated with each other. The document extracting means 12 collates information included in an input query with the attribute values stored in the document storing means 11, to extract desired documents.
Thus, the contents of each document are grasped in terms of events. Attribute values of the attributes constituting each grasped event are extracted and correlated with entities in the real world, and the information thus generated is looked up to retrieve or clip documents, whereby the retrieval or clipping accuracy can be improved.
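The flow through means 3, 4, 5, 10, 11 and 12 can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `KNOWLEDGE` and `ENTITIES` tables, the trigger words, the regular expressions, the entity identifiers and the class `DocumentStore` are hypothetical and far simpler than any practical knowledge information; the patent text does not prescribe this representation.

```python
import re

# Knowledge information (means 3): events, their attributes, and
# patterns for extracting attribute values. All entries are hypothetical.
KNOWLEDGE = {
    "product_release": {
        "triggers": ("released", "will release"),
        "attributes": {
            "organization": re.compile(r"\b([A-Z]\w*) Corp\b"),
            "product": re.compile(r"\b(Product [A-Z]\w*)"),
            "place": re.compile(r"\bin ([A-Z]\w*)"),
        },
    },
}

# Hypothetical table mapping (attribute, surface form) to a real-world entity.
ENTITIES = {
    ("organization", "Hashimoto"): "org:hashimoto",
    ("organization", "Acme"): "org:acme",
    ("place", "Hashimoto"): "place:hashimoto",
    ("product", "Product B"): "prod:b",
}


def specify_event(text):
    """Means 4: pick the event whose trigger words appear in the text."""
    for event, info in KNOWLEDGE.items():
        if any(trigger in text for trigger in info["triggers"]):
            return event
    return None


def extract_attributes(text, event):
    """Means 5: extract attribute values for the specified event."""
    found = {}
    for name, pattern in KNOWLEDGE[event]["attributes"].items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found


def correlate(attrs):
    """Means 10: map extracted values to entity IDs; unknown values are skipped."""
    return {name: ENTITIES[(name, value)]
            for name, value in attrs.items() if (name, value) in ENTITIES}


class DocumentStore:
    """Means 11 (storage) and means 12 (extraction by query)."""

    def __init__(self):
        self.records = []  # (storage location, event, correlated attributes)

    def add(self, location, text):
        event = specify_event(text)
        if event is not None:
            attrs = correlate(extract_attributes(text, event))
            self.records.append((location, event, attrs))

    def find(self, event, **wanted):
        return [loc for loc, ev, attrs in self.records
                if ev == event
                and all(attrs.get(k) == v for k, v in wanted.items())]


store = DocumentStore()
store.add("doc1.txt", "Hashimoto Corp released Product B.")
store.add("doc2.txt", "Acme Corp released Product C in Hashimoto.")
by_org = store.find("product_release", organization="org:hashimoto")
by_place = store.find("product_release", place="place:hashimoto")
```

A plain keyword search for “Hashimoto” would return both documents. Because the query here is collated with correlated attribute values, the organization and the place of the same name are kept apart: `by_org` contains only `doc1.txt` and `by_place` only `doc2.txt`.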

