Document retrieval having retrieval conditions that shuffles...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C707S793000, C709S201000

active

06424963

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a document retrieval method for sequencing documents according to the goodness of fit to the retrieval condition, and issuing the retrieval results according to this sequence, a recording medium in which its program is recorded, and a document retrieval apparatus, and more particularly to a document retrieval method capable of judging the relation between the retrieval condition and retrieval result easily, a recording medium in which its program is recorded, and a document retrieval apparatus.
BACKGROUND OF THE INVENTION
Recently, as a huge quantity of electronic document information such as electronic mail, electronic catalogs, and electronic publications has begun to circulate, there is mounting interest in document retrieval methods and document retrieval apparatuses capable of retrieving only a desired document from such electronic document information.
Many document retrieval methods and apparatuses of this kind have been proposed that sequence the retrieval results by making use of information on the frequency of occurrence of characters or symbols (hereinafter called words). In a conventional document retrieval method making use of word occurrence information, the evaluation value is set higher for words occurring often in a certain document and lower for words that also occur in many other documents, and the documents are sequenced according to this index.
For example, in a conventional document retrieval method, as a standard index for calculating the word evaluation value ev, the following formula is used.
ev = log(N/df)  [Formula 1]
where N is the total number of documents, and df is the number of documents in which the word of notice (the word to be retrieved or retrieval word) occurs.
In this case, taking the logarithm to base 10, if the total number of documents N is 1000 and the number of documents containing the retrieval word X is 10, the evaluation value evx of the retrieval word X is evx = log(1000/10) = 2.0; if the number of documents containing the retrieval word Y is 100, the evaluation value evy of the retrieval word Y is evy = log(1000/100) = 1.0.
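As a quick check of Formula 1, the following minimal Python sketch reproduces the two values above. It assumes a base-10 logarithm, which is what the worked numbers imply; the function name `word_evaluation_value` is illustrative and does not come from the patent.

```python
import math


def word_evaluation_value(total_docs: int, doc_freq: int) -> float:
    """Formula 1: ev = log(N / df), assuming a base-10 logarithm."""
    if doc_freq <= 0 or doc_freq > total_docs:
        raise ValueError("df must satisfy 0 < df <= N")
    return math.log10(total_docs / doc_freq)


# Values from the example: N = 1000, df = 10 for word X, df = 100 for word Y.
ev_x = word_evaluation_value(1000, 10)
ev_y = word_evaluation_value(1000, 100)
print(ev_x, ev_y)
```

A word that occurs in fewer documents (X) receives the larger evaluation value, which is the behavior the index is meant to capture.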
The evaluation value Ev of each document is, over the set of all retrieval words, the sum of the products of the evaluation value ev of each retrieval word and the frequency of occurrence of that retrieval word in the document. That is, supposing the frequency of occurrence of a retrieval word in the document to be tf, the evaluation value Ev of the document is expressed by the following formula.
Ev = Σ{tf × ev} = Σ{tf × log(N/df)}  [Formula 2]
For example, the evaluation values EvA and EvB of document A and document B for retrieval word X and retrieval word Y are calculated as follows. First, the frequency of occurrence tf of retrieval word X and retrieval word Y in document A and document B is determined. Herein, in document A, the frequencies of occurrence tfAX and tfAY of retrieval word X and retrieval word Y are respectively tfAX=10 and tfAY=5, and in document B, the frequencies of occurrence tfBX and tfBY of retrieval word X and retrieval word Y are respectively tfBX=5 and tfBY=10. In this case, from Formula 2, the evaluation value EvA of document A and the evaluation value EvB of document B are calculated respectively as follows.
EvA=10×2.0+5×1.0=25.0
EvB=5×2.0+10×1.0=20.0  [Formula 3]
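The calculation of Formula 2 and the resulting sequencing can be sketched in Python as below. This is only an illustration using the numbers from the example; the base-10 logarithm is an assumption implied by those numbers, and the names `document_evaluation_value`, `doc_freq`, and `term_freq` are hypothetical.

```python
import math

N = 1000                          # total number of documents
doc_freq = {"X": 10, "Y": 100}    # df: documents containing each retrieval word
term_freq = {                     # tf: occurrences of each word in each document
    "A": {"X": 10, "Y": 5},
    "B": {"X": 5, "Y": 10},
}


def document_evaluation_value(tf_by_word, df_by_word, total_docs):
    """Formula 2: Ev = sum over retrieval words of tf * log10(N / df)."""
    return sum(
        tf * math.log10(total_docs / df_by_word[word])
        for word, tf in tf_by_word.items()
    )


scores = {
    doc: document_evaluation_value(tf, doc_freq, N)
    for doc, tf in term_freq.items()
}

# Sequence the documents from highest to lowest evaluation value.
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

With these inputs document A is sequenced ahead of document B, matching the values in Formula 3.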
Thus, in the conventional document retrieval method, the words occurring in the retrieval condition are in most cases used as the words of notice (retrieval words) when calculating the evaluation value Ev of a document. That is, according to the conventional document retrieval method, the retrieval results are sequenced on the basis of the evaluation value Ev of each document obtained in this manner.
However, in the conventional document retrieval method, since the document retrieval results are sequenced by integrating the information about the frequency of occurrence of retrieval word in the retrieval condition, it is hard to distinguish the individual effects of each retrieval word in the document retrieval results.
In particular, if a retrieval result conforming to the purpose of retrieval is not obtained, it is necessary to retrieve again after revising the retrieval condition (retrieval word, etc.). In that case, it is hard for the user to understand how such a revision affects the sequencing of the retrieval results.
SUMMARY OF THE INVENTION
The invention is devised in the light of the above background, and it is hence an object thereof to present a document retrieval method allowing the user to judge easily the validity of retrieval condition such as retrieval word and effects of retrieval condition on the retrieval result, so that the user can improve the efficiency of retrieval process, and a recording medium in which its program is recorded, and a document retrieval apparatus.
To solve the problems, a first aspect of a document retrieval method of the invention is a document retrieval method for retrieving a set of documents composed of plural documents according to an entered retrieval condition, comprising the steps of retrieving each document included in the set of retrieval object documents according to the entered retrieval condition, sequencing each document depending on the goodness of fit to the retrieval condition, and acquiring the retrieval result by shuffling the documents in the sequence by occurrence, designating a specific set of sample documents and a specific sample document included in the specific set of sample documents, detecting the sequence by occurrence in the retrieval result in each designated specific sample document according to the retrieval result, and calculating the occurrence distribution of the retrieval condition relating to the set of sample documents including the specific sample document according to the sequence by occurrence in each specific sample document.
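One possible reading of the first aspect can be sketched in Python as follows. This is an interpretation, not the patented implementation: it assumes that the "sequence by occurrence" of a document is its 1-based rank position in the sorted retrieval result, and that the "occurrence distribution" is a histogram of the sample documents' ranks over fixed-width rank bins. All names (`rank_documents`, `rank_distribution`, `bin_width`) and all scores are hypothetical.

```python
from collections import Counter


def rank_documents(scores):
    """Map each document id to its 1-based rank, best score first.

    Ties are broken by document id so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda doc: (-scores[doc], doc))
    return {doc: pos for pos, doc in enumerate(ordered, start=1)}


def rank_distribution(ranks, sample_docs, bin_width):
    """Count how many sample documents fall into each rank bin.

    Bin 0 covers ranks 1..bin_width, bin 1 the next bin_width ranks, and so on.
    Sample documents absent from the retrieval result are skipped.
    """
    return Counter(
        (ranks[doc] - 1) // bin_width for doc in sample_docs if doc in ranks
    )


# Hypothetical evaluation values for six retrieved documents.
scores = {"d1": 25.0, "d2": 20.0, "d3": 12.5, "d4": 7.0, "d5": 3.0, "d6": 1.0}
sample_set = ["d1", "d3", "d6"]   # designated set of sample documents

ranks = rank_documents(scores)
distribution = rank_distribution(ranks, sample_set, bin_width=2)
print(ranks, dict(distribution))
```

Under this reading, a distribution concentrated in the lowest-numbered bins would suggest that the retrieval condition places the sample set near the top of the result, while a spread-out distribution would suggest a weaker fit.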
To solve the problems, a second aspect of a document retrieval method of the invention comprises the steps of retrieving each document included in the set of retrieval object documents according to the entered retrieval condition, sequencing each document depending on the goodness of fit to the retrieval condition, and acquiring the retrieval result by shuffling the documents in the sequence by occurrence, generating a table of sets of sample documents designating the relation of a specific set of sample documents and a specific sample document according to the acquired retrieval result, detecting the sequence by occurrence in the retrieval result in each specific sample document designated in the table of sets of sample documents according to the retrieval result, and calculating the occurrence distribution of the retrieval condition relating to the set of sample documents including the specific sample document according to the sequence by occurrence in each specific sample document.
To solve the problems, a third aspect of a document retrieval method of the invention comprises the steps of retrieving each document included in the set of retrieval object documents according to the entered retrieval condition, sequencing each document depending on the goodness of fit to the retrieval condition, and acquiring the retrieval result by shuffling the documents in the sequence by occurrence, subdividing the entered retrieval condition, and generating a divided retrieval condition by arbitrarily combining the retrieval conditions in the subdivided units, designating a specific set of sample documents and a specific sample document included in the specific set of sample documents according to the divided retrieval condition and the retrieval result, detecting the sequence by occurrence in the retrie
