Full-text index producing device for producing a full-text...


Filed: 1999-03-02
Issued: 2001-02-13
Examiner: Black, Thomas G. (Department: 2771)
Classification: Data processing: database and file management or data structures
Subclass: Database design
Subclass: Data structure types
Class codes: C707S793000, C382S177000
Type: Reexamination Certificate
Status: active
Patent number: 06189006
ABSTRACT:

BACKGROUND OF THE INVENTION
This invention relates to a full-text data base retrieving device which stores a plurality of texts (character code sequences) as a full-text data base and retrieves a text from the full-text data base on the basis of a retrieving condition such as key words, and relates to a full-text index producing device for producing a complementary file (full-text index) which is used in retrieval.
In order to retrieve a text from a full-text data base at a high speed, a complementary file (full-text index) is produced in association with the full-text data base and is referred to on retrieving the text from the full-text data base. In general, the full-text index is of any one of first through fifth types. The full-text indexes of the first through the fifth types may be called first through fifth type indexes, respectively.
A single word is used as a key in the first type index. A character sequence having a predetermined length is used as the key in the second type index. A character sequence having a same character sort is used as the key in the third type index. The single word and the character sequence are used as the key in both the fourth and the fifth type indexes. In the retrieval examples described hereinunder, an index file having the fourth type index yields sets of texts, whereas an index file having the fifth type index yields text IDs together with locations in the text. A full-text index file may have a combination of any one of the first through the third type indexes and any one of the fourth and the fifth type indexes.
In retrieval of English text, use is often made of a full-text index file having the first and the fourth type indexes or the first and the fifth type indexes. A full-text index file of this type will be called a first full-text index file. In English text, words are separated by spaces. In Japanese text, on the other hand, words are written solid without spaces, so that it is necessary to divide the text into words with reference to a word dictionary in order to produce the first full-text index file. This process will be called a morphological analysis. A full-text data base retrieving device having the first full-text index file will be called a first full-text data base retrieving device. When a key word is given as a query, the first full-text data base retrieving device searches the key group of the first full-text index file for a word which is at least partially coincident with the key word. When a coincident key (word) exists in the first full-text index file, the device reads the text ID or the location in the text from the first full-text index file as a retrieval result.
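The word-keyed retrieval described above may be sketched as follows for English text. This is only an illustrative sketch and not the device of the invention: the function names `build_word_index` and `search_words` and the sample texts are hypothetical, and "at least partially coincident" is assumed here to mean that the key word occurs as a substring of an index key.

```python
from collections import defaultdict

def build_word_index(texts):
    """Map each lower-cased word (the key) to the set of IDs of texts containing it."""
    index = defaultdict(set)
    for text_id, text in texts.items():
        # English words are separated by spaces, so no dictionary is needed.
        for word in text.lower().split():
            index[word.strip(".,;:!?\"'()")].add(text_id)
    index.pop("", None)  # discard tokens that consisted only of punctuation
    return index

def search_words(index, key_word):
    """Return the IDs of texts having a key that contains key_word.

    Assumption: partial coincidence is treated as substring containment.
    """
    key_word = key_word.lower()
    hits = set()
    for key, text_ids in index.items():
        if key_word in key:
            hits |= text_ids
    return hits

# Hypothetical sample data base.
texts = {
    1: "Full-text retrieval uses an index.",
    2: "Indexes speed up retrieving.",
    3: "A word dictionary.",
}
index = build_word_index(texts)
print(search_words(index, "index"))       # text IDs 1 and 2
print(search_words(index, "dictionary"))  # text ID 3
```

In this sketch only text IDs are recorded for each key. Recording the location of each word as well would follow the same pattern with (text ID, location) pairs in place of bare text IDs.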
A full-text index file having the second and the fourth type indexes will be called a second full-text index file. A full-text data base retrieving device having the second full-text index file will be called a second full-text data base retrieving device. In the second full-text data base retrieving device, the character sequence of a key word is divided when the key word is given as a query. It will be assumed that the key word is “[…]” and that the predetermined length is equal to one. The second full-text data base retrieving device divides “[…]” into “[…]”, “[…]”, and “[…]”, each of which is a key character. The second full-text data base retrieving device searches the second full-text index file on the basis of each key character to obtain a set of texts each of which has “[…]”, a set of texts each of which has “[…]”, and a set of texts each of which has “[…]”. On the basis of these sets, the second full-text data base retrieving device obtains a set of texts each of which has “[…]”, “[…]”, and “[…]”.
It will be assumed that the key word is “[…]” and that the predetermined length is equal to two. The second full-text data base retrieving device divides “[…]” into “[…]” and “[…]”, each of which is a key character sequence. The second full-text data base retrieving device searches the second full-text index file on the basis of each key character sequence to obtain a set of texts each of which has “[…]” and a set of texts each of which has “[…]”. On the basis of these sets, the second full-text data base retrieving device obtains a set of texts each of which has “[…]” and “[…]”. The set of texts may include rubbish, that is, texts which do not actually contain the key word. More specifically, the three characters may not be arranged in the order of “[…]” even if the three characters “[…]”, “[…]”, and “[…]” are included in a text. For example, a text including the character sequence “ . . . […] . . . ” becomes rubbish. In order to remove the rubbish, it is necessary to carry out character string matching between the key word and each text of the retrieval result.
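The retrieval with the second full-text index file, including the occurrence and removal of the rubbish, may be sketched as follows. This is an illustrative sketch under stated assumptions: the names `build_ngram_index`, `candidates`, and `verify` are hypothetical, and the Latin key word "ABC" and the three sample texts are stand-ins chosen for illustration.

```python
from collections import defaultdict

def ngrams(s, n):
    """All character sequences of length n contained in s, in order."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def build_ngram_index(texts, n):
    """Map each length-n character sequence to the set of IDs of texts having it."""
    index = defaultdict(set)
    for text_id, text in texts.items():
        for gram in ngrams(text, n):
            index[gram].add(text_id)
    return index

def candidates(index, key_word, n):
    """Intersect the text sets of every length-n piece of the key word."""
    sets = [index.get(gram, set()) for gram in ngrams(key_word, n)]
    return set.intersection(*sets) if sets else set()

def verify(texts, text_ids, key_word):
    """Remove rubbish by character string matching against each candidate text."""
    return {text_id for text_id in text_ids if key_word in texts[text_id]}

# Hypothetical sample data base; "ABC" stands in for a three-character key word.
texts = {1: "xxABCxx", 2: "BC-AB", 3: "CBA"}

index1 = build_ngram_index(texts, 1)   # predetermined length equal to one
index2 = build_ngram_index(texts, 2)   # predetermined length equal to two

print(candidates(index1, "ABC", 1))    # texts 1, 2, and 3 all have A, B, and C
print(candidates(index2, "ABC", 2))    # texts 1 and 2 both have "AB" and "BC"
print(verify(texts, candidates(index2, "ABC", 2), "ABC"))  # only text 1 remains
```

Text 2 in this sketch has both "AB" and "BC" but not in the order "ABC", so it is rubbish which only the final string matching removes. A longer predetermined length reduces the rubbish but does not eliminate it.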
A full-text index file having the second and the fifth type indexes will be called a third full-text index file. A full-text data base retrieving device having the third full-text index file will be called a third full-text data base retrieving device. In the third full-text data base retrieving device, the character sequence of a key word is divided when the key word is given as a query. It will be assumed that the key word is “[…]” and that the predetermined length is equal to one. The third full-text data base retrieving device divides “[…]” into “[…]”, “[…]”, and “[…]”, each of which is a key character. The third full-text data base retrieving device searches the third full-text index file on the basis of each key character to obtain a set of text IDs and locations in the text for “[…]”, a set of text IDs and locations in the text for “[…]”, and a set of text IDs and locations in the text for “[…]”. The third full-text data base retrieving device combines the elements of these sets to obtain a location at which the three characters “[…]”, “[…]”, and “[…]” appear as the character sequence “[…]” in a same text.
It will be assumed that the key word is “[…]” and that the predetermined length is equal to two. The third full-text data base retrieving device divides “[…]” into “[…]” and “[…]”, each of which is a key character sequence. The third full-text data base retrieving device determines the location at which the character sequence “[…]” appears in a manner similar to that described above. The rubbish does not occur in the third full-text data base retrieving device.
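The combination of text IDs and locations in the third full-text data base retrieving device may be sketched as follows. This is an illustrative sketch only: the names `build_positional_index` and `find_occurrences` are hypothetical, and the Latin key word "ABC" and the sample texts are stand-ins chosen for illustration.

```python
from collections import defaultdict

def build_positional_index(texts, n):
    """Map each length-n character sequence to a list of (text ID, location) pairs."""
    index = defaultdict(list)
    for text_id, text in texts.items():
        for pos in range(len(text) - n + 1):
            index[text[pos:pos + n]].append((text_id, pos))
    return index

def find_occurrences(index, key_word, n):
    """Return (text ID, location) pairs at which key_word appears contiguously."""
    grams = [key_word[i:i + n] for i in range(len(key_word) - n + 1)]
    if not grams:
        return []
    # Start from every location of the first piece of the key word.
    hits = set(index.get(grams[0], []))
    # Keep a start location only if each later piece is found at the expected offset.
    for offset, gram in enumerate(grams[1:], start=1):
        postings = set(index.get(gram, []))
        hits = {(tid, pos) for (tid, pos) in hits if (tid, pos + offset) in postings}
    return sorted(hits)

# Hypothetical sample data base; "ABC" stands in for a three-character key word.
texts = {1: "xxABCxx", 2: "BC-AB", 3: "CBA"}

index1 = build_positional_index(texts, 1)   # predetermined length equal to one
index2 = build_positional_index(texts, 2)   # predetermined length equal to two

print(find_occurrences(index1, "ABC", 1))   # [(1, 2)]
print(find_occurrences(index2, "ABC", 2))   # [(1, 2)]
```

Because the locations must line up, texts 2 and 3 are rejected without any string matching against the texts themselves. The cost is a larger index, since a location is recorded for every occurrence of every key.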
A full-text index file having the third and the fourth type indexes will be called a fourth full-text index file. The fourth full-text index file uses, as a key character sequence, a character sequence obtained by dividing a text at each change in the sort of characters, such as Chinese characters (kanji), the Japanese cursive syllabary (hiragana), and the square Japanese syllabary (katakana). A full-text data base retrieving device having the fourth full-text index file will be called a fourth full-text data base retrieving device.
It will be assumed that the text is “[…]”. Each of “[…]”, “[…]”, “[…]”, “[…]”, “[…]”, and “[…]” becomes a key character sequence. The key word of the query is divided in a manner similar to that described above. The fourth full-text data base retrieving device searches the fourth index file on the basis of the divided key word. For example, the key word is divided into “[…]” and “[…]” when the key word is “[…]”. The fourth full-text data base retrieving device searches the fourth index file to obtain a text including “[…]” and “[…]”.
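The division of a text by character sort may be sketched as follows. This is an illustrative sketch under stated assumptions: the names `char_sort`, `split_by_sort`, `build_sort_index`, and `retrieve` are hypothetical, the sample texts and key word are chosen here for illustration, and the character sort is judged from the Unicode blocks for hiragana (U+3040 to U+309F), katakana (U+30A0 to U+30FF), and CJK unified ideographs (U+4E00 to U+9FFF) only.

```python
from collections import defaultdict
from itertools import groupby

def char_sort(ch):
    """Classify one character by its Unicode block (simplified assumption)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"

def split_by_sort(text):
    """Divide a text into runs of consecutive characters of a same sort."""
    return ["".join(group) for _, group in groupby(text, key=char_sort)]

def build_sort_index(texts):
    """Map each same-sort character sequence to the set of IDs of texts having it."""
    index = defaultdict(set)
    for text_id, text in texts.items():
        for key in split_by_sort(text):
            index[key].add(text_id)
    return index

def retrieve(index, key_word):
    """Divide the key word in the same manner and intersect the text sets."""
    sets = [index.get(key, set()) for key in split_by_sort(key_word)]
    return set.intersection(*sets) if sets else set()

# Hypothetical sample data base.
texts = {1: "全文データベースを検索する", 2: "データベースを作る"}
index = build_sort_index(texts)

print(split_by_sort(texts[1]))   # ['全文', 'データベース', 'を', '検索', 'する']
print(retrieve(index, "データベース検索"))   # {1}
```

In this sketch a key word piece must coincide exactly with an index key. A text is returned when it has all pieces of the key word, without regard to their order or adjacency in the text.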
A full-text index file having the third and the fifth type indexes may be called a fifth full-text index file. A full-text data base retrieving device having the fifth full-text index file will be called a fifth full-text data base retrieving device. The fifth full-text data base retrieving device is operable in a manner similar to the fourth full-text data base retrieving device.
The first full-text data base retrieving device must use the above-mentioned morphological analysis in producing the first full-text index file in the case of Japanese text. In this analysis, it is necessary to divide each text into words with reference to a word dictionary having from one hundred thousand to several hundred thousand words. Therefore, it takes a long time to produce the first full-text index file. Furthermore, some texts contain words which are not included in the word dictionary. As a result, it is difficult to analyze all of the texts with a high accuracy. Namely, it is d

Inventor: Fukushima Toshikazu
Examiner: Black Thomas G.
Corporate Assignee: NEC Corporation
Examiner: Rones Charles L.
Law Firm: Sughrue Mion Zinn Macpeak & Seas, PLLC
