Method and system for conducting a full text search on a...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C707S793000, C707S793000, C707S793000

Reexamination Certificate

active

06751624

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to systems and methods for conducting computer based text searching. The present invention relates more specifically to systems and methods for carrying out a text search from a server computer system on data and file information located on a client computer system.
2. Background Information
The Internet comprises a vast number of computers and computer networks that are interconnected through communication links. The interconnected computers exchange information using various services such as electronic mail, Gopher, FTP, and the World Wide Web. All of these technologies require a level of knowledge that is greater than that possessed by the average Internet user who might want to share information. Additionally, the Internet is in a constant state of fluctuation. FTP servers and web sites come and go. A person searching for a particular file using a web browser and any of the popular search engines (Lycos®, Yahoo®, Alta Vista®, etc.) can expect to have mixed to poor results because of such factors as stale links, ratio or account requirements on FTP sites, as well as unknown bandwidth availability on a given site.
Typically when a user wishes to share some of his files, he designs a web site and/or acquires and sets up an FTP server. Either of these tasks requires more expertise than the average Internet user possesses, therefore, much of what could be shared on the Internet is not.
When a user decides to look for a file on the Internet, he will typically use his web browser to contact a search engine. Since major search engines face the daunting task of trying to index every single web page and/or FTP site, the information they return will necessarily be aged and incomplete. Often a search of FTP servers will yield the location of a file and whether the FTP server will be online. The owner of the FTP site, however, will typically have further requirements, such as a user account, or he may require users to upload files before he will allow the user to download anything.
What is needed is an efficient method of creating a dynamic and constantly updated index of that information available on the Internet so that when a person conducts a search and locates information, the person knows that the information is immediately available.
SUMMARY OF THE INVENTION
In view of the above, the present invention is advantageous in that it provides a dynamic and constantly updated searchable index of information that is available on the Internet. To accomplish this, the disclosed invention provides a suffix array search system that allows the rapid searching of large amounts of information from large numbers of users while minimizing the required amount of bandwidth and minimizing the amount of utilized server system resources. The result is that the present invention enables a person searching the Internet to quickly locate and transfer available information.
The present invention may be summarized as a system and method for conducting a full text search on a client system by creating a full text search index of a string of characters on a client system for use on a server system. When a client system signs on to a server system, the client's system searches for relevant data and file information about that data which the user is willing to share and creates an original string of characters that contains file information such as file name, location, and size. The original string of characters is transformed using the Burrows-Wheeler transformation method. In the transformation, a rotation matrix is created of the original string of characters and the last column of the matrix is compressed using a standard compression method before being transmitted to the server system. The server system decompresses the data using the same standard decompression method. The transformation of the file information is reversed to recover the original string of characters. While recovering the original string of characters, a suffix array is created. The original string of characters and suffix array are stored in the memory of the server system. A binary search can be conducted of the suffix array to efficiently locate any sub-string of characters within the original string of characters.
A second client system signing on to the server system can initiate a search of the memory of the server system for a selected sub-string of characters. Once the selected sub-string of characters is found, the server system sends the second client system a list of the located relevant information (filename, location, size, user IP, user port, etc.). If the second user wants to obtain a copy of the data, a message is sent directly between the second client system and the first client system without the server system being involved unless the first client system is behind a firewall. When the first client system is behind a firewall, the request for the file is relayed through the server system. The requested data will then be transferred from the first client system to the second client system.
Each client system willing to share data that is signed on to the server system will have the original string of characters and the suffix array created for the client system and stored in the server system memory only while the client system is signed on to the server system. As soon as the client system signs off the server system, that client system's original string of characters and suffix array are deleted from the server system. This creation of a client system's original string of characters and suffix array only while the client system is signed on the server system enables the server system to have a dynamic and constantly updated index of data, which is available for transfer between client systems.
Other objects and advantages of the present invention will become apparent from the following description of the preferred embodiment with reference to the drawings.


REFERENCES:
patent: 5717393 (1998-02-01), Nakano et al.
patent: 6075470 (2000-06-01), Little et al.
patent: 6119120 (2000-09-01), Miller
patent: 6138129 (2000-10-01), Combs
patent: 6199064 (2001-03-01), Schindler
patent: 6363174 (2002-03-01), Lu
patent: 6493666 (2002-12-01), Wiese, Jr.
patent: 6526401 (2003-02-01), Ito
N. Jesper Larsson, “The Context Trees of Block Sorting Compression,” Proceedings of the IEEE Data Compression Conference, Mar. 1998, pp. 189-198.
Kunihiko Sadakane, “A Fast Algorithm For Making Suffix Arrays And For Burrows-Wheeler Transformation,” Proceedings of IEEE Data Compression Conference, Mar. 1998, pp. 129-138.
Kunihiko Sadakane and Hiroshi Imai, “A Cooperative Distributed Text Database Management Method Unifying Search And Compression Based On The Burrows-Wheeler Transformation,” Advances in Database Technologies, No. 1552 in LNCS, 1999, pp. 434-445.
Ferragina, et al., “An Experimental Study of A Compressed Index”,Proc. 12thACM-SIAM Symposium on Discrete Algorithms(SODA), 2001 [Abstract Only].
Ferragina, et al., “Opportunistic Data Structures With Applications”,Proc. 14thIEEE Symposium on Foundation of Computer Science(FOCS), 2000 [Abstract Only].
Ferragina, et al., “Full-Text Index in Minute Space”,FM Index, 2000.
Nelson, “Data Compression With The Burrows-Wheeler Transform”,Dr. Dobb's Journal, Sep. 1996 [Best Available Copy].
Burrows, et al., “A Block-Sorting Lossless Data Compression Algorithm”,Digital SRC Report, Technical Report 124, May 10, 1994.
Sadakane, “A Modified Burrows-Wheeler Transformation For Case-Insensitive Search With Application To Suffix Array Compression”,Department of Information Science, University of Tokyo. [Abstract & References, Date Unknown].

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and system for conducting a full text search on a... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method and system for conducting a full text search on a..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and system for conducting a full text search on a... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3347252

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.