Method and system for searching index databases

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

C707S793000

active

06775666

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to computer search mechanisms, and more particularly to computer searching mechanisms that search indexes.
BACKGROUND OF THE INVENTION
Search engines are remote access programs that enable users to search for documents from a body of information (e.g., a database of documents or the Internet). Typically, a search engine searches a database for specific key words and retrieves a list of documents that contain the key words. Search engines can use algorithms to create indexes such that, ideally, only meaningful results are returned for each query. The indexes are arrangements or outlines of topics listed in a rational order.
Search engines commonly support multiple query styles. For example, topic-based queries (e.g., “Mexico”), topic/subtopic queries (e.g., “Mexico Cancun”), and Boolean queries (e.g., “Mexico OR Cancun”) are commonly used. Savvy users can build their own Boolean queries and can also use quotation marks to build literal strings with spaces. Search engines often interpret these query styles ineffectively, retrieving documents that are too broad for the user's purpose or irrelevant to the user. In addition, the topic/subtopic query can be unpredictable because some search engines will do an “AND” search, and some search engines will do an “OR” search.
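The unpredictability of the topic/subtopic query can be illustrated with a minimal sketch. The three-document corpus and the `search` function below are hypothetical and are not part of the described system; they only show how the same query returns different results under an “AND” and an “OR” interpretation.

```python
# Hypothetical three-document corpus; the texts are invented for illustration.
DOCS = {
    1: "Mexico travel guide",
    2: "Cancun beaches in Mexico",
    3: "Cancun hotel prices",
}


def tokenize(text):
    """Lowercase a string and split it on whitespace into a set of words."""
    return set(text.lower().split())


def search(query, mode):
    """Return sorted ids of documents matching the query under "AND" or "OR"."""
    terms = tokenize(query)
    hits = []
    for doc_id, text in DOCS.items():
        words = tokenize(text)
        if mode == "AND":
            matched = terms <= words        # every query term must appear
        elif mode == "OR":
            matched = bool(terms & words)   # at least one query term appears
        else:
            raise ValueError("mode must be 'AND' or 'OR'")
        if matched:
            hits.append(doc_id)
    return sorted(hits)


print(search("Mexico Cancun", "AND"))  # [2]
print(search("Mexico Cancun", "OR"))   # [1, 2, 3]
```

Under the “AND” reading only the document mentioning both words is returned, while the “OR” reading returns all three, which is the difference a user cannot predict without knowing the engine's convention.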
There is a need to better search and retrieve relevant documents using the above query styles. In addition, there is a need to better search and retrieve relevant documents using more complicated query styles, such as sentence-based queries (e.g., “I need info on the history of Mexico”), question queries (e.g., “Who is the president of Mexico?”), and essay question queries (e.g., “What are the significant events that led to the formation of Mexico?”). Search engines do not effectively understand these query styles because search engines limit searches by the words literally appearing in the query (e.g., “what, the, that”). Natural language processors (NLPs) have been helpful because they can help identify key words in these types of queries. However, NLPs are not able to prioritize the key words. In addition, if all important key words cannot be matched, there is no thoughtful mechanism for searching a reduced or simpler form of the query.
There is an additional need in the prior art to more effectively search for results for content queries. Requests for a specific type of content (e.g., “I want pictures of Mexico”) require interpreting the query in two ways. First, the topic (here, “Mexico”) must be identified. Second, the type of content desired (here, “pictures”) must be identified. Content types can include pictures, maps, news magazine articles, and sounds, and can be described in any number of ways by users.
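The two-way interpretation of a content query can be sketched as follows. This is only an illustration under stated assumptions: the synonym table `CONTENT_TYPE_WORDS`, the `STOP_WORDS` set, and the function `split_content_query` are hypothetical names, and a simple word lookup stands in for whatever analysis a real system would perform.

```python
# Hypothetical table mapping words a user might type to a content type.
CONTENT_TYPE_WORDS = {
    "pictures": "picture", "photos": "picture", "images": "picture",
    "maps": "map", "map": "map",
    "sounds": "sound", "audio": "sound",
}

# Hypothetical filler words that carry neither topic nor content type.
STOP_WORDS = {"i", "want", "need", "show", "me", "of", "a", "the"}


def split_content_query(query):
    """Return (topic_words, content_type) for a free-form content query.

    content_type is None when no content-type word is recognized.
    """
    content_type = None
    topic_words = []
    for word in query.lower().strip("?.!").split():
        if word in CONTENT_TYPE_WORDS:
            content_type = CONTENT_TYPE_WORDS[word]   # second interpretation
        elif word not in STOP_WORDS:
            topic_words.append(word)                  # first interpretation
    return topic_words, content_type


print(split_content_query("I want pictures of Mexico"))  # (['mexico'], 'picture')
```

Because several user words map to one content type in the table, “photos of Mexico” and “images of Mexico” would resolve to the same request.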
Once a query is understood, there is a need to more effectively search a body of information, often in the form of a database. In the prior art, the body of information may be a full text database consisting of all the target content or a key word database associated with the key words of the target content. Searching a full text database or a key word database often produces results that are too numerous, irrelevant, and disorderly to be useful without extensive post-search processing. In addition, searching a key word database is limited by the number of key words that exist and their unstructured nature. To find a match, queries must match a key word literally or match the key words found using NLPs. Users must anticipate the limited set of key words under which the content is listed.
In order to help users search databases, some search engines have allowed users to navigate an outline or hierarchical index to find the specific information they want. Although this option is useful, the outlines and hierarchical indexes have been complicated and have defied current user expectations that they should be able to ask a question and get relevant answers.
In light of the above limitations, there is a need for a search engine that better understands multiple query styles. Once the query is understood, there is a need for a search engine that more effectively searches a body of information. There is also a need for a search engine that presents the matched information in a way that is easily understood by the user, and ranks and sorts the matches according to their relevancy.
SUMMARY OF THE INVENTION
The present invention can solve the above problems by providing a search engine to better match user requests for information. The search engine allows users to search and retrieve information from a body of information, such as a database. It can lead users with general or specific queries to general or specific content in the body of information. Users can be directed to general information, such as the start of a long article, or to specific content within that article. An article outline and related articles can also be navigated. An effective process can search multiple query styles and can find relevant matches. It can analyze the user's query to determine its most important and less-important elements. Users can form their queries in an ad-hoc, free-form manner and still get relevant results. Queries can also be processed in a way that allows for quick results and an efficient use of server resources.
This novel treatment of hierarchical index data can be combined with an NLP to provide more accurate and detailed access to indexed content. For example, the body of information to be searched can be compiled in such a way that searches can be limited to relevant information. User queries can be analyzed in a way that determines the most-important and least-important elements by prioritized clustered tokens. Tokens can consist of a word or multiple words recognized as one entity. The NLP can recognize the important tokens in the query. Clustered tokens can be created by adding a family of related or alternative words and phrases, called word clusters, similar to the token. The clustered tokens can be summarized and combined in a single content catalog of indexes, called a lookup table. Prioritized clustered tokens can be created by prioritizing the clustered tokens according to priority rules that utilize the NLP to identify the importance of key words.
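One way to picture prioritized clustered tokens is the sketch below. It is an assumption-laden illustration, not the described implementation: the `WORD_CLUSTERS` table, the `PRIORITY` scores (which stand in for the NLP-driven priority rules), and the `ClusteredToken` type are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical word clusters: each token maps to related or alternative words.
WORD_CLUSTERS = {
    "mexico": {"mexico", "mexican"},
    "history": {"history", "historical", "past"},
    "president": {"president", "head of state"},
}

# Hypothetical priority scores standing in for NLP-based priority rules;
# a higher score means a more important token.
PRIORITY = {"mexico": 3, "president": 2, "history": 2}


@dataclass(frozen=True)
class ClusteredToken:
    token: str
    cluster: frozenset
    priority: int


def prioritized_clustered_tokens(tokens):
    """Attach a word cluster and a priority to each token, most important first."""
    clustered = []
    for token in tokens:
        cluster = frozenset(WORD_CLUSTERS.get(token, {token}))
        clustered.append(ClusteredToken(token, cluster, PRIORITY.get(token, 1)))
    # sorted() is stable, so tokens with equal priority keep their query order.
    return sorted(clustered, key=lambda t: t.priority, reverse=True)


for item in prioritized_clustered_tokens(["history", "mexico"]):
    print(item.token, item.priority, sorted(item.cluster))
```

Tokens missing from the hypothetical tables fall back to a single-word cluster and the lowest priority, so unknown words never outrank recognized ones in this sketch.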
Where matches for all important words of the query cannot be found, less important prioritized clustered tokens can be cut from the query, and the query search can then be repeated using the more important prioritized clustered tokens. The matched information can be ranked and sorted according to relevancy by taking advantage of the knowledge of which prioritized clustered tokens are the most important. A tight feedback loop can enable designers to understand what users want and monitor on-going changes in user information needs.
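The query-reduction step described above can be sketched as a simple loop. The `INDEX` contents and the function names are hypothetical, and plain strings stand in for prioritized clustered tokens; the sketch assumes the caller supplies tokens ordered from most to least important.

```python
# Hypothetical index: document id -> set of index terms.
INDEX = {
    "a1": {"mexico", "history"},
    "a2": {"mexico", "geography"},
}


def match_all(tokens):
    """Return sorted ids of documents whose index terms contain every token."""
    wanted = set(tokens)
    return sorted(doc for doc, terms in INDEX.items() if wanted <= terms)


def search_with_fallback(tokens_by_priority):
    """Search with all tokens; on no match, cut the least important and retry.

    tokens_by_priority lists tokens from most to least important.
    Returns (tokens_actually_used, matching_document_ids).
    """
    tokens = list(tokens_by_priority)
    while tokens:
        hits = match_all(tokens)
        if hits:
            return tokens, hits
        tokens.pop()  # drop the least important remaining token
    return [], []


print(search_with_fallback(["mexico", "history", "events"]))
# (['mexico', 'history'], ['a1'])
```

Returning the tokens that were actually used, alongside the hits, lets a later ranking stage know how much of the original query each result satisfies.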
The present invention can include three main segments: the Index Databases, the Run Time Search Component Object Module (“Search COM”), and the Active Server Page User Interface (“ASP UI”). The Index Databases can include a searchable database containing indexes from a plurality of information sources. The Search COM can be a search component that searches for search terms in the queries. The ASP UI can receive search terms from a user of the computer system.
The Index Databases can include a ContentBuild Database, which collects the various indexes and puts them in a searchable database. There can be numerous indexes or fields in the ContentBuild Database. The ContentBuild Database can include a Full Text Index that is used for performing full text searches. The ContentBuild Database can also include a WordWheel, which is a lookup table. The lookup table consists of rows and columns of data. The lookup table is examined either horizontally or vertically and the data that is sought is retrieved.
The Search COM can include the ResultsList, the Exact Match Search, the NLP, and the Full Text Search. The ResultsList is a results database that can hold all the matches or results from the search. The Exact M