Reexamination Certificate
1998-10-22
2001-07-17
Black, Thomas (Department: 2171)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000, C704S009000
active
06263333
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates to data processing systems and, more particularly, to techniques for searching text strings for matches against a database of keywords to facilitate a search and retrieval mechanism.
2. Description of the Related Art
It is known in the art to provide computer-assisted diagnostic tools to assist end users in identifying and solving computer problems. Adaptive Learning (ADL) is one such diagnostic tool that provides a natural language interface for searching a database comprising active solutions to particular user problems. ADL has been implemented in known commercial products, such as the Tivoli Service Desk Version 5.0 Expert Advisor. ADL accepts unstructured textual descriptions and searches the descriptions for user-defined keywords. Each keyword is associated with a concept, and several keywords may be associated with a single concept. Thus, for example, the keywords crash, lock and freeze may have the single concept crash representing them. ADL uses the keywords and concepts to search a knowledge base for solutions related to a user's problem description. The solutions are then listed with a score indicating their relationship to a current problem.
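The keyword-to-concept relationship described above can be pictured as a simple lookup table. The following is a minimal sketch only; the table name `KEYWORD_TO_CONCEPT` and the helper `concepts_for` are hypothetical and do not come from ADL or the Tivoli product.

```python
# Hypothetical keyword-to-concept table; several keywords share one concept.
KEYWORD_TO_CONCEPT = {
    "crash": "crash",
    "lock": "crash",
    "freeze": "crash",
}


def concepts_for(keywords):
    """Return the set of concepts represented by the given keywords."""
    return {KEYWORD_TO_CONCEPT[k] for k in keywords if k in KEYWORD_TO_CONCEPT}


# Two different keywords collapse to the single concept "crash".
found = concepts_for(["lock", "freeze"])
```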
In earlier ADL versions, these natural language descriptions were broken down into discrete words based on space delimitation. Each word was then compared for matches to a list of user-defined keywords. This ADL algorithm was not sufficient for use in an international application for two reasons. First, because many non-English languages do not use space delimitation in their writing systems, it was not possible to break down the natural language description into discrete words. Second, the techniques used in such prior versions for matching text against user-defined keywords did not operate against the full range of non-English characters.
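The limitation of space delimitation can be illustrated with a short sketch. The function below is an assumption about what a split-then-compare approach looks like in general; it is not the earlier ADL code, and the sample sentences and keywords are invented for illustration.

```python
def split_and_match(description, keywords):
    """Split on whitespace, then compare each whole word to the keyword list."""
    words = description.lower().split()
    return [w for w in words if w in keywords]


# Space-delimited English text: the keyword is isolated as its own word.
english = split_and_match("my computer will freeze", {"freeze"})

# Japanese text has no spaces, so the whole sentence is one "word"
# and the keyword inside it is never compared on its own.
japanese = split_and_match("画面がフリーズする", {"フリーズ"})
```

In this sketch the English description yields the keyword, while the Japanese description yields nothing even though the keyword is present in the text.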
There remains a need to provide new and improved adaptive learning methods and systems that address these and other deficiencies of the prior art.
BRIEF SUMMARY OF THE INVENTION
It is a general object of the present invention to provide an adaptive learning system for searching and retrieving solutions to user problems.
It is another object of this invention to provide an internationalized search mechanism for matching keywords in a user problem description to valid keywords stored in a dictionary table indexed by Unicode characters.
It is a further object of the present invention to efficiently compare a small, free-form problem description with a large number of keywords to determine whether any of the keywords exist in the short text string.
It is still another important object of the invention to take a non-tokenized text string (namely, a string that does not include spacing between words) and to analyze the string against keywords organized in a data structure, preferably a structure indexed by Unicode characters.
A more general object of this invention is to provide search and retrieval of previously recorded solutions to user problems. This information is used when new problem descriptions are entered. The description of the problem is analyzed for isolated keywords that have appeared in previous solutions. The solutions that have the most keywords in common with the description are then returned as potential solutions to the new problem.
It is still another object of this invention to provide a methodology for ranking a set of problem solutions identified using the above-described search and retrieval strategy.
Still another object of this invention is to provide a very fast and efficient internationalized pattern matching algorithm for an adaptive learning diagnostic tool.
A more specific object of this invention is to provide an optimal solution to searching non-tokenized text for matches.
These and other objects of the invention are provided in an adaptive learning system and method. This method begins when a problem description provided by the user is received. This problem description may include non-tokenized text. The description is then searched character-by-character against a unique keyword data structure for any user-defined keywords. During this matching process, the routine examines each character in the description and compares it to the keywords in the data structure. Once all keywords are identified, the routine generates a set of solutions associated with at least one of the matching keywords. These solutions are then ranked, for example, based on how many times a respective solution has been used (to solve the problem previously) or how many matching keywords are associated with a respective solution.
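One way to picture the ranking step is sketched below. The text offers usage count and matching-keyword count as example criteria; combining them (keyword overlap first, usage count as a tie-breaker) is an assumption made here for illustration, as are the function name `rank_solutions` and the dictionary layout of each solution record.

```python
def rank_solutions(matched_keywords, solutions):
    """Rank solutions that share at least one keyword with the description.

    solutions maps a name to {"keywords": set, "uses": int} (hypothetical layout).
    """
    matched = set(matched_keywords)
    scored = []
    for name, info in solutions.items():
        overlap = len(matched & info["keywords"])
        if overlap:  # keep only solutions tied to at least one matching keyword
            scored.append((overlap, info["uses"], name))
    # Most shared keywords first, then most-used, then name for a stable order.
    scored.sort(key=lambda t: (-t[0], -t[1], t[2]))
    return [name for _, _, name in scored]


solutions = {
    "reboot": {"keywords": {"crash", "freeze"}, "uses": 3},
    "reinstall driver": {"keywords": {"crash", "printer"}, "uses": 10},
    "replace toner": {"keywords": {"toner"}, "uses": 50},
}
ranked = rank_solutions(["crash", "printer"], solutions)
```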
In a preferred embodiment, the matching process searches a non-tokenized text string for matches against a keyword data structure organized as a set of one or more keyword objects. The routine begins by (a) indexing into the keyword data structure using a character in the non-tokenized text string. Preferably, the character is a Unicode value. The routine then continues by (b) comparing a portion of the non-tokenized text string to a keyword object. If the portion of the non-tokenized text string matches the keyword object, the routine saves the keyword object in a match list. If, however, the portion of the non-tokenized text string does not match the keyword object and there are no other keyword objects that share a root with the non-matched keyword object, the routine repeats step (a) with a new character. These steps are then repeated until all characters in the non-tokenized text string have been analyzed against the keyword data structure.
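Steps (a) and (b) can be sketched roughly as follows. This is a simplified illustration, not the patented data structure: it assumes the keyword structure is a plain dictionary keyed by each keyword's first character (a Unicode code point in Python 3), and the names `build_index` and `find_keywords` are hypothetical.

```python
from collections import defaultdict


def build_index(keywords):
    """Group keywords by their first character."""
    index = defaultdict(list)
    for kw in keywords:
        if kw:
            index[kw[0]].append(kw)
    return index


def find_keywords(text, index):
    """Scan the text one character at a time and collect every keyword found."""
    matches = []
    for pos, ch in enumerate(text):
        # Step (a): index into the structure using the current character.
        for kw in index.get(ch, ()):
            # Step (b): compare the portion of the text at this position
            # to the keyword, and save it in the match list on success.
            if text.startswith(kw, pos):
                matches.append(kw)
        # With no (further) candidates, the loop moves on to a new character.
    return matches


index = build_index(["フリーズ", "画面", "印刷"])
matches = find_keywords("画面がフリーズする", index)
```

Because the lookup is driven by individual characters rather than by pre-split words, the sketch finds keywords in text that contains no spaces.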
If the portion of the non-tokenized text string matches the keyword object and there is a second keyword object whose root is the keyword object matched, the method removes those characters from the non-tokenized text string corresponding to the keyword object matched and then repeats the comparison step with the second keyword object. The match list is then updated with the second keyword object if the portion of the non-tokenized text string matches the second keyword object.
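The root-and-extension behavior can be sketched as follows. The representation is an assumption made for illustration: each hypothetical `KeywordObject` stores only the characters it adds beyond its root, together with a list of the objects rooted at it. The text above does not specify this layout.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordObject:
    segment: str   # characters this object adds beyond its root
    keyword: str   # full keyword this object represents
    children: list = field(default_factory=list)  # objects rooted at this one


def match_from(text, pos, obj, matches):
    """Compare text at pos to obj; on a match, consume it and try its children."""
    if not text.startswith(obj.segment, pos):
        return
    matches.append(obj.keyword)
    rest = pos + len(obj.segment)  # skip ("remove") the matched characters
    for child in obj.children:
        match_from(text, rest, child, matches)


def search(text, index):
    """Index by character, then follow root-to-extension chains."""
    matches = []
    for pos, ch in enumerate(text):
        for obj in index.get(ch, ()):
            match_from(text, pos, obj, matches)
    return matches


printer = KeywordObject("er", "printer")
print_kw = KeywordObject("print", "print", [printer])
index = {"p": [print_kw]}
result = search("myprinterjams", index)
```

Here "print" is matched first and added to the match list; only the remaining characters are then compared against "er", so the shared root is not compared twice.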
The foregoing has outlined some of the more pertinent objects and features of the present invention. These objects should be construed to be merely illustrative of some of the more prominent features and applications of the invention. Many other beneficial results can be attained by applying the disclosed invention in a different manner or modifying the invention as will be described. Accordingly, other objects and a fuller understanding of the invention may be had by referring to the following Detailed Description of the Preferred Embodiment.
Houchin Alice Maria
Wood Douglas Andrew
Black Thomas
Burwell Joseph R.
Chen Te Yu
International Business Machines - Corporation
Judson David