Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2000-10-30
2004-02-10
Corrielus, Jean M. (Department: 2172)
C707S793000, C707S793000, C707S793000, C707S793000, C707S793000, C705S002000
active
06691122
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to the field of information processing, and, more particularly, to using artificial intelligence to compile information.
To facilitate searching, sorting, combining, and various other functions, information may be stored electronically in a database. A database is generally structured as a set of records with each record containing one or more fields. Unlike a data structure, such as an array, in which all the array elements represent the same type of information, each field in a record typically represents a different type of information. A record may be accessed as a collection of fields or, alternatively, the various fields in a record may be accessed individually by name.
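The distinction between a record and an array can be illustrated with a minimal sketch. The `CandidateRecord` type and its field names are hypothetical and chosen only for illustration; they do not come from the described invention.

```python
from dataclasses import dataclass, asdict


@dataclass
class CandidateRecord:
    """One record; each field holds a different type of information."""
    name: str
    zip_code: str
    years_experience: int


record = CandidateRecord("Jane Doe", "27601", 7)

# Access the record as a collection of fields ...
as_collection = asdict(record)
# ... or access a single field by name.
single_field = record.zip_code

# By contrast, every element of an array represents the same kind of value.
zip_codes = ["27601", "27513", "27701"]
```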
Although databases are generally characterized by their highly organized structure of records and fields, the information to be stored in a database may not be as highly organized. For example, consider a database for storing résumés for job candidates. Most résumés contain the following types of information: demographic information (e.g., name, address, telephone number, electronic mail address, etc.), education information, and job experience information. Nevertheless, while these various types of information are generally present in most résumés, they may not be arranged in a standardized format. As a result, it may be difficult to store candidate résumés in a database in a consistent manner such that a user may search, sort, or otherwise process the résumés according to some criterion.
Consequently, there exists a need for improvements in compiling and organizing information such that the information may be more readily accessed and processed when saved in, for example, a database.
SUMMARY OF THE INVENTION
Embodiments of the present invention may include methods, systems, and computer program products for compiling information into information categories using an expert system. For example, multiple information categories may be defined and, for each information category, a fact table may be provided that contains facts and rules associated with the respective information category. The information to be compiled may be encoded as multiple data strings and received as a digital data stream. An inference engine is then used to process the facts, the rules, and the data strings for at least one of the fact tables to associate one or more of the data strings with at least one of the information categories. The data strings that are associated with the information categories may then be arranged in a file based on their information category associations.
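The flow described above may be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the `FACT_TABLES` layout, the rule callables, and the function name `infer_categories` are hypothetical, and a real inference engine would likely be far more elaborate.

```python
from collections import defaultdict

# Hypothetical fact tables: one per information category, each holding
# facts (known terms) and rules (callables that test a data string).
FACT_TABLES = {
    "education": {
        "facts": {"university", "b.s.", "gpa"},
        "rules": [lambda s, facts: any(f in s.lower() for f in facts)],
    },
    "demographic": {
        "facts": {"@"},
        "rules": [lambda s, facts: any(f in s for f in facts)],
    },
}


def infer_categories(data_strings, fact_tables):
    """Associate each data string with every category whose rules fire."""
    compiled = defaultdict(list)
    for s in data_strings:
        for category, table in fact_tables.items():
            if any(rule(s, table["facts"]) for rule in table["rules"]):
                compiled[category].append(s)
    return dict(compiled)


# The received data stream, already split into data strings.
stream = ["jane@example.com", "B.S. Computer Science", "Enjoys hiking"]
result = infer_categories(stream, FACT_TABLES)
```

In this sketch, a data string that matches no rule (such as "Enjoys hiking") is simply left unassociated; the grouped result could then be written to a file category by category.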
By using the inference engine and fact tables to associate data strings with information categories, non-standardized information may be organized by category and then arranged in a file based on these categories. The resulting file may be more readily processed by other applications because the information contained therein may be arranged in a consistent, predetermined manner.
The fact tables may be viewed as a knowledge base and the inference engine and fact tables together may be viewed as an expert system for associating information with information categories. Because rules may be developed for the expert system to account for various organizations of data strings in the received data stream, a programmatic approach to categorizing the data strings need not be followed. For example, when processing information from a résumé, the expert system need not rely on the candidate's name being at the beginning of the résumé or the use of specific subtitles, such as “EXPERIENCE” or “EDUCATION” in the body of the résumé.
In particular embodiments of the present invention, a determination may be made whether data strings are encoded using the American Standard Code for Information Interchange (ASCII) coding scheme. If the data strings are encoded using a non-ASCII coding scheme, then the data strings may be translated into ASCII to facilitate further processing.
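One possible way to perform such a check and translation is sketched below. The text does not say how the determination is made or which non-ASCII schemes are expected, so the function name `to_ascii_text`, the byte-range test, and the default EBCDIC code page (`cp500`) are all assumptions made for illustration.

```python
def to_ascii_text(raw: bytes, source_encoding: str = "cp500") -> str:
    """Return ASCII text, translating the bytes when they are not ASCII."""
    # Heuristic check: every byte lies in the 7-bit ASCII range.
    if raw.isascii():
        return raw.decode("ascii")
    # Assumption: the non-ASCII scheme is known in advance
    # (here EBCDIC code page 500).
    text = raw.decode(source_encoding)
    # Characters with no ASCII equivalent are replaced with "?".
    return text.encode("ascii", errors="replace").decode("ascii")


ascii_input = b"GPA 3.8"
ebcdic_input = "GPA 3.8".encode("cp500")
```

The byte-range test is only a heuristic: a short EBCDIC string made up entirely of spaces and punctuation would pass it, so a production system would probably need additional signals about the source encoding.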
In embodiments of the present invention, the facts may include, but are not limited to, names, words, phrases, acronyms, terms of art, number strings (e.g., zip codes, area codes), geographic names, etc. The rules may comprise fact match rules, pattern match rules, and proximity search rules.
In further embodiments of the present invention, the inference engine may process the facts, the fact match rules, and the data strings for one or more of the fact tables to associate data strings with the information categories. The inference engine may also process the pattern match rules and the data strings for one or more of the fact tables to associate data strings with the information categories. The pattern match rules may include rules related to sequences of data strings. Finally, the inference engine may process the proximity search rules and the data strings for one or more of the fact tables to associate data strings with the information categories. The proximity search rules may include rules related to the relative location of data strings in the data stream. For example, when processing information from a résumé, if the term “GPA” is located near the term “EDUCATION,” then it may be interpreted as “Grade Point Average” and may be associated with an education category. Alternatively, if the term “GPA” is located closer to the term “EXPERIENCE,” then it may be interpreted as an acronym for a skill, job responsibility, etc. and may be associated with an employment category.
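The three rule types can be illustrated with a small sketch. The function names, the sample facts, the state-plus-ZIP sequence, and the token-distance measure of "nearness" are hypothetical choices; the text does not specify how the rules are represented or how proximity is measured.

```python
import re

# Hypothetical facts for an education fact table.
EDUCATION_FACTS = {"university", "college", "gpa"}
STATE = re.compile(r"^[A-Z]{2}$")
ZIP = re.compile(r"^\d{5}$")


def fact_match(token, facts):
    """Fact match rule: the token is a known fact (case-insensitive)."""
    return token.lower() in facts


def state_zip_at(tokens, i):
    """Pattern match rule: a state abbreviation followed by a ZIP code."""
    return (i + 1 < len(tokens)
            and bool(STATE.match(tokens[i]))
            and bool(ZIP.match(tokens[i + 1])))


def nearest_heading(tokens, index, headings):
    """Proximity rule: the heading closest to tokens[index], or None."""
    positions = [i for i, t in enumerate(tokens) if t in headings]
    if not positions:
        return None
    return tokens[min(positions, key=lambda i: abs(i - index))]


def categorize_gpa(tokens):
    """Pick a category for "GPA" from whichever heading is nearer."""
    heading = nearest_heading(tokens, tokens.index("GPA"),
                              {"EDUCATION", "EXPERIENCE"})
    return {"EDUCATION": "education", "EXPERIENCE": "employment"}.get(heading)


near_education = ["EDUCATION", "GPA", "3.8", "EXPERIENCE", "Acme", "Corp"]
near_experience = ["EDUCATION", "State", "University",
                   "EXPERIENCE", "Maintained", "GPA", "tooling"]
```

Here `categorize_gpa(near_education)` yields the education category and `categorize_gpa(near_experience)` yields the employment category, mirroring the "GPA" example in the text.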
In particular embodiments of the present invention, the information categories may be tailored for compiling information from a résumé. Accordingly, the information categories may include a demographic category, a skill set category, an education and employment category, and a career progression category. The number of occurrences for each data string that is associated with the skill set category may be determined and the number of occurrences for each data string that is associated with the career progression category and corresponds to job position title information may be determined. These “hit counts” may be indicative of the relative importance of a particular candidate's skills and job titles.
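A hit count of this kind might be computed as in the following sketch. The function name `hit_counts` and the decision to fold case before counting are assumptions; the text does not say whether differently cased occurrences are treated as the same data string.

```python
from collections import Counter


def hit_counts(associated_strings):
    """Count occurrences of each data string associated with a category."""
    # Assumption: "Python" and "python" count as the same skill.
    return Counter(s.lower() for s in associated_strings)


# Hypothetical data strings already associated with the skill set category.
skills = ["Python", "SQL", "python", "XML", "SQL", "Python"]
counts = hit_counts(skills)
```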
In further embodiments of the present invention, a qualitative rank may be determined for each data string that is associated with the career progression category and corresponds to job position title information or job responsibility information. These qualitative rankings may be based on weights assigned to job position titles and job responsibilities in the fact tables. The weights assigned to the job position titles and job responsibilities in the fact tables may be dynamically set by a user based on the type of qualifications sought in a job candidate.
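A weight-based ranking with user-adjustable weights could look roughly like the sketch below. The weight values, the titles, and the names `DEFAULT_WEIGHTS`, `with_user_weights`, and `qualitative_ranks` are hypothetical; the text does not define a weight scale or how a rank is derived from a weight, so this sketch simply uses the weight itself as the rank.

```python
# Hypothetical weights stored alongside job titles in a fact table.
DEFAULT_WEIGHTS = {"engineer": 1.0, "senior engineer": 2.0, "director": 3.0}


def with_user_weights(base, overrides):
    """Return a copy of the base weights with the user's changes applied."""
    merged = dict(base)
    merged.update(overrides)
    return merged


def qualitative_ranks(titles, weights, default=0.0):
    """Map each job title data string to its weight (used as its rank)."""
    return {title: weights.get(title.lower(), default) for title in titles}


titles = ["Engineer", "Director", "Intern"]
default_ranks = qualitative_ranks(titles, DEFAULT_WEIGHTS)

# A user seeking hands-on engineers raises that weight for this search.
tuned = with_user_weights(DEFAULT_WEIGHTS, {"engineer": 5.0})
tuned_ranks = qualitative_ranks(titles, tuned)
```

Copying the base weights before applying overrides keeps one user's adjustments from altering the shared fact table, which seems consistent with weights being set per search.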
In still further embodiments of the present invention, the following may be arranged in a file, in addition to the data strings that are associated with the information categories, based on the associations between the data strings and the information categories: the number of occurrences for each data string that is associated with the skill set category; the number of occurrences for each data string that is associated with the career progression category and corresponds to job position title information; and the qualitative rank for each data string that is associated with the career progression category and corresponds to job position title information or job responsibility information.
The file containing the data strings associated with the information categories may be an extensible markup language (XML) file. Advantageously, XML may allow the file to be described in terms of logical parts or elements. For example, the information categories and the various types of information that belong to each category may be represented in the XML file as specific elements.
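As an illustration of representing categories as XML elements, the following sketch uses Python's standard `xml.etree.ElementTree` module. The element names (`resume`, `item`, and the category names) are hypothetical; the text does not specify a schema.

```python
import xml.etree.ElementTree as ET


def build_resume_xml(compiled):
    """Serialize category -> data strings as nested XML elements."""
    root = ET.Element("resume")
    for category, strings in compiled.items():
        category_el = ET.SubElement(root, category)
        for s in strings:
            item = ET.SubElement(category_el, "item")
            item.text = s
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


# Hypothetical output of the categorization step.
compiled = {
    "demographic": ["Jane Doe"],
    "skill_set": ["Python", "SQL"],
}
xml_text = build_resume_xml(compiled)
```

Because each category becomes its own element, a downstream application can locate, for example, all skill entries with a lookup such as `ET.fromstring(xml_text).findall("skill_set/item")`.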
In further embodiments of the present invention, the data strings may be added to the XML file in their received arrangement. For example, if the data strings comprise information from a résumé, then the entire résumé, without any processing or formatting
Evrenidis Basil
Grundner James A.
Witte Curt J.
Corrielus Jean M.
Myers Bigel Sibley & Sajovec P.A.
Peopleclick.com, Inc.
Profile ID: LFUS-PAI-O-3344644