Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2000-05-26
2004-04-27
Breene, John (Department: 2177)
C706S021000, C707S793000
active
06728695
ABSTRACT:
CROSS-REFERENCE TO RELATED APPLICATIONS
Not Applicable
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not Applicable
REFERENCE TO A “MICROFICHE APPENDIX”
Not Applicable
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates in general to a method and apparatus for making predictions about entities represented in text documents. It more particularly relates to a more highly effective and accurate method and apparatus for the analysis and retrieval of text documents, such as employment résumés, job postings or other documents contained in computerized databases.
2. Background Art
The challenge for personnel managers is not just to find qualified people. A job change is expensive for the old employee, the new employee, and the employer. It has been estimated that the total cost for all three may, in some instances, be as much as $50,000. To reduce these costs, it is important for personnel managers to find well matched employees who will stay with the company as long as possible and who will rise within the organization.
Personnel managers once largely relied on résumés from unsolicited job applications and replies to newspaper help-wanted advertisements. This presented a number of problems. One problem was that the number of résumés from these sources could be large and could require significant skilled-employee time even for sorting. Résumés received unsolicited or in response to newspaper advertisements presented primarily a local pool of job applicants. Frequently, most of the résumés were from people unsuited for the position. Also, a résumé oftentimes described only an applicant's past and present and did not predict longevity or promotion path.
One attempt at finding a solution to the oftentimes perplexing problems of locating qualified, long-term employees has been to resort to outside parties, such as temporary agencies and head-hunters. The first temporary agency (Kelly Girl, now Kelly Services, having a website at www.kellyservices.com) started in approximately 1940 by supplying lower-level employees to businesses. Temporary agencies now offer more technical and high-level employees. The use of head-hunters and recruiters for candidate searches is commonplace today. While this approach to finding employees may simplify hiring for a business, it does not simplify the problem of efficiently finding qualified people. It merely moves the problem from the employer to the intermediary. It does not address finding qualified employees who will remain with, and rise within, the company.
In recent years, computer bulletin boards and internet newsgroups have appeared, enabling a job-seeker to post a résumé or an employer to post a job posting, which is an advertisement of a job opening. These bulletin boards and internet newsgroups, such as those found at services identified as misc.jobs.resumes and misc.jobs.offered, are collectively known as “job boards.” More recently, World Wide Web sites have been launched for the same purpose. For example, there are websites at www.jobtrak.com and www.monster.com.
On internet job boards, the geographic range of applicants has widened, and the absolute number of résumés for a typical personnel manager to examine has greatly increased. At the same time, the increasing prevalence of submission of résumés in electronic format in response to newspaper advertisements and job board postings has increased the need to search in-house computerized databases of résumés more efficiently and precisely. With as many as a million résumés in a database such as the one found at the website www.monster.com, the sheer number of résumés to review presents a daunting task. Because of the ubiquity of computer databases, the need to search efficiently and to select a single document or a few documents out of many has become a substantial problem. Such a massive text document retrieval problem is not by any means limited to résumés. The massive text document retrieval problem has been addressed in various ways.
For example, reference may be made to the following U.S. patents: U.S. Pat. No. 4,839,853, COMPUTER INFORMATION RETRIEVAL USING LATENT SEMANTIC STRUCTURE; U.S. Pat. No. 5,051,947, HIGH-SPEED SINGLE-PASS TEXTUAL SEARCH PROCESSOR FOR LOCATING EXACT AND INEXACT MATCHES OF A SEARCH PATTERN IN A TEXTUAL STREAM; U.S. Pat. No. 5,164,899, METHOD AND APPARATUS FOR COMPUTER UNDERSTANDING AND MANIPULATION OF MINIMALLY FORMATTED TEXT DOCUMENTS; U.S. Pat. No. 5,197,004, METHOD AND APPARATUS FOR AUTOMATIC CATEGORIZATION OF APPLICANTS FROM RESUMES; U.S. Pat. No. 5,301,109, COMPUTERIZED CROSS-LANGUAGE DOCUMENT RETRIEVAL USING LATENT SEMANTIC INDEXING; U.S. Pat. No. 5,559,940, METHOD AND SYSTEM FOR REAL-TIME INFORMATION ANALYSIS OF TEXTUAL MATERIAL; U.S. Pat. No. 5,619,709, SYSTEM AND METHOD OF CONTEXT VECTOR GENERATION AND RETRIEVAL; U.S. Pat. No. 5,592,375, COMPUTER-ASSISTED SYSTEM FOR INTERACTIVELY BROKERING GOODS FOR SERVICES BETWEEN BUYERS AND SELLERS; U.S. Pat. No. 5,659,766, METHOD AND APPARATUS FOR INFERRING THE TOPICAL CONTENT OF A DOCUMENT BASED UPON ITS LEXICAL CONTENT WITHOUT SUPERVISION; U.S. Pat. No. 5,796,926, METHOD AND APPARATUS FOR LEARNING INFORMATION EXTRACTION PATTERNS FROM EXAMPLES; U.S. Pat. No. 5,832,497, ELECTRONIC AUTOMATED INFORMATION EXCHANGE AND MANAGEMENT SYSTEM; U.S. Pat. No. 5,963,940, NATURAL LANGUAGE INFORMATION RETRIEVAL SYSTEM AND METHOD; and U.S. Pat. No. 6,006,221, MULTILINGUAL DOCUMENT RETRIEVAL SYSTEM AND METHOD USING SEMANTIC VECTOR MATCHING.
Also, reference may be made to the following publications: “Information Extraction using HMMs and Shrinkage,” Dayne Freitag and Andrew Kachites McCallum, Papers from the AAAI-99 Workshop on Machine Learning for Information Extraction, AAAI Technical Report WS-99-11, July 1999; “Learning Hidden Markov Model Structure for Information Extraction,” Kristie Seymore, Andrew McCallum, and Ronald Rosenfeld, Papers from the AAAI-99 Workshop on Machine Learning for Information Extraction, AAAI Technical Report WS-99-11, July 1999; “Boosted Wrapper Induction,” Dayne Freitag and Nicholas Kushmerick, to appear in Proceedings of AAAI-2000, July 2000; “Indexing by Latent Semantic Analysis,” Scott Deerwester, et al., Journal of the Am. Soc. for Information Science, 41(6):391-407, 1990; and “Probabilistic Latent Semantic Indexing,” Thomas Hofmann, EECS Department, UC Berkeley, Proceedings of the Twenty-Second Annual SIGIR Conference on Research and Development in Information Retrieval.
Each one of the foregoing patents and publications is incorporated herein by reference, as if fully set forth herein.
Early document searches were based on keywords as text strings. However, in a large database, simple keyword searches oftentimes return too many irrelevant documents, because many words and phrases have more than one meaning (polysemy). For example, being a secretary in the state department is not the same as being Secretary of State.
If only a few keywords are used, large numbers of documents are returned. Keyword searches may also miss many relevant documents because of synonymy. The writer of a document may use one word for a concept, and the person who enters the keywords uses a synonym, or even the same word in a different form, such as “Mgr” instead of “Manager.” Another problem with keyword searches is the fact that terms cannot be readily weighted.
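By way of illustration only, the polysemy and synonymy problems described above can be reproduced with a minimal text-string keyword search. The sketch below is written in Python; the sample documents, the identifiers and the function name keyword_search are hypothetical and are assumed solely for this example. It is not the method of the present invention.

```python
# Illustrative toy documents (hypothetical, not drawn from any real database).
DOCUMENTS = {
    "resume_a": "Office Manager with ten years of experience",
    "resume_b": "Office Mgr with ten years of experience",
    "resume_c": "Secretary in the state department",
    "resume_d": "Served as Secretary of State",
}


def keyword_search(documents, keyword):
    """Return the sorted ids of documents that contain the keyword as a text string."""
    needle = keyword.lower()
    return sorted(
        doc_id for doc_id, text in documents.items() if needle in text.lower()
    )


# Synonymy: the abbreviation "Mgr" never matches the keyword "manager",
# so resume_b is missed even though it describes the same position.
manager_hits = keyword_search(DOCUMENTS, "manager")

# Polysemy: both documents match "secretary" although the positions differ.
secretary_hits = keyword_search(DOCUMENTS, "secretary")

print(manager_hits)
print(secretary_hits)
```

In this sketch every match counts equally, which also illustrates why the terms of such a search cannot be readily weighted.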
Keyword searches can be readily refined by use of Boolean logic, which allows the use of logical operators such as AND, NOT, and OR, and comparative operators such as GREATER THAN, LESS THAN, or EQUALS. However, it is difficult to consider more than a few characteristics with Boolean logic. Also, the fundamental problems of a text-string keyword search remain. At the present time, most search engines still use keyword or Boolean searches. These searches can become complex, but they still suffer from the intrinsic limitations of keyword searches. In short, a search cannot match a word that is not present in a text document, and the terms cannot be weighted.
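A Boolean refinement of a keyword search may be sketched, under stated assumptions, as follows. The sample records, the field names "text" and "years", and the function boolean_search with its parameters are hypothetical and are introduced only for illustration. Logical AND, NOT and OR are applied to the set of words in each document, and GREATER THAN is applied to a numeric field.

```python
import re

# Hypothetical records: free text plus one numeric field.
RESUMES = {
    "r1": {"text": "Java developer and project manager", "years": 8},
    "r2": {"text": "Java developer", "years": 2},
    "r3": {"text": "Project Mgr, Java and Python", "years": 12},
}


def tokens(text):
    """Split text into a set of lower-case words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def boolean_search(resumes, must=(), must_not=(), any_of=(), years_greater_than=None):
    """Return sorted ids of records satisfying AND / NOT / OR / GREATER THAN."""
    hits = []
    for rid, record in resumes.items():
        words = tokens(record["text"])
        if not all(term in words for term in must):        # AND
            continue
        if any(term in words for term in must_not):        # NOT
            continue
        if any_of and not any(term in words for term in any_of):  # OR
            continue
        if years_greater_than is not None and not record["years"] > years_greater_than:
            continue                                        # GREATER THAN
        hits.append(rid)
    return hits if hits == sorted(hits) else sorted(hits)


# "java AND manager AND years GREATER THAN 5"
experienced_managers = boolean_search(
    RESUMES, must=["java", "manager"], years_greater_than=5
)
# "java AND NOT python"
java_not_python = boolean_search(RESUMES, must=["java"], must_not=["python"])

print(experienced_managers)
print(java_not_python)
```

In this example record r3, which has the most experience, is excluded from the first query because it contains "Mgr" rather than "manager", and the records that are returned carry no ranking or weight. Both observations reflect the intrinsic limitations noted above.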
In an attempt to overcome these problems, natural language
Chow Edmond D.
Crooks Theodore J.
Freitag Dayne B.
Gopinathan Krishna M.
Laffoon Mark A.
Breene John
Burning Glass Technologies, LLC
Devinsky Paul
Wassum Luke S