Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2006-04-11
Rimell, Sam (Department: 2164)
C707S793000, C704S275000
active
07028038
ABSTRACT:
A method for electronically generating high-quality feature vectors that can be used in connection with electronic data processing systems implementing Maximum Entropy or other statistical models to accurately normalize abbreviations in text such as medical records. An abbreviation database and a training text database are provided. The abbreviation database includes abbreviation data representative of the abbreviations to be normalized and their associated expansions. The training text database includes a corpus of text containing expansions of the abbreviations to be normalized. The corpus of text is processed as a function of the abbreviation data to identify the expansions in the corpus. Context information describing the text surrounding each identified expansion is generated. A set of feature vectors is also stored, each feature vector including the context information generated for the associated expansion identified in the corpus of text.
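The steps summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `build_feature_vectors`, the `window` parameter, the dictionary layout of the abbreviation data, and the simple word-level tokenization are all assumptions made for illustration. The sketch scans a training text for known expansions and records the neighbouring words as context for each match.

```python
import re
from typing import Dict, List


def build_feature_vectors(
    abbreviations: Dict[str, List[str]],  # hypothetical layout: abbreviation -> expansions
    corpus: str,
    window: int = 2,  # hypothetical: number of context words kept on each side
) -> List[dict]:
    """Find each known expansion in the corpus and record its surrounding words."""
    # Assumed tokenization: lowercase word characters only.
    tokens = re.findall(r"\w+", corpus.lower())
    vectors = []
    for abbr, expansions in abbreviations.items():
        for expansion in expansions:
            exp_tokens = expansion.lower().split()
            n = len(exp_tokens)
            # Slide over the corpus looking for the expansion as a token sequence.
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == exp_tokens:
                    left = tokens[max(0, i - window):i]
                    right = tokens[i + n:i + n + window]
                    vectors.append(
                        {
                            "abbreviation": abbr,
                            "expansion": expansion,
                            "context": left + right,
                        }
                    )
    return vectors


# Illustrative data only; not taken from the patent.
abbreviation_db = {"RA": ["rheumatoid arthritis", "right atrium"]}
training_text = (
    "Patient has rheumatoid arthritis in both hands. "
    "Catheter placed in right atrium today."
)
vectors = build_feature_vectors(abbreviation_db, training_text)
for v in vectors:
    print(v["abbreviation"], "->", v["expansion"], v["context"])
```

In a full system, each stored vector would serve as a labeled training example (context features paired with the correct expansion) for a Maximum Entropy or other statistical classifier. The feature set here is limited to neighbouring words purely to keep the sketch short.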
REFERENCES:
patent: 4817156 (1989-03-01), Bahl et al.
patent: 4831550 (1989-05-01), Katz
patent: 4849925 (1989-07-01), Peckerar et al.
patent: 5293584 (1994-03-01), Brown et al.
patent: 5467425 (1995-11-01), Lau et al.
patent: 6049767 (2000-04-01), Printz
patent: 6055494 (2000-04-01), Friedman
patent: 6182029 (2001-01-01), Friedman
patent: 6535849 (2003-03-01), Pakhomov et al.
patent: 2002/0055919 (2002-05-01), Mikheev
patent: 2002/0188421 (2002-12-01), Tanigaki et al.
patent: 2003/0105638 (2003-06-01), Taira
patent: 2003/0120640 (2003-06-01), Ohta et al.
patent: 2003/0154208 (2003-08-01), Maimon et al.
patent: 2004/0044548 (2004-03-01), Marshall et al.
patent: 2004/0083452 (2004-04-01), Minor et al.
Black, An Experiment in Computational Discrimination of English Word Senses, IBM Journal of Research and Development, 32(2), pp. 185-194 (1988).
Schütze, Automatic Word Sense Discrimination, Computational Linguistics, 24(1), pp. 97-123 (1998).
Hearst, Noun Homograph Disambiguation Using Local Context in Large Text Corpora, In Proc., 7th Annual Conference of the University of Waterloo Center for the New OED and Text Research, Oxford, pp. 1-15 (1991).
Yarowsky, Unsupervised Word Sense Disambiguation Rivaling Supervised Methods, In Proceedings of the Association for Computational Linguistics, pp. 189-196 (1995).
Open NLP Maxent—The Maximum Entropy Framework Internal Webpage printed Jun. 30, 2003, 10 pgs.
Faegre & Benson LLP
Mayo Foundation for Medical Education and Research
Method for generating training data for medical text...