Patent
1996-07-12
1998-11-03
Lintz, Paul R.
Data processing: database and file management or data structures
Database design
Data structure types
704/2, 704/8, 704/9, 704/10, 707/1, 707/532, 707/536, G06F 17/30
active
058324801
ABSTRACT:
Descriptive canonical forms of entity types are created by scanning one or more documents in a database of a computer system to identify one or more proper names that appear in the documents as raw names. Each of the raw names has zero or more proper names, zero or more medial substrings, zero or more leading substrings, and zero or more trailing substrings. The raw names of one or more documents are "cleaned" and "split" until certain "cleaning and splitting conditions" are no longer met, yielding a list of clean and split candidate names. Anchor names, which unambiguously represent an entity type, are selected from the list. The anchor names have one or more entity-type attribute values. Variant names, which are clean and split candidate names that share one or more attribute values with the anchor name, are combined with the anchor name to create an equivalence group of names that refer to the same entity. A canonical form is generated for the group from a subset of the anchor name attributes. A canonical form is created in this manner for all of the clean and split candidate names on the list.
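The flow described in the abstract (clean and split, pick anchors, group variants, emit a canonical form) can be illustrated with a minimal sketch. This is not the patented method: the word lists LEADING and TRAILING, the SPLITTERS pattern, the "first"/"last" attributes, the rule that an anchor is any name with two or more tokens, and all function names are assumptions made up for this example.

```python
import re

# Hypothetical word lists; the abstract does not give the actual rules.
LEADING = {"mr", "mrs", "ms", "dr", "president"}
TRAILING = {"jr", "sr", "inc", "corp"}
# Assumed medial substrings that join two names inside one raw name.
SPLITTERS = re.compile(r"\s+(?:and|&)\s+")


def clean_and_split(raw):
    """Split a raw name on medial substrings, then strip leading/trailing ones."""
    candidates = []
    for part in SPLITTERS.split(raw):
        tokens = [t.strip(".,") for t in part.split()]
        # Keep cleaning until no leading or trailing substring remains.
        while tokens and tokens[0].lower() in LEADING:
            tokens.pop(0)
        while tokens and tokens[-1].lower() in TRAILING:
            tokens.pop()
        if tokens:
            candidates.append(" ".join(tokens))
    return candidates


def attributes(name):
    """Assumed attribute model: last token is 'last', first token is 'first'."""
    tokens = name.split()
    attrs = {"last": tokens[-1]}
    if len(tokens) > 1:
        attrs["first"] = tokens[0]
    return attrs


def shares_attribute(a, b):
    """True when two attribute dicts have at least one value in common."""
    return bool({v.lower() for v in a.values()} & {v.lower() for v in b.values()})


def canonical_forms(raw_names):
    """Map every clean and split candidate name to a canonical form."""
    candidates = []
    for raw in raw_names:
        for name in clean_and_split(raw):
            if name not in candidates:
                candidates.append(name)

    # Assumption: a name with two or more tokens is unambiguous enough to anchor.
    anchors = [c for c in candidates if len(c.split()) > 1]

    result = {}
    for anchor in anchors:
        if anchor in result:
            continue  # already absorbed into an earlier equivalence group
        a = attributes(anchor)
        # The canonical form uses a subset of the anchor's attributes.
        canonical = f"{a['first']} {a['last']}"
        for cand in candidates:
            if cand not in result and shares_attribute(attributes(cand), a):
                result[cand] = canonical

    # Candidates that matched no anchor stand for themselves.
    for cand in candidates:
        result.setdefault(cand, cand)
    return result


if __name__ == "__main__":
    raw = ["President Bill Clinton", "Mr. Clinton", "Bill Clinton and Al Gore", "Gore"]
    for name, canonical in canonical_forms(raw).items():
        print(f"{name!r} -> {canonical!r}")
```

With the sample input, "Clinton" falls into the group anchored by "Bill Clinton" and "Gore" into the group anchored by "Al Gore". A real system would need far richer attribute extraction and an explicit ambiguity test before accepting a name as an anchor.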
REFERENCES:
patent: 4864501 (1989-09-01), Kucera et al.
patent: 4868750 (1989-09-01), Kucera et al.
patent: 5287278 (1994-02-01), Rau
patent: 5510981 (1996-04-01), Berger et al.
"NameFinder: Software that finds Names in Text" by Phil Hayes, pp. 762-774 from RIAO 94, Conference Proceedings, Intelligent Multimedia Information Retrieval Systems and Management, Rockefeller Univ. New York, NY, vol. 1.
Web Pages for NameTag at http://projects.sra.com/nametag.
D. D. McDonald, "Internal and External Evidence in the Identification and Semantic Categorization of Proper Names," in B. Boguraev and J. Pustejovsky, eds. Acquisition of Lexical Knowledge from Text: Proceedings of a Workshop Sponsored by the Special Interest Group on the Lexicon of the Association for Computational Linguistics, pp. 32-43, Columbus, Ohio, 1993.
W. Paik, E. D. Liddy, E. Yu and M. McKenna, "Categorizing and Standardizing Proper Nouns for Efficient Information Retrieval," in B. Boguraev and J. Pustejovsky, eds. Acquisition of Lexical Knowledge from Text: Proceedings of a Workshop Sponsored by the Special Interest Group on the Lexicon of the Association for Computational Linguistics, pp. 154-160, Columbus, Ohio, 1993.
W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures and Algorithms, Prentice Hall, 1992, pp. 106-109, 113-116, 138-142.
B. W. Kernighan and D. M. Ritchie, The C Programming Language, 2nd Edition, PTR Prentice Hall, Englewood Cliffs, New Jersey, 1988, pp. 152-153, 249-250.
P. Hayes, "NameFinder: Software that finds Names in Text", Proceedings of RIAO '94, vol. 1, pp. 762-774, Oct. 1994, New York. ISBN 2-905450-05-3.
S. Coates-Stephens, "The Analysis and Acquisition of Proper Names for the Understanding of Free Text," Computers and the Humanities, Vol. 26, pp. 441-456, 1993.
Byrd, Jr. Roy Jefferson
Choi Misook A.
Ravin Yael
Wacholder Faye Nina
Colbert Ella A.
International Business Machines Corporation
Lintz Paul R.
Percello Louis J.
Using canonical forms to develop a dictionary of names in a text