Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2002-07-25
2008-08-05
Rones, Charles (Department: 2164)
C707S793000
active
07409404
ABSTRACT:
Methods, apparatus and systems to generate from a set of training documents a set of training data and a set of features for a taxonomy of categories. In this generated taxonomy the degree of feature overlap among categories is minimized in order to optimize use with a machine-based categorizer. However, the categories still make sense to a human because a human makes the decisions regarding category definitions. In an example embodiment, for each category, a plurality of training documents selected using Web search engines is generated, the documents winnowed to produce a more refined set of training documents, and a set of features highly differentiating for that category within a set of categories (a supercategory) extracted. This set of training documents or differentiating features is used as input to a categorizer, which determines for a plurality of test documents the plurality of categories to which they best belong.
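The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the scoring rule (a term's document frequency within a category minus its highest document frequency in any sibling category), the function names, and the top_k parameter are all assumptions chosen for illustration. The sketch assumes every category has at least one training document. Because a term can score above zero in at most one category under this rule, the selected feature sets do not overlap.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def differentiating_features(docs_by_category, top_k=3):
    """Pick up to top_k terms per category that set it apart from its siblings.

    docs_by_category maps a category name to a non-empty list of training
    documents (strings). All categories passed in are treated as members of
    one supercategory. The scoring rule is an illustrative assumption.
    """
    # Per category: number of documents in which each term appears.
    doc_freq = {
        cat: Counter(term for doc in docs for term in set(tokenize(doc)))
        for cat, docs in docs_by_category.items()
    }
    sizes = {cat: len(docs) for cat, docs in docs_by_category.items()}

    features = {}
    for cat, counts in doc_freq.items():
        scores = {}
        for term, count in counts.items():
            share_here = count / sizes[cat]
            # Highest share of documents containing the term in any sibling.
            share_elsewhere = max(
                (doc_freq[other][term] / sizes[other]
                 for other in doc_freq if other != cat),
                default=0.0,
            )
            scores[term] = share_here - share_elsewhere
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        features[cat] = [t for t in ranked if scores[t] > 0][:top_k]
    return features


if __name__ == "__main__":
    training = {
        "dogs": ["the dog barks", "a dog fetches the ball"],
        "cats": ["the cat purrs", "a cat naps in the sun"],
    }
    for category, terms in differentiating_features(training).items():
        print(category, terms)
```

Terms shared equally across categories, such as "the" in the toy data, score zero and are dropped, which mirrors the abstract's goal of minimizing feature overlap among categories.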
REFERENCES:
patent: 6360227 (2002-03-01), Aggarwal et al.
Chojnacki Mellissa M
Herzberg Louis P.
International Business Machines Corporation
Rones Charles