Class description generation for clustering and categorization

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate


Details

C704S001000, C707S711000, C707S737000

active

07813919

ABSTRACT:
A class of a probabilistic classifier or clustering system that includes probabilistic model parameters is to be characterized. For each of a plurality of candidate words or word combinations, the divergence of the class from other classes is computed based on one or more probabilistic model parameters profiling the candidate word or word combination. The words or word combinations selected to characterize the class are those candidates for which the class has substantial computed divergence from the other classes.
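The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which divergence measure or model parameters are used, so everything specific here is an assumption: the sketch takes smoothed (strictly positive) word-given-class probabilities P(w|c) and class priors P(c) as the model parameters, and scores each candidate word by its contribution to the Kullback-Leibler divergence between the target class and the prior-weighted mixture of the other classes. The function names, the `top_k` cutoff, and the toy probabilities are hypothetical.

```python
import math


def word_divergence_contributions(p_w_given_c, class_priors, target):
    """Score each word by its term in KL(P(w|target) || P(w|other classes)).

    Assumes every probability is strictly positive (e.g. after smoothing).
    """
    others = [c for c in class_priors if c != target]
    other_mass = sum(class_priors[c] for c in others)
    scores = {}
    for word, p_target in p_w_given_c[target].items():
        # Word probability under the prior-weighted mixture of the other classes.
        p_rest = sum(class_priors[c] * p_w_given_c[c][word] for c in others) / other_mass
        scores[word] = p_target * math.log(p_target / p_rest)
    return scores


def describe_class(p_w_given_c, class_priors, target, top_k=2):
    """Return the top_k words with the largest divergence contribution."""
    scores = word_divergence_contributions(p_w_given_c, class_priors, target)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy model parameters (illustrative values only).
p_w_given_c = {
    "sports": {"game": 0.5, "team": 0.3, "the": 0.15, "market": 0.05},
    "finance": {"game": 0.05, "team": 0.05, "the": 0.15, "market": 0.75},
}
class_priors = {"sports": 0.5, "finance": 0.5}

print(describe_class(p_w_given_c, class_priors, "sports"))            # ['game', 'team']
print(describe_class(p_w_given_c, class_priors, "finance", top_k=1))  # ['market']
```

In this toy example the word "the" is equally likely in both classes, so its contribution is zero and it is not chosen to describe "sports", even though it is more frequent there than "market".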

REFERENCES:
patent: 5703964 (1997-12-01), Menon et al.
patent: 5857179 (1999-01-01), Vaithyanathan et al.
patent: 6104835 (2000-08-01), Han
patent: 6137911 (2000-10-01), Zhilyaev
patent: 6424971 (2002-07-01), Kreulen et al.
patent: 6862586 (2005-03-01), Kreulen et al.
patent: 6931347 (2005-08-01), Boedi et al.
patent: 7031909 (2006-04-01), Mao et al.
patent: 7085771 (2006-08-01), Chung et al.
patent: 2003/0167163 (2003-09-01), Glover et al.
patent: 2003/0233232 (2003-12-01), Fosler-Lussier et al.
patent: 2005/0187892 (2005-08-01), Goutte et al.
patent: 2006/0093188 (2006-05-01), Blake et al.
patent: 2006/0136410 (2006-06-01), Gaussier et al.
patent: 2006/0287848 (2006-12-01), Li et al.
patent: 2007/0005340 (2007-01-01), Goutte et al.
patent: 2007/0005639 (2007-01-01), Gaussier et al.
patent: 2007/0067289 (2007-03-01), Novak
Dhillon, I., Mallela, S., Kumar, R., “A divisive information-theoretic feature clustering algorithm for text classification,” Journal of Machine Learning Research, pp. 1265-1287, 2003.
Tomokiyo, T., Hurst, M., “A language model approach to keyphrase extraction,” Proceedings of the ACL 2003 workshop on Multiword expressions: analysis, acquisition and treatment, pp. 33-40, Jul. 12, 2003.
Dobrokhotov, P. et al., “Combining NLP and probabilistic categorization for document and term selection for Swiss-Prot medical annotation,” Bioinformatics, vol. 19, pp. i91-i94, 2003.
Dobrokhotov et al., “Combining NLP and probabilistic categorization for document and term selection for Swiss-Prot medical annotation,” Bioinformatics, vol. 19 Suppl. 1, pp. i91-i94, 2003.
Radev et al., “Automatic summarization of search engine hit lists,” ACL Workshop on Recent Advances in NLP and IR, pp. 99-109, 2000.
Dobrokhotov et al., “Combining NLP and Probabilistic Categorisation for . . . ,” Oxford University Press, pp. 1-3, (2001).
Gaussier et al, “A Hierarchical Model for Clustering and Categorising Documents,” Advances in Information Retrieval, pp. 229-247, (2002).
Goutte et al., “Corpus-Based vs. Model-Based Selection of Relevant Features,” Proceedings of CORIA04, pp. 75-88, (2004).
Glover et al., “Using Web Structure for Classifying and Describing Web Pages,” published at WWW2002, 8 pp, (2002).
Glover et al., “Inferring Hierarchical Descriptions,” published at CIKM2002, 8 pp, (2002).
Popescul et al., “Automatic Labeling of Document Clusters,” 16 pp, at http://www.cis.upenn.edu/~popescul/Publications/popescal00labeling.pdf, (2000).


Profile ID: LFUS-PAI-O-4238076
