Method for adapting a K-means text clustering to emerging data

Filed: 2000-09-26
Issued: 2008-09-30
Examiner: Hong, Stephen (Department: 2178)
Class: Data processing: presentation processing of document, operator i
Subclass: Presentation processing of document
Subclass: Layout

Document type: Reexamination Certificate
Status: active
Patent number: 07430717

ABSTRACT:
A method and structure for clustering documents in datasets, which includes clustering first documents in a first dataset to produce first document classes, creating centroid seeds based on the first document classes, and clustering second documents in a second dataset using the centroid seeds, wherein the first dataset and the second dataset are related. The clustering of the first documents forms a first dictionary of the most common words in the first dataset, generates a first vector space model by counting, for each word in the first dictionary, the number of the first documents in which the word occurs, and clusters the first documents based on the first vector space model. A second vector space model is then generated by counting, for each word in the first dictionary, the number of the second documents in which the word occurs. Creation of the centroid seeds includes classifying the second vector space model using the first document classes to produce a classified second vector space model and determining the mean of the vectors in each class of the classified second vector space model; these means constitute the centroid seeds.
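
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the sample documents, whitespace tokenization, per-document term counts as vector entries, squared Euclidean distance, and seeding the first clustering with the first k documents are all assumptions made for illustration.

```python
from collections import Counter


def build_dictionary(docs, size):
    """Pick the `size` words that occur in the most documents (ties broken alphabetically)."""
    doc_freq = Counter(w for d in docs for w in set(d.lower().split()))
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]


def vectorize(docs, dictionary):
    """One vector per document: how often each dictionary word occurs in it (assumed weighting)."""
    rows = []
    for d in docs:
        counts = Counter(d.lower().split())
        rows.append([float(counts[w]) for w in dictionary])
    return rows


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the closest centroid (squared Euclidean distance, an assumption)."""
    return min(range(len(centroids)), key=lambda k: sq_dist(vec, centroids[k]))


def class_means(vectors, labels, fallback):
    """Mean vector of each class; an empty class keeps its fallback centroid."""
    means = []
    for k, old in enumerate(fallback):
        members = [v for v, lab in zip(vectors, labels) if lab == k]
        if members:
            means.append([sum(col) / len(members) for col in zip(*members)])
        else:
            means.append(list(old))
    return means


def kmeans(vectors, seeds, max_iter=100):
    """Plain K-means that starts from the given seed centroids."""
    centroids = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        centroids = class_means(vectors, labels, centroids)
    return labels, centroids


def adapt_clustering(first_docs, second_docs, k, dict_size=50):
    # Steps 1-2: first dictionary, first vector space model, first clustering.
    dictionary = build_dictionary(first_docs, dict_size)
    first_vsm = vectorize(first_docs, dictionary)
    # Naive seeding with the first k documents (assumption, not from the source).
    first_labels, first_centroids = kmeans(first_vsm, first_vsm[:k])
    # Step 3: second vector space model over the FIRST dictionary.
    second_vsm = vectorize(second_docs, dictionary)
    # Step 4: classify the second vectors with the first classes, then take
    # the per-class means as centroid seeds.
    classified = [nearest(v, first_centroids) for v in second_vsm]
    seeds = class_means(second_vsm, classified, first_centroids)
    # Step 5: cluster the second dataset starting from those seeds.
    second_labels, _ = kmeans(second_vsm, seeds)
    return first_labels, second_labels


# Hypothetical sample data: two related batches of help-desk style texts.
first_docs = [
    "printer jam paper",
    "password reset login",
    "printer paper tray",
    "login password locked",
]
second_docs = ["paper jam again printer", "reset my password", "printer tray empty"]
first_labels, second_labels = adapt_clustering(first_docs, second_docs, k=2)
print(first_labels, second_labels)
```

Because the seeds are computed from the second dataset's own vectors but grouped by the first dataset's classes, the second clustering starts close to the earlier class structure while remaining free to drift toward the newer data.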

REFERENCES:
patent: 5317507 (1994-05-01), Gallant
patent: 5675819 (1997-10-01), Schuetze
patent: 5832182 (1998-11-01), Zhang et al.
patent: 5857179 (1999-01-01), Vaithyanathan et al.
patent: 5864855 (1999-01-01), Ruocco et al.
patent: 5999927 (1999-12-01), Tukey et al.
patent: 6012058 (2000-01-01), Fayyad et al.
patent: 6298174 (2001-10-01), Lantrip et al.
“Computer Oriented Approaches To Pattern Recognition,” by William S. Meisel, Academic Press (1972), pp. 144-146.
Cutting et al., “Scatter/Gather: A Cluster-Based Approach to Browsing Large Document Collections”, Proc. of the Annual International ACM SIGIR Conference, vol. 15, No. 21, 1992, pp. 318-329.
Al-Daoud et al., “New Methods for the Initialisation of Clusters”, Pattern Recognition Letters, Elsevier, Netherlands, vol. 17, No. 5, 1996, pp. 451-454.
Steinbach et al., “A Comparison of Document Clustering Techniques”, Technical Report, 2000, pp. 1-20.
Pena et al., “An Empirical Comparison of four initialization methods for the K-Means Algorithm”, Pattern Recognition Letters, vol. 20, No. 10, 1999, pp. 1027-1040.
Jain et al., “Data Clustering: A Review”, ACM Computing Surveys, vol. 31, No. 3, September, pp. 264-323, 1999.
Marina Meila, “An Experimental Comparison of Several Clustering and Initialization Methods”, Technical Report, 1998, pp. 1-22.
Bollacker et al., “A Scalable Method for Classifier Knowledge Reuse”, International Conference of Houston, TX, vol. 3, 1997, pp. 1474-1478.
Bradley et al., “Refining Initial Points for K-Means Clustering”, International Conference, 1998, pp. 91-99.

Inventor: Spangler William Scott
Law Firm: Gibb & Rahman, LLC
Examiner: Hong Stephen
Corporate Assignee: International Business Machines Corporation
Examiner: Stork Kyle R
