System and method for clustering unstructured documents

Data processing: database and file management or data structures – Database and file access – Preparing data for information retrieval

Reexamination Certificate

Details

C707S750000, C707S777000

active

07809727

ABSTRACT:
A system and method for clustering unstructured documents is provided. Documents having terms with frequencies of occurrence that satisfy upper and lower edge conditions are selected. Concepts are generated for the selected documents. The selected documents are grouped into clusters of the documents. A weight for each of the clusters is evaluated. A similarity value is determined from the frequencies of occurrence for at least one of the terms from the concepts and the cluster weights for each selected document. Each selected document is assigned into one such cluster based on the similarity value of the selected document.
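
The abstract names the steps but not the formulas, so the following is only a minimal sketch of one way the pipeline could be read, not the patented method. Everything specific in it is an assumption made for illustration: the "edge conditions" are taken to be lower and upper bounds on a term's document frequency, each retained term stands in for a "concept", a cluster's "weight" is the summed term counts of its members, and the "similarity value" is cosine similarity against that weight vector. The function names and the `threshold` parameter are hypothetical.

```python
import math
from collections import Counter


def term_counts(text):
    """Term frequencies for one document (lower-cased, whitespace-split)."""
    return Counter(text.lower().split())


def select_documents(docs, lower, upper):
    """Assumed edge conditions: keep terms whose document frequency lies in
    [lower, upper], then keep documents containing at least one such term."""
    counts = [term_counts(d) for d in docs]
    doc_freq = Counter(term for c in counts for term in c)
    kept_terms = {t for t, n in doc_freq.items() if lower <= n <= upper}
    selected = {}
    for doc_id, c in enumerate(counts):
        vec = {t: n for t, n in c.items() if t in kept_terms}
        if vec:  # documents with no retained term are dropped
            selected[doc_id] = vec
    return selected


def cosine(a, b):
    """Cosine similarity between two sparse term -> value mappings."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(selected, threshold=0.3):
    """Greedy single pass: put each document in the most similar cluster,
    or open a new cluster when no similarity reaches the threshold."""
    clusters = []  # each cluster: {"weights": Counter, "members": [doc ids]}
    for doc_id, vec in selected.items():
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c["weights"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None or best_sim < threshold:
            best = {"weights": Counter(), "members": []}
            clusters.append(best)
        best["weights"].update(vec)  # assumed cluster weight: summed counts
        best["members"].append(doc_id)
    return clusters


docs = [
    "apple banana apple",
    "banana apple pie",
    "car engine wheel",
    "engine car road",
    "the",
]
selected = select_documents(docs, lower=2, upper=3)
print([c["members"] for c in cluster(selected)])  # prints [[0, 1], [2, 3]]
```

In this toy run the last document is discarded because its only term occurs in a single document, and the remaining four fall into two clusters. A single greedy pass depends on document order; that is a simplification of this sketch, not something the abstract states.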

REFERENCES:
patent: 5056021 (1991-10-01), Ausborn
patent: 5371673 (1994-12-01), Fan
patent: 5488725 (1996-01-01), Turtle et al.
patent: 5524177 (1996-06-01), Suzuoka
patent: 5675819 (1997-10-01), Schuetze
patent: 5819258 (1998-10-01), Vaithyanathan et al.
patent: 5857179 (1999-01-01), Vaithyanathan et al.
patent: 5864846 (1999-01-01), Voorhees et al.
patent: 5940821 (1999-08-01), Wical
patent: 5987446 (1999-11-01), Corey et al.
patent: 6137545 (2000-10-01), Patel et al.
patent: 6173275 (2001-01-01), Caid et al.
patent: 6349307 (2002-02-01), Chen
patent: 6360227 (2002-03-01), Aggarwal et al.
patent: 6389436 (2002-05-01), Chakrabarti et al.
patent: 6415283 (2002-07-01), Conklin
patent: 6446061 (2002-09-01), Doerre et al.
patent: 6460034 (2002-10-01), Wical
patent: 6484168 (2002-11-01), Pennock et al.
patent: 6510406 (2003-01-01), Marchisio
patent: 6560597 (2003-05-01), Dhillon et al.
patent: 6611825 (2003-08-01), Billheimer et al.
patent: 6629097 (2003-09-01), Keith
patent: 6675159 (2004-01-01), Lin et al.
patent: 6675164 (2004-01-01), Kamath et al.
patent: 6701305 (2004-03-01), Holt et al.
patent: 6711585 (2004-03-01), Copperman et al.
patent: 6757646 (2004-06-01), Marchisio
patent: 6778995 (2004-08-01), Gallivan
patent: 6816175 (2004-11-01), Hamp et al.
patent: 6820081 (2004-11-01), Kawai et al.
patent: 6862710 (2005-03-01), Marchisio
patent: 6978274 (2005-12-01), Gallivan et al.
patent: 7051017 (2006-05-01), Marchisio
patent: 2003/0093395 (2003-05-01), Shetty et al.
patent: 2003/0217047 (2003-11-01), Marchisio
patent: 2005/0010555 (2005-01-01), Gallivan
patent: 2005/0021517 (2005-01-01), Marchisio
patent: 2005/0022106 (2005-01-01), Kawai et al.
Chen An, Chen Ning, Weijia Jia and Sanding Luo, “Fuzzy Concept Graph and Application in Web Document Clustering,” pp. 101-106, IEEE (2001).
Christina Yip Chung, Raymond Lieu, Jinhui Liu, Alpha Luk, Jianchang Mao and Prabhakar Raghavan, “Thematic Mapping - From Unstructured Documents to Taxonomies,” pp. 608-610, ACM (2002).
Hiroyuki Kawano, “Overview of Mondou Web Search Engine Using Text Mining and Information Visualizing Technologies,” pp. 234-241, IEEE (2001).
James Osborn and Leon Sterling, “Justice: A Judicial Search Tool Using Intelligent Concept Extraction,” pp. 173-181, ACM (1999).
D. Sullivan, “Document Warehousing and Text Mining: Techniques for Improving Business Operations, Marketing and Sales,” Ch. 1-3, John Wiley & Sons, New York, NY (2001).
