Method and system for clustering identified forms

Data processing: database and file management or data structures – Database and file access – Post processing of search results

Reexamination Certificate

Details

C707S723000, C707S728000, C707S736000, C707S737000, C707S758000

active

07996390

ABSTRACT:
A method is provided for organizing a plurality of documents that include forms. An initial set of clusters is defined for the plurality of documents. The initial set of clusters is reclustered based on similarity values calculated in multiple feature spaces. For example, a first feature space may be associated with a content of a document while a second feature space may be associated with a content of a form associated with the document. Each cluster has an associated centroid vector in each feature space that is used to represent the cluster. The similarity between the document and each cluster is calculated in both feature spaces. Each document is assigned to the cluster whose centroid is most similar. The cluster centroids may be recalculated and the process repeated until the cluster assignments become stable.
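
The reclustering loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the function names (`cosine`, `centroid`, `recluster`), the use of cosine similarity, and the weighted sum that combines the two feature spaces (`w_page`, `w_form`) are all assumptions, since the abstract does not say how the two similarity values are combined.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors (assumed measure).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    # Component-wise mean of a non-empty list of vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def recluster(page_vecs, form_vecs, assignments,
              w_page=0.5, w_form=0.5, max_iter=100):
    # page_vecs[i]: document i in the page-content feature space.
    # form_vecs[i]: document i in the form-content feature space.
    # assignments[i]: initial cluster label of document i.
    assignments = list(assignments)
    for _ in range(max_iter):
        clusters = sorted(set(assignments))
        # One centroid per cluster in each feature space.
        cents = {}
        for c in clusters:
            members = [i for i, a in enumerate(assignments) if a == c]
            cents[c] = (centroid([page_vecs[i] for i in members]),
                        centroid([form_vecs[i] for i in members]))
        # Assign each document to the cluster with the highest combined
        # similarity (the weighted sum is an assumption of this sketch).
        new = []
        for pv, fv in zip(page_vecs, form_vecs):
            best = max(clusters, key=lambda c:
                       w_page * cosine(pv, cents[c][0]) +
                       w_form * cosine(fv, cents[c][1]))
            new.append(best)
        if new == assignments:  # assignments are stable, so stop
            break
        assignments = new
    return assignments

# Hypothetical toy data: four documents, two dimensions per feature space.
pages = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
forms = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
result = recluster(pages, forms, [0, 0, 0, 1])
```

With this toy data, the third document starts in cluster 0 but moves to cluster 1 on the first pass, after which the assignments no longer change and `result` is `[0, 0, 1, 1]`.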

REFERENCES:
patent: 6754873 (2004-06-01), Law et al.
patent: 7080073 (2006-07-01), Jiang et al.
patent: 7213198 (2007-05-01), Harik
patent: 2002/0169770 (2002-11-01), Kim et al.
patent: 2003/0220912 (2003-11-01), Fain et al.
patent: 2005/0097436 (2005-05-01), Kawatani
patent: 2005/0203924 (2005-09-01), Rosenberg
patent: 2006/0129446 (2006-06-01), Ruhl et al.
patent: 2006/0200478 (2006-09-01), Pasztor
patent: 2006/0230033 (2006-10-01), Halevy et al.
patent: 2007/0100812 (2007-05-01), Simske et al.
patent: 2007/0100862 (2007-05-01), Reddy et al.
patent: 2007/0112898 (2007-05-01), Evans et al.
patent: 2008/0154942 (2008-06-01), Tsai et al.
Huang et al., “Multi-type Features Based Web Document Clustering”, Springer Berlin / Heidelberg, vol. 3306/2004, pp. 253-265, 2004. Download: http://www.springerlink.com/content/te7qn81416wqy7g6/.
Barbosa et al., “Searching for Hidden-Web Database”, Eighth International Workshop on the Web and Database, Jun. 16-17, 2005, pp. 1-6. Download: http://webdb2005.uhasselt.be/papers/1-1.pdf.
De Lucia et al., “Using a Competitive Clustering Algorithm to Comprehend Web Applications”, IEEE, 2006, pp. 1-8. Download: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4027204.
Barbosa et al., “An Adaptive Crawler for Locating Hidden-Web Entry Points,” in Proceedings of the 16th International World Wide Web Conference (WWW), Banff, Alberta, Canada, May 8-12, 2007, 10 pages.
Barbosa et al., “Combining Classifiers to Identify Online Databases,” in Proceedings of the 16th International World Wide Web Conference (WWW), Banff, Alberta, Canada, May 8-12, 2007, 9 pages.
Barbosa et al., “Organizing Hidden-Web Databases by Clustering Visible Web Documents,” in Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE), Istanbul, Apr. 14, 2007, 10 pages.
Barbosa et al., “Automatically Constructing a Directory of Molecular Biology Databases,” in Proceedings of the International Workshop on Data Integration in the Life Sciences 2007 (DILS), Philadelphia, Jun. 27, 2007, 10 pages.
Barbosa et al., “Searching for Hidden-Web Databases,” in Proceedings of the 8th ACM SIGMOD International Workshop on Web and Databases (WebDB), Baltimore, Maryland, Jun. 16-17, 2005, 6 pages.
Barbosa et al., “Siphoning Hidden-Web Data through Keyword-Based Interfaces,” in Proceedings Of the 19th Brazilian Symposium on Databases (SBBD), Brasilia, Oct. 18, 2004, 13 pages.
Barbosa et al., “Automatically Constructing Collections of Online Databases,” The ACM 15th Conference on Information and Knowledge Management, Arlington, Virginia, pp. 796-797, Nov. 5, 2006.
