Method, apparatus, and system for clustering and classification

Data processing: artificial intelligence – Machine learning

Reexamination Certificate

Details

active

08010466

ABSTRACT:
The invention provides a method, apparatus, and system for classifying and clustering electronic data streams, such as email, images, and sound files, for identification, sorting, and efficient storage. The method uses learning machines in combination with hashing schemes to cluster and classify documents. In one embodiment, hash apparatuses and methods taxonomize clusters. In another embodiment, clusters of documents use a geometric hash to contain the documents in a data corpus without the overhead of search and storage.
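
The abstract does not specify which hashing scheme is used, so the following is only an illustrative sketch of the general idea of clustering documents by hash rather than by pairwise search. It assumes a MinHash signature over word shingles with banded bucketing, a well-known near-duplicate technique that is not necessarily the patented method. All function names and parameters (`shingles`, `minhash_signature`, `cluster_by_bands`, `num_hashes`, `rows_per_band`) are hypothetical.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def stable_hash(seed, item):
    """64-bit hash of item salted with seed; stable across runs and machines."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=32):
    """MinHash signature: the minimum hash value under each salted hash function."""
    return tuple(
        min(stable_hash(seed, item) for item in items)
        for seed in range(num_hashes)
    )


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def cluster_by_bands(docs, num_hashes=32, rows_per_band=4):
    """Group document ids whose signatures agree on at least one band.

    docs maps a document id to its text. Each signature is cut into bands;
    documents sharing a band value land in the same bucket, so candidate
    clusters are found without comparing every pair of documents.
    """
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for start in range(0, num_hashes, rows_per_band):
            buckets[(start, sig[start:start + rows_per_band])].add(doc_id)
    return [ids for ids in buckets.values() if len(ids) > 1]


if __name__ == "__main__":
    corpus = {
        "a": "buy cheap watches now online",
        "b": "buy cheap watches now online",
        "c": "meeting notes for the quarterly budget review",
    }
    for cluster in cluster_by_bands(corpus):
        print(sorted(cluster))
```

In this sketch, the bucket key plays the role of a cluster label: a new document is assigned to a cluster by computing its band hashes, without scanning the stored corpus. A learning machine, as mentioned in the abstract, could then be trained on the resulting clusters; that step is not shown here.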

REFERENCES:
patent: 1261167 (1918-04-01), Russell
patent: 5032987 (1991-07-01), Broder et al.
patent: 5909677 (1999-06-01), Broder et al.
patent: 5953503 (1999-09-01), Mitzenmacher et al.
patent: 5974481 (1999-10-01), Broder
patent: 5991808 (1999-11-01), Broder et al.
patent: 6073135 (2000-06-01), Broder et al.
patent: 6088039 (2000-07-01), Broder et al.
patent: 6119124 (2000-09-01), Broder et al.
patent: 6195698 (2001-02-01), Lillibridge et al.
patent: 6269362 (2001-07-01), Broder et al.
patent: 6286006 (2001-09-01), Bharat et al.
patent: 6292762 (2001-09-01), Moll et al.
patent: 6385609 (2002-05-01), Barshefsky et al.
patent: 6389436 (2002-05-01), Chakrabarti et al.
patent: 6438740 (2002-08-01), Broder et al.
patent: 6445834 (2002-09-01), Rising, III
patent: 6487555 (2002-11-01), Bharat et al.
patent: 6560600 (2003-05-01), Broder
patent: 6658423 (2003-12-01), Pugh et al.
patent: 6665837 (2003-12-01), Dean et al.
patent: 6687416 (2004-02-01), Wang
patent: 6711568 (2004-03-01), Bharat et al.
patent: 6732149 (2004-05-01), Kephart
patent: 7281664 (2007-10-01), Thaeler et al.
patent: 7295966 (2007-11-01), Barklund et al.
patent: 7333966 (2008-02-01), Dozier
patent: 7349386 (2008-03-01), Gou
patent: 7353215 (2008-04-01), Bartlett et al.
patent: 7370314 (2008-05-01), Minami et al.
patent: 7389178 (2008-06-01), Raz et al.
patent: 7406603 (2008-07-01), MacKay et al.
patent: 7451458 (2008-11-01), Tuchow
patent: 7463774 (2008-12-01), Wang et al.
patent: 7464026 (2008-12-01), Calcagno et al.
patent: 7477166 (2009-01-01), McCanne et al.
patent: 7487321 (2009-02-01), Muthiah et al.
patent: 7574409 (2009-08-01), Patinkin
patent: 2004/0049678 (2004-03-01), Walmsley et al.
patent: 2007/0112701 (2007-05-01), Chellapilla et al.
High-Performance XML Parsing and Validation with Permutation Phrase Grammar Parsers, Wei Zhang; van Engelen, R.A.; Web Services, 2008. ICWS '08. IEEE International Conference on Digital Object Identifier: 10.1109/ICWS.2008.101 Publication Year: 2008 , pp. 286-294.
Improving HTML Compression, Skibinski, P.; Data Compression Conference, 2008. DCC 2008 Digital Object Identifier: 10.1109/DCC.2008.74 Publication Year: 2008 , p. 545.
A novel robust text watermarking for word document, Yingli Zhang; Huaiqing Qin; Tao Kong; Image and Signal Processing (CISP), 2010 3rd International Congress on, vol. 1 Digital Object Identifier: 10.1109/CISP.2010.5648007 Publication Year: 2010 , pp. 38-42.
PCT Search Report & Written Opinion for PCT/US05/39718; Jul. 7, 2006; pp. 1-8.
Aery, M., et al.; eMailSift: Adapting Graph Mining Techniques for Email Classification; Technical Report CSE-2004-7; Jul. 2004; Univ. of Texas; Arlington, TX. pp. 1-26.
Metzger, J., et al.; “A Multiagent-Based Peer-to-Peer Network in Java for Distributed Spam Filtering”; German Research Center for Artificial Intelligence; 2003; pp. 616-625.
Airoldi, E. “ScamSlam: An Architecture for Learning the Criminal Relations Behind Scam Spam”; May 2004; Pittsburgh, PA. Carnegie Mellon University; pp. 1-18.
Damiani, E., et al. “An Open Digest-based Technique for Spam . . . ”; Intl. Workshop on Security in Parallel and Distributed Systems; San Francisco, CA; Sep. 2004; pp. 1-6.
Wong, S., et al. “Vector Space Model of Information Retrieva . . . ”; Proceedings of the ACM SIGIR Conf. on Research and Devel. in Info. Retrieval; GB; Pub. 1984; pp. 167-185.
Vempala, S., et al.; “A spectral algorithm for learning mixture models”; Journal of Computer and System Sciences 68; Nov. 21, 2003; pp. 841-860.
Singhal, A., et al.; “Automatic Text Browsing Using Vector Space Model”; Dept. of Computer Science, Cornell Univ.; Ithaca, NY; 1995; pp. 1-7.
Shivakumar, N., et al.; “SCAM: A Copy Detection Mechanism for Digital Documents”; Stanford Univ. Dept. of Computer Science, CA; 1995; pp. 1-9.
Shivakumar, N., et al.; “Finding near-replicas of documents on the web”; Stanford Univ. Dept. of Computer Science, CA; 1998; pp. 1-6.
Shivakumar, N., et al.; “Building a Scalable and Accurate Copy Detection Mechanism”; Dept. of Computer Sciences, Stanford, CA; 1996; pp. 1-9.
Shannon, C.; “A Mathematical Theory of Communication”; The Bell System Technical Journal, vol. 27, Jul., Oct. 1948; pp. 5-83.
Shakhnarovich, G., et al.; “Fast Pose Estimation With Parameter . . . ”; MIT, Artificial Intelligence Laboratory, Cambridge, MA.; Apr. 2003; pp. 1-12.
Sebastiani, F.; “Machine Learning in Automated Text . . . ”; ACM Computing Surveys, vol. 34, No. 1; Mar. 2002; pp. 1-47.
“Testing Search Indexing Using Anchor Text”; SearchTools.com; Search Tools Consulting; 2001-2003. p. 1.
Rubner, Y., et al.; “A Metric for Distributions with Applications . . . ”; Proceedings of 1998 IEEE International Conference in India; pp. 1-8.
Ross, S.; “Does Time Change All?”; Microsoft Research News & Highlights, 2004; pp. 1-2.
Rudin, L., et al.; “Feature-Oriented Image Enhancement with Shock Filters, I”; Dept. of Computer Science, CA Institute of Technology; approx. 1990; pp. 1-49.
Raman, V.; “Locality Preserving Dictionaries: Theory & Application to Clustering in Databases”; CS Division, UC Berkeley, CA.; 1999; pp. 337-345.
Philips, L.; “The Double Metaphone Search Algorithm”; C/C++ Users Journal; Jun. 2000; pp. 1-4.
Philips, L.; “Hanging on the Metaphone”; Computer Language; Dec. 1990; pp. 2-6.
Perona, P., et al.; “Scale-Space and Edge Detection Using Anisotropic Diffusion”; IEEE Transactions on Pattern Analysis & Machine Intel., vol. 12, No. 7, Jul. 1990; pp. 629-639.
Osher, S., et al.; “Fronts Propagating with Curvature-Dependent . . . ”; Journal of Computational Physics 79; 1988; pp. 12-49.
Mumford, D., et al.; “Optimal Approximations by Piecewise Smooth . . . ”; Communications on Pure and Applied Math . . . , vol. XLII; 1989; pp. 577-685.
Meyer, F., et al.; “Brushlets: A Tool for Directional Image Analysis . . . ”; Applied and Computational . . . ; 1997; No. HA970208; pp. 147-187.
Manber, U.; “Finding Similar Files in a Large File System”; Dept. of Computer Science; Univ. of AZ; Tucson, AZ; Oct. 1993; pp. 1-11.
Manasse, M. “Finding similar things quickly in large collections”; http://research.microsoft.com/en-us/projects/pageturner/similarity.aspx; pp. 1-3.
Kramer, H., et al.; “Iterations of a Non-Linear Transformation . . . ”; Pergamon Press; 1975; vol. 7; pp. 53-58.
Kovasznay, L., et al.; “Image Processing”; Proceedings of the IRE; Jan. 31, 1955; pp. 560-570.
Kokar, M.; “On Similarity Methods in Machine . . . ”; Dept. of Electrical & Computer . . . Northeastern Univ. Boston, MA; 2008; pp. 1-11.
Keller, M., et al.; “Theme Topic Mixture Model: A Graphical Model . . . ”; Switzerland; 2004; pp. 1-8.
Johnson, W., et al.; “Extensions of Lipschitz Mappings . . . ”; Contemporary Mathematics, vol. 26; 1984; pp. 189-206.
Joachims, T.; “Text Categorization with Support Vector Machines . . . ”; Computer Science Dept., Univ. of Dortmund; Nov. 1997; pp. 1-18.
Do, M., et al.; “The Contourlet Transform: An Efficient . . . ”; IEEE Transactions on Image Processing; Dec. 27, 2004; pp. 1-30.
Gionis, A., et al.; “Similarity Search in High Dimensions . . . ”; Proceedings of the 25th VLDB Conference, Scotlan
