Reexamination Certificate
2005-09-20
2005-09-20
Coby, Frantz (Department: 2161)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
active
06947933
ABSTRACT:
A technique for determining when documents stored in digital format in a data processing system are similar. A method compares a sparse representation of two or more documents by breaking the documents into “chunks” of data of predefined sizes. Selected subsets of the chunks are determined as being representative of data in the documents and coefficients are developed to represent such chunks. Coefficients are then combined into coefficient clusters containing coefficients that are similar according to a predetermined similarity metric. The degree of similarity between documents is then evaluated by counting clusters into which chunks of similar documents fall.
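The steps named in the abstract (chunking, selecting representative chunks, deriving coefficients, clustering them, and counting shared clusters) can be illustrated with a minimal sketch. The abstract does not say how chunks are selected, what the coefficients are, or which similarity metric is used, so everything concrete here is an assumption made for illustration: the chunk size, the selection rule (most distinct byte values), the coefficients (a coarse byte-value histogram), the metric (Euclidean distance with a fixed threshold), the greedy clustering, and the final score (shared clusters divided by all clusters touched). All function names are hypothetical.

```python
from collections import Counter
from math import sqrt

CHUNK_SIZE = 16    # assumed chunk size in bytes
N_BUCKETS = 8      # assumed length of each coefficient vector
THRESHOLD = 0.25   # assumed maximum distance to join an existing cluster


def chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive fixed-size chunks, dropping a short tail."""
    return [data[i:i + size] for i in range(0, len(data) - size + 1, size)]


def select(chs: list[bytes], keep: int = 4) -> list[bytes]:
    """Hypothetical selection rule: keep the chunks with the most distinct bytes."""
    return sorted(chs, key=lambda c: (-len(set(c)), c))[:keep]


def coefficients(chunk: bytes) -> list[float]:
    """Assumed coefficients: normalized histogram over N_BUCKETS byte-value ranges."""
    counts = Counter(b * N_BUCKETS // 256 for b in chunk)
    return [counts[i] / len(chunk) for i in range(N_BUCKETS)]


def distance(u: list[float], v: list[float]) -> float:
    """Euclidean distance, standing in for the unspecified similarity metric."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def assign_clusters(vectors: list[list[float]],
                    threshold: float = THRESHOLD) -> list[int]:
    """Greedy clustering: join the first cluster whose leader is close enough."""
    leaders: list[list[float]] = []
    ids: list[int] = []
    for v in vectors:
        for cid, leader in enumerate(leaders):
            if distance(v, leader) <= threshold:
                ids.append(cid)
                break
        else:
            # No existing cluster is close enough, so v starts a new one.
            leaders.append(v)
            ids.append(len(leaders) - 1)
    return ids


def similarity(doc_a: bytes, doc_b: bytes) -> float:
    """Fraction of clusters that contain chunks from both documents."""
    vecs_a = [coefficients(c) for c in select(chunks(doc_a))]
    vecs_b = [coefficients(c) for c in select(chunks(doc_b))]
    ids = assign_clusters(vecs_a + vecs_b)
    clusters_a = set(ids[:len(vecs_a)])
    clusters_b = set(ids[len(vecs_a):])
    union = clusters_a | clusters_b
    return len(clusters_a & clusters_b) / len(union) if union else 0.0
```

Under these assumptions, `similarity(doc, doc)` is 1.0 for any document of at least one full chunk, and documents whose selected chunks land in disjoint clusters score 0.0. A practical system would likely use richer coefficients and a clustering method that does not depend on input order.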
REFERENCES:
patent: 5926812 (1999-07-01), Hilsenrath et al.
patent: 5940830 (1999-08-01), Ochitani
patent: 6119124 (2000-09-01), Broder et al.
patent: 6298174 (2001-10-01), Lantrip et al.
patent: 6311176 (2001-10-01), Steiner
patent: 6418431 (2002-07-01), Mahajan et al.
patent: 6633882 (2003-10-01), Fayyad et al.
patent: 6778995 (2004-08-01), Gallivan
patent: 6804670 (2004-10-01), Kreulen et al.
patent: 2002/0178271 (2002-11-01), Graham et al.
Dittenbach, M., et al., “Uncovering hierarchical structure in data using the growing hierarchical self-organizing map”, Neurocomputing, 48:199-216 (2002).
Coby Frantz
Filipczyk Marc
Hamilton Brook Smith & Reynolds P.C.
Verdasys, Inc.
Identifying similarities within large collections of...
Profile ID: LFUS-PAI-O-3401007