System and method for computing a measure of similarity...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

Reexamination Certificate

active

07493322

ABSTRACT:
A measure of similarity between a first document and a second document is computed. In computing the measure of similarity, a first list of rated keywords extracted from the first document and a second list of rated keywords extracted from the second document are received. The two lists are used to determine whether the first document forms part of the second document, using a first computed percentage that indicates what percentage of keyword ratings in the first list also exist in the second list. When the first percentage indicates that the first document is included in the second document, a second percentage is computed that indicates what percentage of keyword ratings in the first list, together with a set of their neighboring keyword ratings, also exist in the second list. The first percentage is used to specify the measure of similarity when the second percentage is greater than the first percentage.
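
The two percentages described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how keywords are rated, how "included" is decided, or how many neighbors are considered. The `(keyword, rating)` pair representation, the `inclusion_threshold` of 90 percent, the `radius` of one neighbor on each side, and all function names are assumptions made for illustration. The sketch covers only the two percentages and the inclusion gate, not the final selection of the similarity measure.

```python
from typing import List, Optional, Sequence, Set, Tuple

# Assumption: a "rated keyword" is modeled as a (keyword, rating) pair.
RatedKeyword = Tuple[str, int]


def first_percentage(first: Sequence[RatedKeyword],
                     second: Sequence[RatedKeyword]) -> float:
    """Percentage of entries in `first` that also exist in `second`."""
    if not first:
        return 0.0
    second_set = set(second)
    hits = sum(1 for item in first if item in second_set)
    return 100.0 * hits / len(first)


def _contiguous_runs(items: Sequence[RatedKeyword],
                     max_len: int) -> Set[Tuple[RatedKeyword, ...]]:
    """All contiguous runs of `items` with length 1..max_len."""
    return {
        tuple(items[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(items) - n + 1)
    }


def second_percentage(first: Sequence[RatedKeyword],
                      second: Sequence[RatedKeyword],
                      radius: int = 1) -> float:
    """Percentage of entries in `first` whose window (the entry plus up to
    `radius` neighbors on each side) appears contiguously in `second`."""
    if not first:
        return 0.0
    runs = _contiguous_runs(second, 2 * radius + 1)
    hits = 0
    for i in range(len(first)):
        window = tuple(first[max(0, i - radius): i + radius + 1])
        if window in runs:
            hits += 1
    return 100.0 * hits / len(first)


def keyword_overlap(first: Sequence[RatedKeyword],
                    second: Sequence[RatedKeyword],
                    inclusion_threshold: float = 90.0,
                    radius: int = 1) -> Tuple[float, Optional[float]]:
    """Return (first percentage, second percentage or None).

    The second percentage is computed only when the first percentage
    reaches the (assumed) inclusion threshold.
    """
    p1 = first_percentage(first, second)
    if p1 < inclusion_threshold:
        return p1, None
    return p1, second_percentage(first, second, radius)


# Hypothetical data: the first list appears intact inside the second list.
part: List[RatedKeyword] = [("patent", 3), ("keyword", 2), ("rating", 1)]
whole: List[RatedKeyword] = [("intro", 5)] + part + [("end", 4)]
shuffled: List[RatedKeyword] = [("rating", 1), ("intro", 5),
                                ("keyword", 2), ("patent", 3)]

print(keyword_overlap(part, whole))
print(keyword_overlap(part, shuffled))
```

With these sample lists, every entry of `part` exists in both `whole` and `shuffled`, so the first percentage is the same in both cases. Only `whole` preserves the neighbor order, which is the distinction the second percentage is meant to capture.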

Profile ID: LFUS-PAI-O-4082340
