System and method for performing efficient document scoring...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000

active

07610313

ABSTRACT:
A system and method for providing efficient document scoring of concepts within a document set is described. A frequency of occurrence of at least one concept within a document retrieved from the document set is determined. A concept weight is analyzed reflecting a specificity of meaning for the at least one concept within the document. A structural weight is analyzed reflecting a degree of significance based on structural location within the document for the at least one concept. A corpus weight is analyzed inversely weighing a reference count of occurrences for the at least one concept within the document. A score associated with the at least one concept is evaluated as a function of the frequency, concept weight, structural weight, and corpus weight.
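The abstract states only that the score is "a function of" the frequency, concept weight, structural weight, and corpus weight; it does not give the function itself. The following is a minimal sketch under stated assumptions, not the patented method: it assumes the four factors are combined as a plain product and that the corpus weight is a logarithmically damped inverse of the reference count. The names `ConceptOccurrence`, `corpus_weight`, and `concept_score`, and the weight scales, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ConceptOccurrence:
    """Inputs named in the abstract for one concept in one document."""
    frequency: int            # occurrences of the concept in the document
    concept_weight: float     # specificity of meaning (hypothetical scale)
    structural_weight: float  # significance of structural location (hypothetical scale)
    reference_count: int      # reference count of occurrences for the concept


def corpus_weight(reference_count: int) -> float:
    """Assumed inverse weighting: the weight shrinks as the reference count grows."""
    return 1.0 / (1.0 + math.log1p(reference_count))


def concept_score(occ: ConceptOccurrence) -> float:
    """Assumed combination: a plain product of the four factors."""
    return (
        occ.frequency
        * occ.concept_weight
        * occ.structural_weight
        * corpus_weight(occ.reference_count)
    )


if __name__ == "__main__":
    rare = ConceptOccurrence(frequency=3, concept_weight=0.5,
                             structural_weight=2.0, reference_count=0)
    common = ConceptOccurrence(frequency=3, concept_weight=0.5,
                               structural_weight=2.0, reference_count=1000)
    # With identical in-document factors, the widely referenced concept scores lower.
    print(concept_score(rare), concept_score(common))
```

Under these assumptions, two concepts with identical frequency, concept weight, and structural weight differ only through the corpus weight, so the concept with the larger reference count receives the lower score.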

REFERENCES:
patent: 5056021 (1991-10-01), Ausborn
patent: 5477451 (1995-12-01), Brown et al.
patent: 5488725 (1996-01-01), Turtle et al.
patent: 5794236 (1998-08-01), Mehrle
patent: 5799276 (1998-08-01), Komissarchik et al.
patent: 5867799 (1999-02-01), Lang et al.
patent: 6026397 (2000-02-01), Sheppard
patent: 6137911 (2000-10-01), Zhilyaev
patent: 6148102 (2000-11-01), Stolin
patent: 6173275 (2001-01-01), Caid et al.
patent: 6446061 (2002-09-01), Doerre et al.
patent: 6470307 (2002-10-01), Turney
patent: 6502081 (2002-12-01), Wiltshire et al.
patent: 6510406 (2003-01-01), Marchisio
patent: 6519580 (2003-02-01), Johnson et al.
patent: 6560597 (2003-05-01), Dhillon et al.
patent: 6571225 (2003-05-01), Oles et al.
patent: 6598054 (2003-07-01), Schuetze et al.
patent: 6606625 (2003-08-01), Muslea et al.
patent: 6651057 (2003-11-01), Jin et al.
patent: 6675159 (2004-01-01), Lin et al.
patent: 6697998 (2004-02-01), Damerau et al.
patent: 6701305 (2004-03-01), Holt et al.
patent: 6711585 (2004-03-01), Copperman et al.
patent: 6747646 (2004-06-01), Gueziec et al.
patent: 6841321 (2005-01-01), Matsumoto et al.
patent: 6922699 (2005-07-01), Schuetze et al.
patent: 6970881 (2005-11-01), Mohan et al.
patent: 6990238 (2006-01-01), Saffer et al.
patent: 7155668 (2006-12-01), Holland et al.
patent: 7194483 (2007-03-01), Mohan et al.
patent: 7197497 (2007-03-01), Cossock
patent: 7233940 (2007-06-01), Bamberger et al.
patent: 7251637 (2007-07-01), Caid et al.
patent: 7277919 (2007-10-01), Donoho et al.
patent: 7379913 (2008-05-01), Steele et al.
patent: 7383282 (2008-06-01), Whitehead et al.
patent: 7490092 (2009-02-01), Sibley et al.
patent: 2001/0047351 (2001-11-01), Abe
patent: 2002/0078090 (2002-06-01), Hwang et al.
patent: 2002/0082953 (2002-06-01), Batham et al.
patent: 2002/0184193 (2002-12-01), Cohen
patent: 2003/0074368 (2003-04-01), Schuetze et al.
patent: 2003/0110181 (2003-06-01), Schuetze et al.
patent: 2003/0225750 (2003-12-01), Farahat et al.
patent: 2004/0024755 (2004-02-01), Rickard
patent: 2004/0034633 (2004-02-01), Rickard
patent: 2004/0059736 (2004-03-01), Willse et al.
patent: 2004/0205578 (2004-10-01), Wolff et al.
patent: 2004/0215606 (2004-10-01), Cossock
patent: 2005/0004949 (2005-01-01), Trepess et al.
patent: 2005/0216443 (2005-09-01), Morton et al.
patent: 1 024 437 (2000-08-01), None
patent: WO 03/052627 (2003-06-01), None
patent: WO 03/060766 (2003-07-01), None
D. Sullivan, “Document Warehousing And Text Mining, Techniques For Improving Business Operations, Marketing, And Sales,” Chs. 1-3, Wiley Computer Publishing (2001).
Linhui, Jiang, “K-Mean Algorithm: Iterative Partitioning Clustering Algorithm,” http://www.cs.regina.ca/~linhui/K_mean_algorithm.html, (2001) Computer Science Department, University of Regina, Saskatchewan, Canada.
Kanungo et al., “The Analysis Of A Simple K-Means Clustering Algorithm,” pp. 100-109, PROC 16th annual symposium of computational geometry (May 2000).
Pelleg et al., “Accelerating Exact K-Means Algorithms With Geometric Reasoning,” pp. 277-281, CONF on Knowledge Discovery in Data, PROC fifth ACM SIGKDD (1999).
Jain et al., “Data Clustering: A Review,” vol. 31, No. 3, ACM Computing surveys, (Sep. 1999).
Christina Yip Chung et al., “Thematic Mapping—From Unstructured Documents To Taxonomies,” CIKM'02, Nov. 4-9, 2002, pp. 608-610, ACM, McLean, Virginia, USA.
Hiroyuki Kawano, “Overview of Mondou Web Search Engine Using Text Mining And Information Visualizing Technologies,” IEEE, 2001, pp. 234-241.
James Osborn et al., “Justice: A Judicial Search Tool Using Intelligent Concept Extraction,” ICAIL-99, 1999, pp. 173-181, ACM.
Chen An et al., “Fuzzy Concept Graph And Application In Web Document Clustering,” 2001, pp. 101-106, IEEE.
F. Can, “Incremental Clustering For Dynamic Information Processing,” ACM Transactions On Information Systems, vol. 11, No. 1, pp. 143-164 (Apr. 1993).
Wang et al., “Learning text classifier using the domain concept hierarchy,” Communications, Circuits and Systems and West Sino Expositions, IEEE 2002 International Conference on Jun. 29-Jul. 1, 2002, Piscataway, NJ, USA, IEEE, vol. 2, pp. 1230-1234.
Vance Faber: “Clustering and the Continuous K-Means Algorithm,” Los Alamos Science, The Laboratory, Los Alamos, NM, US, No. 22, Jan. 1, 1994, pp. 138-144.
