Method and platform for term extraction from large...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

: 2008-07-01
: 2008-07-01
: Mofiz, Apu (Department: 2161)

: C707S793000, C707S793000
: Reexamination Certificate
: active
: 07395256
ABSTRACT:
A method and platform for statistically extracting terms from large sets of documents are described. An importance vector is determined for each document in the set of documents based on importance values for words in each document. A binary document classification tree is formed by clustering the documents into clusters of similar documents based on the importance vector for each document. An infrastructure is built for the set of documents by generalizing the binary document classification tree. The document clusters are determined by dividing the generalized tree of the infrastructure into two parts and cutting away the upper part. Statistically significant individual key words are extracted from the clusters of similar documents. The key words are treated as seeds, and terms are extracted by starting from the seeds and extending into their left or right contexts.
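
The pipeline in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how importance values are computed, how the tree is generalized, or how statistical significance is tested. The sketch assumes TF-IDF as the importance value, greedy bottom-up merging by cosine similarity as the binary tree construction, stopping at k clusters as a stand-in for cutting away the upper part of the tree, a simple frequency ratio as the key word score, and a minimum co-occurrence count as the stopping rule for seed extension. The tree generalization step is omitted. All function names and parameters are hypothetical.

```python
import math
from collections import Counter

def importance_vectors(docs):
    """One sparse vector per tokenized document; TF-IDF is an assumed importance value."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    return [{w: c / len(d) * math.log(n / df[w]) for w, c in Counter(d).items()}
            for d in docs]

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(vectors, k):
    """Merge the two most similar clusters until k remain.
    Each merge corresponds to one internal node of a binary tree; stopping
    at k clusters stands in for cutting away the upper part of that tree."""
    clusters = [[i] for i in range(len(vectors))]
    sums = [dict(v) for v in vectors]  # summed member vectors act as centroids
    while len(clusters) > k:
        _, i, j = max((cosine(sums[i], sums[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        for w, v in sums[j].items():
            sums[i][w] = sums[i].get(w, 0.0) + v
        clusters[i] += clusters[j]
        del clusters[j], sums[j]
    return clusters

def keywords(docs, members, top=3):
    """Rank words by in-cluster count squared over corpus count (an assumed score)."""
    inside = Counter(w for i in members for w in docs[i])
    total = Counter(w for d in docs for w in d)
    return sorted(inside, key=lambda w: (-inside[w] ** 2 / total[w], w))[:top]

def extend_seed(docs, seed, min_count=2):
    """Grow a term from a seed word to the left and right while the most
    frequent neighbouring word co-occurs at least min_count times."""
    term, grew = [seed], True
    while grew:
        grew = False
        for side in ("left", "right"):
            n, neighbours = len(term), Counter()
            for d in docs:
                for i in range(len(d) - n + 1):
                    if d[i:i + n] != term:
                        continue
                    if side == "left" and i > 0:
                        neighbours[d[i - 1]] += 1
                    if side == "right" and i + n < len(d):
                        neighbours[d[i + n]] += 1
            if neighbours:
                word, count = neighbours.most_common(1)[0]
                if count >= min_count:
                    term = [word] + term if side == "left" else term + [word]
                    grew = True
    return term

# Toy corpus, already tokenized.
docs = [
    "the support vector machine is trained".split(),
    "a support vector machine classifies text".split(),
    "the river bank floods in spring".split(),
    "floods cover the river bank".split(),
]
groups = cluster(importance_vectors(docs), k=2)
for members in groups:
    seeds = keywords(docs, members)
    print(members, seeds, [" ".join(extend_seed(docs, s)) for s in seeds])
```

On this toy corpus the two machine-learning documents and the two river documents end up in separate clusters, and the seed "vector" grows into the three-word term "support vector machine". A real system would replace the raw count threshold with a proper significance test.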

REFERENCES:
patent: 5287278 (1994-02-01), Rau
patent: 5423032 (1995-06-01), Byrd et al.
patent: 5463773 (1995-10-01), Sakakibara et al.
patent: 5642518 (1997-06-01), Kiyama et al.
patent: 5799268 (1998-08-01), Boguraev
patent: 5926811 (1999-07-01), Miller et al.
patent: 6137911 (2000-10-01), Zhilyaev
patent: 6446061 (2002-09-01), Doerre et al.
patent: 2004/0117448 (2004-06-01), Newman et al.
patent: 1 304 627 (2003-04-01), None
International Preliminary Report on Patentability (PCT Article 36 and Rule 70) May 16, 2005.
T. Strzalkowski, “Natural Language Information Retrieval”, Information Processing and Management, vol. 31(3), pp. 397-417, 1995.
K. Church et al., “Word Association Norms, Mutual Information and Lexicography”, In proceedings of ACL, pp. 76-83, 1989.
T. Dunning, “Accurate Methods for the Statistics of Surprise and Coincidence”, Computational Linguistics, vol. 19(1), pp. 61-74, 1993.
L.F. Chien et al., “Internet-based Chinese Text Corpus Classification and Domain-Specific Keyterm Extraction”, Proceedings of Workshop on Computational Technology, pp. 71-75, 1998.
H. Schutze, “The Hypertext Concordance: A Better Back-of-the-Book Index”, Proceedings of Workshop on Computational Technology, pp. 101-104, 1998.
C. Jacquemin, “FASTR: A Unification-Based Front End to Automatic Indexing”, Proceedings of RIAO, pp. 34-47, 1994.
D. Bourigaut, “An Endogeneous Corpus-Based Method for Structural Noun Phrase Disambiguation”, Proceedings of EACL, pp. 187-213, 1993.
G. Grefenstette, “Explorations in Automatic Thesaurus Discovery”, Kluwer Academic Press, 1994, 35 pages.
J. Pustejovsky, “Lexical Semantic Techniques for Corpus Analysis”, Association for Computational Linguistics, vol. 19(2), pp. 331-358, 1993.
K. Frantzi et al., “Automatic recognition of multi-word terms: the C-value/NC-value method”, Journal of Digital Library, vol. 3, pp. 115-130, 2000.

Affiliated with

Ji Donghong (Inventor)
Nie Yu (Inventor)
Yang Lingpeng (Inventor)

Also associated with

Agency for Science Technology and Research (Corporate Assignee)
Mofiz Apu (Examiner)
Padmanabhan Kavita (Examiner)
Sughrue & Mion, PLLC (Law Firm)
