System and method for context-dependent probabilistic...

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C707S793000

active

06925433

ABSTRACT:
A computer-implemented system and method are disclosed for retrieving documents using context-dependent probabilistic modeling of words and documents. The present invention uses multiple overlapping vectors to represent each document. One vector is centered on each word in the document and consists of that word's local environment, i.e., the words that occur close to it. The vectors are used to build probability models that are used for predictions. In one aspect of the invention, a method of context-dependent probabilistic modeling of documents is provided wherein the text of one or more documents is input into the system, each document including human-readable words. Context windows are then created around each word in each document. A statistical evaluation of the characteristics of each window is then generated, where the results of the statistical evaluation are not a function of the order of appearance of the words within each window. The statistical evaluation includes counting the occurrences of particular words and particular documents and tabulating the totals of the counts. The results of the statistical evaluation for each window are then combined. The combined results are then used for retrieving a document, for extracting features from a document, or for finding a word within a document based on its resulting statistics.
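The steps named in the abstract (windows around each word, order-independent counts, tabulated totals, combined results used for retrieval) can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not specify the probability model, so the naive Bayes style scoring with add-one smoothing is an assumption, and the names `context_windows`, `build_counts`, `rank`, and the `radius` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_windows(words, radius=2):
    """Yield (center_word, neighbours) for every word position.

    The neighbours are stored as a Counter (a bag of words), so the
    order in which they appear inside the window is discarded.
    """
    for i, center in enumerate(words):
        left = words[max(0, i - radius):i]
        right = words[i + 1:i + 1 + radius]
        yield center, Counter(left + right)


def build_counts(docs, radius=2):
    """Tabulate per-document word counts over all overlapping windows."""
    counts = defaultdict(Counter)  # counts[doc_id][word]
    totals = Counter()             # totals[doc_id] = sum of counts[doc_id]
    vocab = set()
    for doc_id, text in docs.items():
        words = text.lower().split()
        vocab.update(words)
        for center, bag in context_windows(words, radius):
            counts[doc_id][center] += 1
            counts[doc_id].update(bag)
            totals[doc_id] += 1 + sum(bag.values())
    return counts, totals, vocab


def rank(query, counts, totals, vocab):
    """Rank documents for a query, treating the query as one window.

    Assumed model (not taken from the patent): log P(doc) plus the sum
    of log P(word | doc), with add-one smoothing for unseen words.
    """
    query_words = query.lower().split()
    grand_total = sum(totals.values())
    scores = {}
    for doc_id in counts:
        score = math.log(totals[doc_id] / grand_total)
        for word in query_words:
            numerator = counts[doc_id][word] + 1
            denominator = totals[doc_id] + len(vocab)
            score += math.log(numerator / denominator)
        scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)
```

A call such as `rank("cat mat", *build_counts(docs))` returns document identifiers ordered from best to worst match. Because each window is a bag of words, reversing a document's word order leaves every window's statistics unchanged, which mirrors the abstract's statement that the results do not depend on word order within a window.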

