Reexamination Certificate
1998-04-14
2001-03-06
Isen, Forester W. (Department: 2747)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C707S793000
active
06199034
ABSTRACT:
MICROFICHE APPENDICES
Appendix A, entitled “Theme Parser Code,” contains five microfiche with a total of two hundred eighty-two (282) frames.
Appendix B, entitled “Code Heading,” contains two microfiche with a total of eighty-five (85) frames.
Appendix C, entitled “Theme Vector Code,” contains one microfiche with a total of sixty-three (63) frames.
COPYRIGHT NOTICE
Appendices A, B, and C contain material which is subject to copyright protection. The documents “Chaos Processor for Text”, “Analysis Documentation”, and “Creating a Virtual Bookshelf” also contain material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of Appendices A, B, and C, as they appear in the United States Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
The present invention relates to the field of computational linguistics, and more particularly to determining and classifying theme for an input discourse.
BACKGROUND OF THE INVENTION
Discourse, from a general standpoint, is the capacity for orderly thought or procedure. In conversation, it is the orderly flow of ideas wherein certain topics are introduced and grouped in organized manners, and additional information expands on these topics by saying something about them. One example of a discourse is a book, wherein the lowest level of a topic exists in sentences. Generally, a sentence is a grammatically self-contained language unit consisting of a word or a syntactically related group of words that expresses an assertion, a question, a command, a wish, or an exclamation; that in writing usually begins with a capital letter and concludes with appropriate ending punctuation; and that in speaking is phonetically distinguished by various patterns of stress, pitch, and pause. Each sentence in a discourse can be said to have a topic, explicitly stated or implied, and a focus, or something that is being said about the topic.
In general, theme identifies which topic is really being discussed and what is being said about that topic. To understand the thematic information in a sentence, an analysis method is needed that can capture all of the subtle nuances that a writer conveys to a reader in less tangible ways. The human mind does not understand information by analyzing the grammatical content of a sentence. Many sentences are identical in grammatical context but differ greatly in meaning because of the specific words chosen and the additional facets of understanding those words contribute to the sentence. The difference does not just influence the topics by introducing a different idea, but also influences the level of importance that each word has in the sentence by indicating new, extra-grammatical, thematic contexts. Therefore, prior art systems that determine the importance of a theme by counting the number of times words appear in a document do not accurately determine theme.
SUMMARY OF THE INVENTION
A theme vector processor determines themes in an input discourse. The theme vector processor receives thematic tags for words and phrases in the input discourse, wherein the thematic tags indicate applicability of thematic constructions that define the content of discourse. In addition, theme terms are identified based on the content-carrying words of the input discourse. The theme vector processor identifies themes of the input discourse, including identifying the relative importance of the themes in the input discourse, based on the thematic tags and the theme terms. Specifically, the theme vector processor generates a theme strength for the theme terms. The theme strength indicates relative thematic importance for the theme terms in the input discourse.
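As an illustration only, the idea of deriving a theme strength from thematic tags rather than from raw word counts can be sketched in Python. This summary does not state how the strength is actually computed, so the tag names, the weights in `TAG_WEIGHTS`, and the summing rule below are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical tag names and weights, chosen only for illustration;
# the patent summary does not list the actual thematic tags or values.
TAG_WEIGHTS = {
    "main_topic": 5,
    "supporting_topic": 2,
    "incidental_mention": 1,
}

def theme_strengths(tagged_terms):
    """Accumulate a strength per theme term from (term, tags) pairs."""
    strengths = defaultdict(int)
    for term, tags in tagged_terms:
        for tag in tags:
            # Unknown tags contribute nothing in this sketch.
            strengths[term] += TAG_WEIGHTS.get(tag, 0)
    return dict(strengths)

def rank_themes(strengths):
    """Order theme terms from strongest to weakest, ties alphabetically."""
    return sorted(strengths.items(), key=lambda item: (-item[1], item[0]))

# Each entry is one occurrence of a theme term with its thematic tags.
tagged = [
    ("economy", {"main_topic"}),
    ("inflation", {"supporting_topic"}),
    ("economy", {"supporting_topic"}),
    ("weather", {"incidental_mention"}),
    ("weather", {"incidental_mention"}),
    ("weather", {"incidental_mention"}),
]
ranked = rank_themes(theme_strengths(tagged))
```

In this made-up input, "weather" occurs most often but carries only incidental tags, so it ranks below "economy". A pure frequency count would have ordered them the other way, which is the shortcoming the background section attributes to prior art systems.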
In one embodiment, the theme vector processor generates theme concepts for each theme term in the input discourse through use of a knowledge catalog. The knowledge catalog includes independent and parallel static ontologies arranged in a hierarchical structure. The static ontologies contain knowledge concepts and present a world view of knowledge. The theme vector processor utilizes the static ontologies to generate a theme concept for a theme term by extracting a knowledge concept from a higher level node in the hierarchical structure of the static ontologies.
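A minimal sketch of this lookup, assuming a single ontology stored as a child-to-parent map, might look as follows. The category names, the `PARENT` table, and the `levels_up` parameter are hypothetical; the actual knowledge catalog comprises several independent, parallel ontologies whose contents are not given here.

```python
# Hypothetical fragment of one static ontology, stored as child -> parent.
PARENT = {
    "espresso": "coffee",
    "coffee": "beverages",
    "beverages": "food_and_drink",
    "inflation": "economics",
    "economics": "business_and_economics",
}

# Every node that appears anywhere in the fragment, including root nodes.
KNOWN = set(PARENT) | set(PARENT.values())

def theme_concept(term, levels_up=1):
    """Return the knowledge concept found levels_up nodes above term.

    Returns None when the term is not in the ontology. If the walk
    reaches a root node early, that root node is returned.
    """
    if term not in KNOWN:
        return None
    node = term
    for _ in range(levels_up):
        parent = PARENT.get(node)
        if parent is None:  # reached the top of this ontology
            break
        node = parent
    return node
```

For example, `theme_concept("espresso")` yields `"coffee"` and `theme_concept("espresso", levels_up=2)` yields `"beverages"`, so a more general concept is obtained simply by climbing further up the hierarchy.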
Other features and advantages of the present invention will be apparent from the accompanying drawings, and from the detailed description that follows below.
REFERENCES:
patent: 4864502 (1989-09-01), Kucera et al.
patent: 4887212 (1989-12-01), Zamora et al.
patent: 4914590 (1990-04-01), Loatman et al.
patent: 5056021 (1991-10-01), Ausborn
patent: 5083268 (1992-01-01), Hemphill et al.
patent: 5182708 (1993-01-01), Ejiri
patent: 5257186 (1993-10-01), Ukita et al.
patent: 5371807 (1994-12-01), Register et al.
patent: 5383120 (1995-01-01), Zernik
patent: 5384703 (1995-01-01), Withgott et al.
patent: 5386556 (1995-01-01), Hedin et al.
patent: 5424947 (1995-06-01), Nagao et al.
patent: 5442780 (1995-08-01), Takanashi et al.
patent: 5475588 (1995-12-01), Schabes et al.
patent: 5497319 (1996-03-01), Chong et al.
patent: 5528491 (1996-06-01), Kuno et al.
patent: 5555169 (1996-09-01), Namba et al.
patent: 5689716 (1997-11-01), Chen
patent: 5694523 (1997-12-01), Wical
patent: 5708822 (1998-01-01), Wical
patent: 5768580 (1998-06-01), Wical
U.S. patent application Ser. No.: 08/455,484, Appendix E entitled: Chaos Processor for Text.
U.S. patent application Ser. No.: 08/455,484, Appendix F entitled: Analysis Documentation.
U.S. patent application Ser. No.: 08/455,484, Appendix G entitled: Oracle ConText Linguistics Toolkit, Guide and Reference.
U.S. patent application Ser. No.: 08/455,484, Appendix H entitled: Creating a Virtual Bookshelf.
Edouard Patrick N.
Fliesler Dubb Meyer & Lovejoy LLP
Isen Forester W.
Oracle Corporation
Methods and apparatus for determining theme for discourse