Method and apparatus for analyzing affect and emotion in text

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C707S793000, C704S009000, C715S252000

active

06622140

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer text documents and, more particularly, to analyzing affect and emotion in the documents.
2. Description of the Prior Art
There are methods and apparatuses that model emotion and personality, synthesize emotional speech, and monitor physical manifestations of emotion (including changes in brain signals, facial expression, and motion). However, there is no prior art that analyzes and measures emotion and affect in text documents.
G. Collier has analyzed emotional expression (Collier, G., Emotional Expression, Lawrence Erlbaum & Associates, Inc., 1985). Collier focuses on the use of grammatical categories, such as the ratio of verbs to adjectives, the use of past tense and negation, and changes in grammatical complexity, to assess a speaker's emotional state. While Collier briefly discusses verbal immediacy, most of the work is on the use of adjectives to describe an emotional state. Almost all work at the intersection of emotion and text is focused on defining emotion words like “fear” and “surprise” and not on analyzing the emotional attitudes expressed subtly through text.
Text classification methods, like naïve Bayes, measure the probability of a word given that a document belongs to a class (positive, negative, or neutral). These methods do not consider the probability of a word's absence. Also, these methods cannot correctly assign affect to documents that contain a mixture of affect terms (i.e., contain positive and negative affect terms). Moreover, there have been no attempts in the text classification literature to analyze affect.
The naïve Bayes method computes the probability that a document merits a particular class label based on a simple combination of the independent probabilities for each of the words in the document. However, this method will not work well for affect analysis because the expression of emotion in text is more complex. The assumption of independent probabilities required by this method fails to properly account for the way in which positive and negative affects combine, and so will not be effective in classifying text documents according to affect.
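For illustration only, the following Python sketch shows the kind of independence-based combination described above: a small multinomial naïve Bayes classifier with add-one smoothing. The toy training documents, the smoothing choice, and all function names are assumptions made for this example and are not taken from the prior art or from the invention.

```python
import math
from collections import Counter

# Hypothetical toy training data: (class label, tokens).
TOY_DOCS = [
    ("positive", ["love", "great", "win"]),
    ("positive", ["love", "happy"]),
    ("negative", ["hate", "awful", "loss"]),
    ("negative", ["hate", "love", "awful"]),
]

def train_naive_bayes(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for label, tokens in labeled_docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def log_posteriors(model, tokens):
    """Sum the log prior and the per-word log likelihoods for each class.

    Each word contributes independently of every other word, which is
    the independence assumption discussed in the text.
    """
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((counts[tok] + 1) / denom)
        scores[label] = score
    return scores

def classify(model, tokens):
    """Return the class label with the highest log posterior."""
    scores = log_posteriors(model, tokens)
    return max(scores, key=scores.get)

MODEL = train_naive_bayes(TOY_DOCS)
```

Because the per-word contributions are simply added, a document mixing “love” and “hate” is decided by whichever word happens to carry the larger likelihood ratio in the training data, with no notion of how the two affects interact.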
Some prior art text classification methods count the frequency and rarity of affective terms. However, the likelihood that a document is positive depends not just on the presence of positive affect terms, but also on the absence of negative affect terms.
For example, applying known text classification methods to the task of finding positive web pages about Barney the purple dinosaur is ineffective. Most such pages were written by Barney-bashers in strong, negative tones. Presumably, somebody who hates Barney would want to see the negative pages, and somebody who loves Barney would want to see the positive pages. But a search for “love Barney purple dinosaur” would yield overwhelmingly negative pages, because the word “love” does not discriminate between positive and negative pages. Although the word “love” is one of the most common positive affect terms on the positive pages, it is also the most common positive affect term on the negative pages and appears more frequently on the negative pages than on the positive pages. Moreover, the word “love” appears 50% more frequently than the word “hate” on the negative pages. In fact, no positive affect term is effective at distinguishing positive Barney pages from negative pages. The most accurate method of distinguishing positive Barney pages from negative Barney pages is to look for Barney pages that include positive affect terms with a concurrent absence of negative affect terms.
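The presence-with-absence test described in the Barney example can be sketched in a few lines of Python. The term lists and the function name below are hypothetical and chosen only for illustration; a real system would use much larger affect lexicons.

```python
# Hypothetical affect term lists, for illustration only.
POSITIVE_TERMS = {"love", "wonderful", "fun"}
NEGATIVE_TERMS = {"hate", "awful", "annoying"}

def is_positive_page(tokens, positive_terms=POSITIVE_TERMS,
                     negative_terms=NEGATIVE_TERMS):
    """True only if a positive term is present AND no negative term is."""
    has_positive = any(tok in positive_terms for tok in tokens)
    has_negative = any(tok in negative_terms for tok in tokens)
    return has_positive and not has_negative
```

Under these assumptions, a page containing “love” alone passes the test, while a page containing both “love” and “hate” does not, even though a plain keyword search for “love” would return both.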
None of the prior art concerns the classification of text documents according to affect. None of the prior art involves methods of analyzing affect in text, nor the identification of affect associated with each of the named entities mentioned in a text document. None of the prior art is capable of analyzing the subtle stylistic cues and influence that word choice applies to the emotional tone of a document.
SUMMARY OF THE INVENTION
In order to overcome the limitations of the prior art, I have developed a method and apparatus for analyzing affect and emotion in text documents.
Affect and emotion manifest themselves in text documents through subtle stylistic cues, such as changes in the choice of synonyms. For example, “John crushed the competition” and “John won” communicate the same information, but convey a different attitude about John's role.
The present invention analyzes affect and emotion in text, reporting a valence (positive, negative, or neutral) and intensity (magnitude) for the text's overall emotion and for the emotion associated with each named entity. The system can be used to classify news articles as good news or bad news, classify web pages on a topic as positive or negative, and classify customer communications into complaints and compliments. Other applications include the analysis of financial news for short-term prediction of the impact of the news on stock prices.
An embodiment of the present invention analyzes affect by computing a weighted sum of the scores for positive and negative affect terms (words and phrases), where the scores for negative affect terms are subtracted from the scores for positive affect terms. Possible scoring methods include the frequency of occurrence of an affect term or the frequency multiplied by a term intensity or magnitude (e.g., “maim” and “kill” are more strongly negative than “hurt”). Negation of an affect term can either be ignored or used to invert the contribution from the negated affect terms.
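As a minimal sketch of this scoring scheme, the Python below sums signed term intensities, so each occurrence of a term adds its intensity (equivalent to frequency multiplied by intensity), and negative terms are subtracted by carrying negative weights. The lexicon values, the negator list, and the rule that a negator applies only to the immediately following token are all assumptions made for the example; the text does not specify them.

```python
# Hypothetical lexicon: signed intensity per affect term.
AFFECT_LEXICON = {
    "won": 1.0, "love": 1.0,
    "hurt": -1.0, "maim": -3.0, "kill": -3.0,
}
# Hypothetical negators; assumed to negate only the next token.
NEGATORS = {"not", "never", "no"}

def affect_score(tokens, lexicon=AFFECT_LEXICON, negation="invert"):
    """Sum signed intensities over all affect-term occurrences.

    negation="invert" flips the sign of a negated term;
    negation="ignore" leaves its contribution unchanged.
    """
    score = 0.0
    for i, tok in enumerate(tokens):
        weight = lexicon.get(tok)
        if weight is None:
            continue  # not an affect term
        negated = i > 0 and tokens[i - 1] in NEGATORS
        if negated and negation == "invert":
            weight = -weight
        score += weight
    return score

def valence_and_intensity(score):
    """Map a signed score to a (valence, magnitude) pair."""
    if score > 0:
        return ("positive", score)
    if score < 0:
        return ("negative", -score)
    return ("neutral", 0.0)
```

For example, `affect_score("john did not hurt mary".split())` treats “hurt” as negated and contributes a positive value when inversion is enabled, and a negative value when `negation="ignore"` is passed.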
Most affect terms have only a single affect value. However, the affect assigned to some terms may depend on the term's part of speech. For example, the word “hit” is positive as a modifier (“hit movies”), negative as a noun (“took a direct hit”), and neutral (“hit a new 52-week high”) or negative as a verb (“John hit Mary”). Thus, a part-of-speech tagger may be integrated with the affect analyzer. Likewise, the affect assigned to some terms may depend on the term's word sense. For example, the word “leading” has positive affect only when used to indicate prominence, not when used to refer to interline spacing.
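One way to picture the part-of-speech dependence is a lexicon keyed by (term, part of speech) that falls back to a single default value for terms whose affect does not vary. The sketch below assumes the tokens have already been tagged by some external part-of-speech tagger; the tag names, weights, and function names are hypothetical. The verb sense of “hit” is set to neutral here purely as a simplification, since the text notes it may be neutral or negative.

```python
# Hypothetical part-of-speech-specific affect values.
POS_AFFECT = {
    ("hit", "modifier"): 1.0,   # "hit movies"
    ("hit", "noun"): -1.0,      # "took a direct hit"
    ("hit", "verb"): 0.0,       # simplification: neutral or negative in the text
}
# Hypothetical defaults for terms with a single affect value.
DEFAULT_AFFECT = {"won": 1.0, "lost": -1.0}

def lookup_affect(word, pos):
    """Prefer a part-of-speech-specific value, else the term's default."""
    return POS_AFFECT.get((word, pos), DEFAULT_AFFECT.get(word, 0.0))

def score_tagged(tagged_tokens):
    """Sum affect over (word, part-of-speech) pairs from a tagger."""
    return sum(lookup_affect(word, pos) for word, pos in tagged_tokens)
```

The same keyed lookup could in principle be extended with a word-sense label to handle cases such as “leading”, although sense disambiguation itself is outside this sketch.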
Another embodiment of the present invention combines affect analysis with named entity extraction to assign an affect to each named entity mentioned in the document, in addition to assigning an affect value to the entire document. When assigning affect to named entities, the affect is assigned to the nearest named entity that is not “blocked” by other affect terms or named entities (terms between the affect term and the nearest named entity). The idea is that each mention of an affect term primes a positive or negative association in the reader's mind which may influence the reader's attitude to nearby named entities. But the affect decays rapidly, persisting only far enough to contribute to nearby named entities.
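The nearest-unblocked-entity rule might be sketched as follows. For each affect term, the code scans outward in both directions and stops in a given direction at the first named entity (a candidate) or at the first other affect term (blocked). The `max_distance` window standing in for rapid decay, the tie-break in favor of the entity on the left, and the assumption that entities have already been extracted as single tokens are all choices made for this example and are not specified in the text.

```python
def assign_entity_affect(tokens, lexicon, entities, max_distance=5):
    """Give each affect term's weight to its nearest unblocked entity.

    tokens:   list of word tokens
    lexicon:  dict mapping affect term -> signed weight
    entities: set of tokens already identified as named entities
    """
    totals = {name: 0.0 for name in entities}
    for i, tok in enumerate(tokens):
        if tok not in lexicon:
            continue
        best = None  # (distance, entity) of the closest candidate so far
        for step in (-1, 1):  # scan left first, then right
            j = i + step
            while 0 <= j < len(tokens) and abs(j - i) <= max_distance:
                if tokens[j] in lexicon:
                    break  # another affect term blocks this direction
                if tokens[j] in entities:
                    distance = abs(j - i)
                    if best is None or distance < best[0]:
                        best = (distance, tokens[j])
                    break  # nearest entity in this direction found
                j += step
        if best is not None:
            totals[best[1]] += lexicon[tok]
    return totals
```

With the hypothetical lexicon `{"won": 1.0, "lost": -1.0}` and entities `{"John", "Mary"}`, the tokens of “John won but Mary lost badly” assign the positive weight to John and the negative weight to Mary, because each affect term reaches its own nearest entity first.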
In a preferred embodiment, the direction of application of affect (e.g., before or after the named entity) is ignored. In another embodiment, the direction controls whether the affect is inverted for some affect terms. In another embodiment, the sentences are parsed and the affect from verbs is attached to the verb's agents and objects, as appropriate, and likewise from modifiers to modified objects. The intention is to capture the notion that the victim of a bad act is pitied and so gets positive affect. (But, in practice, it seems that proximity or involvement in a bad act tarnishes even the victim of the bad act.)
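The parsed embodiment can be illustrated with a small Python sketch that takes already-parsed (agent, verb, object) triples and attaches the verb's affect to both participants. The triple format, the `pity_victims` flag, and the weights are hypothetical; the parsing step itself is assumed to happen elsewhere and is not shown.

```python
def attach_verb_affect(triples, verb_affect, pity_victims=False):
    """Attach each verb's affect to its agent and object.

    triples:      iterable of (agent, verb, object) from a parser
    verb_affect:  dict mapping verb -> signed weight
    pity_victims: if True, the object of a negative verb receives
                  positive affect instead (the "pitied victim" idea)
    """
    totals = {}
    for agent, verb, obj in triples:
        weight = verb_affect.get(verb, 0.0)
        if weight == 0.0:
            continue  # verb carries no affect
        totals[agent] = totals.get(agent, 0.0) + weight
        if pity_victims and weight < 0:
            obj_weight = -weight  # invert for the victim of a bad act
        else:
            obj_weight = weight   # proximity tarnishes the object too
        totals[obj] = totals.get(obj, 0.0) + obj_weight
    return totals
```

With the default setting, “John hit Mary” marks both John and Mary negatively, which matches the practical observation in the text; setting `pity_victims=True` gives Mary positive affect instead.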
The ability to analyze emotion in text has many important applications. It can be used to classify news articles as good or bad, web pages as positive or negative, and customer communications (correspondence and telephone calls) as complaints or compliments. For example, a web search engine could be modified to allow the user to search for web pages that are positive or negative on a topic. It can also measure the magnitude of the em
