Creating an electronic dictionary using source dictionary...

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate

Details

C704S010000

active

06651220

ABSTRACT:

TECHNICAL FIELD
The present invention relates to the field of natural language processing (“NLP”), and more particularly, to a method and system for organizing and retrieving information from an electronic dictionary.
BACKGROUND OF THE INVENTION
Natural Language Processing
Computer systems for automatic natural language processing use a variety of subsystems, roughly corresponding to the linguistic fields of morphological, syntactic, and semantic analysis, to analyze input text and achieve a level of machine understanding of natural language. Having understood the input text to some level, a computer system can, for example, suggest grammatical and stylistic changes to the input text, answer questions posed in the input text, or effectively store information represented by the input text.
Morphological analysis identifies input words and provides information for each word that a human speaker of the natural language could determine by using a dictionary. Such information might include the syntactic roles that a word can play (e.g., noun or verb) and ways that the word can be modified by adding prefixes or suffixes to generate different, related words. For example, in addition to the word “fish,” the dictionary might also list a variety of words related to, and derived from, the word “fish,” including “fishes,” “fished,” “fishing,” “fisher,” “fisherman,” “fishable,” “fishability,” “fishbowl,” “fisherwoman,” “fishery,” “fishhook,” “fishnet,” and “fishy.”
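As a rough illustration of the lookup described above, the following sketch maps an input word to a base entry and the parts of speech recorded for that entry. The dictionary contents, the `TOY_DICTIONARY` layout, and the function names are assumptions made for this example; they are not the storage format of the invention.

```python
# Hypothetical toy dictionary: each key lists the parts of speech it can
# take and a few derived forms. Contents are illustrative only.
TOY_DICTIONARY = {
    "fish": {
        "parts_of_speech": ["noun", "verb"],
        "derived": ["fishes", "fished", "fishing", "fisher", "fishery", "fishy"],
    },
    "bite": {
        "parts_of_speech": ["noun", "verb"],
        "derived": ["bites", "biting", "bitten"],
    },
}


def build_derived_index(dictionary):
    """Map every derived form back to the base entry that lists it."""
    index = {}
    for base, entry in dictionary.items():
        for form in entry["derived"]:
            index[form] = base
    return index


DERIVED_INDEX = build_derived_index(TOY_DICTIONARY)


def analyze_word(word):
    """Return (base entry, parts of speech of the base entry), or None."""
    key = word.lower()
    base = key if key in TOY_DICTIONARY else DERIVED_INDEX.get(key)
    if base is None:
        return None
    return base, TOY_DICTIONARY[base]["parts_of_speech"]


print(analyze_word("Fishing"))  # ('fish', ['noun', 'verb'])
```

Note that the sketch reports the parts of speech of the base entry only; a fuller dictionary would also record the part of speech of each derived form.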
Syntactic analysis analyzes each input sentence, using, as a starting point, the information provided by the morphological analysis of input words and the set of syntax rules that define the grammar of the language in which the input sentence was written. The following are sample syntax rules:
sentence = noun phrase + verb phrase
noun phrase = adjective + noun
verb phrase = adverb + verb
Syntactic analysis attempts to find an ordered subset of syntax rules that, when applied to the words of the input sentence, combine groups of words into phrases, and then combine phrases into a complete sentence. For example, consider the input sentence: “Big dogs fiercely bite.” Using the three simple rules listed above, syntactic analysis would identify the words “Big” and “dogs” as an adjective and noun, respectively, and apply the second rule to generate the noun phrase “Big dogs.” Syntactic analysis would identify the words “fiercely” and “bite” as an adverb and verb, respectively, and apply the third rule to generate the verb phrase “fiercely bite.” Finally, syntactic analysis would apply the first rule to form a complete sentence from the previously generated noun phrase and verb phrase. An ordered set of rules and the phrases that result from applying them, including a final complete sentence, is called a parse.
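The derivation for “Big dogs fiercely bite.” can be sketched as a small bottom-up reducer. This is a minimal sketch that only handles the three sample rules; the `LEXICON` table stands in for the output of morphological analysis and, like the function name, is an assumption of this example.

```python
# Hypothetical lexicon for the example sentence; a real system would take
# these categories from morphological analysis.
LEXICON = {"big": "adjective", "dogs": "noun", "fiercely": "adverb", "bite": "verb"}

# The three sample rules: result category = left category + right category.
RULES = [
    ("sentence", ("noun phrase", "verb phrase")),
    ("noun phrase", ("adjective", "noun")),
    ("verb phrase", ("adverb", "verb")),
]


def parse(sentence):
    """Greedily reduce adjacent pairs, left to right, until no rule applies.

    Returns the remaining nodes, each a (category, text) pair, and the
    ordered list of rule results that were applied.
    """
    words = sentence.rstrip(".").split()
    nodes = [(LEXICON[w.lower()], w) for w in words]
    applied = []
    changed = True
    while changed:
        changed = False
        for i in range(len(nodes) - 1):
            pair = (nodes[i][0], nodes[i + 1][0])
            for result, right_hand_side in RULES:
                if right_hand_side == pair:
                    merged = (result, nodes[i][1] + " " + nodes[i + 1][1])
                    nodes[i:i + 2] = [merged]
                    applied.append(result)
                    changed = True
                    break
            if changed:
                break  # restart the scan on the shortened node list
    return nodes, applied


nodes, applied = parse("Big dogs fiercely bite.")
print(applied)  # ['noun phrase', 'verb phrase', 'sentence']
print(nodes)    # [('sentence', 'Big dogs fiercely bite')]
```

A greedy reducer of this kind commits to the first rule that matches, so it cannot explore alternative analyses of an ambiguous sentence.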
Some sentences, however, can have several different parses. A classic example sentence for such multiple parses is: “Time flies like an arrow.” There are at least three possible parses corresponding to three possible meanings of this sentence. In the first parse, “time” is the subject of the sentence, “flies” is the verb, and “like an arrow” is a prepositional phrase modifying the verb “flies.” However, there are at least two unexpected parses as well. In the second parse, “time” is an adjective modifying “flies,” “like” is the verb, and “an arrow” is the object of the verb. This parse corresponds to the meaning that flies of a certain type, “time flies,” like or are attracted to an arrow. In the third parse, “time” is an imperative verb, “flies” is the object, and “like an arrow” is a prepositional phrase modifying “time.” This parse corresponds to a command to time flies as one would time an arrow, perhaps with a stopwatch.
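The three readings can be reproduced with a chart-based count of parses. The sketch below uses a CKY-style chart, which is a standard technique swapped in for illustration and is not stated to be the method of the invention. The grammar, the category names (for example `VO` for an imperative verb plus its object), and the lexicon are assumptions chosen so that exactly the three parses described above are licensed.

```python
from collections import defaultdict

# Hypothetical lexicon: each word lists every category it may take.
LEXICON = {
    "time": {"N", "Adj", "V"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "an": {"Det"},
    "arrow": {"N"},
}

# Hypothetical binary rules: (parent, left child, right child).
RULES = [
    ("S", "N", "VP"),     # parse 1: "time" is the subject
    ("S", "NP", "VP"),    # parse 2: "time flies" is the subject
    ("S", "VO", "PP"),    # parse 3: imperative "time flies" + "like an arrow"
    ("VP", "V", "PP"),
    ("VP", "V", "NP"),
    ("PP", "P", "NP"),
    ("NP", "Det", "N"),
    ("NP", "Adj", "N"),
    ("VO", "V", "N"),
]


def count_parses(words, goal="S"):
    """Count the distinct parses of `words` that yield category `goal`."""
    n = len(words)
    # chart[(i, j)][category] = number of parses of words[i:j] as category
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON[word]:
            chart[(i, i + 1)][category] = 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point between the children
                for parent, left, right in RULES:
                    ways = chart[(i, k)].get(left, 0) * chart[(k, j)].get(right, 0)
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)].get(goal, 0)


print(count_parses("time flies like an arrow".split()))  # 3
```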
Syntactic analysis is often accomplished by constructing one or more hierarchical trees called syntax parse trees. Each leaf node of the syntax parse tree represents one word of the input sentence. The application of a syntax rule generates an intermediate-level node linked from below to one, two, or occasionally more existing nodes. The existing nodes initially comprise only leaf nodes, but, as syntactic analysis applies syntax rules, the existing nodes comprise both leaf nodes and intermediate-level nodes. A single root node of a complete syntax parse tree represents an entire sentence.
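One possible in-memory shape for such a tree is sketched below. The `Node` class and its field names are assumptions of this example; the tree shown is the one for “Big dogs fiercely bite.”

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A syntax parse tree node; leaves carry a word, others carry children."""
    label: str
    word: str = ""
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self):
        return not self.children

    def leaves(self):
        """Return the words under this node in left-to-right order."""
        if self.is_leaf():
            return [self.word]
        words = []
        for child in self.children:
            words.extend(child.leaves())
        return words


# The root node represents the entire sentence; each leaf is one input word.
tree = Node("sentence", children=[
    Node("noun phrase", children=[Node("adjective", "Big"), Node("noun", "dogs")]),
    Node("verb phrase", children=[Node("adverb", "fiercely"), Node("verb", "bite")]),
])

print(tree.leaves())  # ['Big', 'dogs', 'fiercely', 'bite']
```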
Semantic analysis generates a logical form graph that describes the meaning of input text in a deeper way than can be described by a syntax parse tree alone. Semantic analysis first attempts to choose the correct parse, represented by a syntax parse tree, if more than one syntax parse tree was generated by syntactic analysis. The logical form graph corresponding to the correct parse is a first attempt to understand the input text at a level analogous to that achieved by a human speaker of the language.
The logical form graph has nodes and links, but, unlike the syntax parse tree described above, is not hierarchically ordered. The links of the logical form graph are labeled to indicate the relationship between a pair of nodes. For example, semantic analysis may identify a certain noun in a sentence as the deep subject or deep object of a verb. The deep subject of a verb is the doer of the action and the deep object of a verb is the object of the action specified by the verb. The deep subject of an active voice verb may be the syntactic subject of the sentence, and the deep object of an active voice verb may be the syntactic object of the verb. However, the deep subject of a passive voice verb may be expressed in an instrumental clause, and the deep object of a passive voice verb may be the syntactic subject of the sentence. For example, consider the two sentences: (1) “Dogs bite people” and (2) “People are bitten by dogs.” The first sentence has an active voice verb, and the second sentence has a passive voice verb. The syntactic subject of the first sentence is “Dogs” and the syntactic object of the verb “bite” is “people.” By contrast, the syntactic subject of the second sentence is “People” and the verb phrase “are bitten” is modified by the instrumental clause “by dogs.” For both sentences, “dogs” is the deep subject, and “people” is the deep object of the verb or verb phrase of the sentence. Although the syntax parse trees generated by syntactic analysis for sentences 1 and 2, above, will be different, the logical form graphs generated by semantic analysis will be the same, because the underlying meaning of the two sentences is the same.
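The active/passive example can be sketched as a mapping from syntactic roles to labeled links. This is a simplified illustration that assumes the clause has already been analyzed into a verb, a voice, and its syntactic roles; the function name, parameter names, and link labels are hypothetical.

```python
def logical_form(verb, voice, syntactic_subject, syntactic_object=None, by_phrase=None):
    """Return the labeled links (source, label, target) for a simple clause."""
    if voice == "active":
        deep_subject, deep_object = syntactic_subject, syntactic_object
    elif voice == "passive":
        # The doer appears in the "by" phrase; the syntactic subject is
        # the thing acted upon.
        deep_subject, deep_object = by_phrase, syntactic_subject
    else:
        raise ValueError("voice must be 'active' or 'passive'")
    links = set()
    if deep_subject is not None:
        links.add((verb, "deep_subject", deep_subject))
    if deep_object is not None:
        links.add((verb, "deep_object", deep_object))
    return links


# (1) "Dogs bite people"          (2) "People are bitten by dogs"
active = logical_form("bite", "active", "dogs", syntactic_object="people")
passive = logical_form("bite", "passive", "people", by_phrase="dogs")

print(active == passive)  # True
```

Storing the links as a set of labeled triples reflects that the graph, unlike the parse tree, has no hierarchical ordering.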
Further semantic processing after generation of the logical form graph may draw on knowledge databases to relate analyzed text to real world concepts in order to achieve still deeper levels of understanding. An example knowledge base would be an on-line encyclopedia, from which more elaborate definitions and contextual information for particular words can be obtained.
In the following, the three natural language processing subsystems—morphological, syntactic, and semantic—are described in the context of processing the sample input text: “The person whom I met was my friend.”
FIG. 1 is a block diagram illustrating the flow of information between the subsystems of natural language processing. The morphological subsystem 101 receives the input text and outputs an identification of the words and senses for each of the various parts of speech in which each word can be used. The syntactic subsystem 102 receives this information and generates a syntax parse tree by applying syntax rules. The semantic subsystem 103 receives the syntax parse tree and generates a logical form graph.
FIGS. 2-5 display the dictionary information stored on an electronic storage medium that is retrieved for the input words of the sample input text during morphological analysis.
FIG. 2 displays the dictionary entries for the input words “the” 201 and “person” 202. Entry 201 comprises the key “the” 203 and a list of attribute/value pairs. The first attribute “Adj” 204 has, as its value, the symbols contained within the braces 205 and 206. These symbols
