Automated system for generating natural language...

Data processing: speech signal processing – linguistics – language – Linguistics – Translation machine

Reexamination Certificate

Details

C704S009000, C704S007000

active

06278967

ABSTRACT:

TECHNICAL FIELD
The invention relates to automatically translating one natural language into another natural language, preferably English into Japanese.
BACKGROUND INFORMATION
Various schemes for the machine-based translation of natural language have been proposed. Typically, the system used for translation includes a computer which receives input in one language and performs operations on the received input to supply output in another language. This type of translation has been inexact, and the resulting output can require significant editing by a skilled operator. The translation operation performed by known systems generally includes a structural conversion operation. The objective of structural conversion is to transform a given parse tree (i.e., a syntactic structure tree) of the source language sentence into the corresponding tree in the target language. Two types of structural conversion have been tried: grammar-rule-based and template-to-template.
In grammar-rule-based structural conversion, the domain of structural conversion is limited to the domain of grammar rules that have been used to obtain the source-language parse tree (i.e., to a set of subnodes that are immediate daughters of a given node). For example, given
VP=VT01+NP
(a VerbPhrase consists of a SingleObject Transitive Verb and a NounPhrase, in that order)
and
Japanese: 1+2=>2+1
(Reverse the order of VT01 and NP), each source-language parse tree that involves application of the rule is structurally converted in such a way that the order of the verb and the object is reversed, because the verb appears to the right of its object in Japanese. This method is very efficient in that it is easy to find out where the specified conversion applies; it applies exactly at the location where the rule has been used to obtain the source-language parse tree. On the other hand, it can be a weak conversion mechanism in that its domain, as specified above, may be extremely limited, and in that natural language may require conversion rules that straddle nodes that are not siblings.
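The VP=VT01+NP example can be illustrated with a minimal Python sketch. The `Node` class, the `CONVERSIONS` table, and the helper names are hypothetical and chosen only for illustration; the source does not describe its data structures. The sketch assumes each parse-tree node records the grammar rule that built it, so the conversion is looked up directly at that node and reorders only its immediate daughters.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                                  # category or word, e.g. "VP" or "eats"
    children: List["Node"] = field(default_factory=list)
    rule: Optional[str] = None                  # grammar rule that built this node, if any


# Hypothetical conversion table: rule name -> new order of the daughters (0-based).
# [1, 0] corresponds to "1+2=>2+1" in the text.
CONVERSIONS = {"VP=VT01+NP": [1, 0]}


def convert(node: Node) -> Node:
    """Return a copy of the tree with daughters reordered wherever a listed rule applied."""
    kids = [convert(child) for child in node.children]
    order = CONVERSIONS.get(node.rule)
    if order is not None:
        kids = [kids[i] for i in order]
    return Node(node.label, kids, node.rule)


def leaves(node: Node) -> List[str]:
    """Collect the words at the leaves, left to right."""
    if not node.children:
        return [node.label]
    return [word for child in node.children for word in leaves(child)]


tree = Node(
    "VP",
    [Node("VT01", [Node("eats")]), Node("NP", [Node("sushi")])],
    rule="VP=VT01+NP",
)
print(leaves(convert(tree)))  # ['sushi', 'eats']
```

Because the lookup is keyed on the rule stored at each node, finding where a conversion applies costs a single dictionary access per node, which reflects the efficiency noted above. The same design also shows the limitation: the table can only permute the daughters of one node.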
In template-to-template structural conversion, structural conversion is specified in terms of input/output (I/O) templates or subtrees. If a given input template matches a given structure tree, that portion of the structure tree that is matched by the template is changed as specified by the corresponding output template. This is a very powerful conversion mechanism, but it can be costly in that it can take a long period of time to find out if a given input template matches any portion of a given structure tree.
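A rough Python sketch of template-to-template conversion follows. It assumes trees are nested tuples of the form `(label, child, ...)` with plain strings as leaves, and that template strings beginning with `?` are variables that bind any subtree; these conventions and the function names are illustrative assumptions, not the representation used by any particular system.

```python
from typing import Dict, Optional, Tuple, Union

Tree = Union[str, Tuple]  # a leaf string, or a tuple (label, child, child, ...)


def match(template: Tree, tree: Tree, env: Dict[str, Tree]) -> Optional[Dict[str, Tree]]:
    """Return variable bindings if the template matches the tree at its root, else None."""
    if isinstance(template, str):
        if template.startswith("?"):              # variable: binds any subtree
            if template in env and env[template] != tree:
                return None
            return {**env, template: tree}
        return env if template == tree else None
    if isinstance(tree, str) or len(template) != len(tree):
        return None
    for t_part, s_part in zip(template, tree):
        env = match(t_part, s_part, env)
        if env is None:
            return None
    return env


def build(template: Tree, env: Dict[str, Tree]) -> Tree:
    """Instantiate an output template with the bindings found by match()."""
    if isinstance(template, str):
        return env.get(template, template)
    return tuple(build(part, env) for part in template)


def rewrite(tree: Tree, in_tpl: Tree, out_tpl: Tree) -> Tree:
    """Try the input template at every node, top-down, rewriting wherever it matches."""
    env = match(in_tpl, tree, {})
    if env is not None:
        tree = build(out_tpl, env)
    if isinstance(tree, str):
        return tree
    return (tree[0],) + tuple(rewrite(child, in_tpl, out_tpl) for child in tree[1:])


sentence = ("S", ("NP", "she"), ("VP", ("V", "eats"), ("NP", "sushi")))
in_tpl = ("VP", ("V", "?v"), "?obj")
out_tpl = ("VP", "?obj", ("V", "?v"))
print(rewrite(sentence, in_tpl, out_tpl))
```

The template here reaches below the immediate daughters of VP (it inspects the V node's own child), which a purely rule-local reordering cannot do. The cost is visible in `rewrite`: every node of the tree is tested against the input template, and the work grows with the number of templates.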
SUMMARY OF THE INVENTION
The automated natural language translation system according to the invention has many advantages over known machine-based translators. After the system automatically selects the best possible translation of the input textual information and provides the user with an output (preferably a Japanese language translation of English-language input text), the user can then interface with the system to edit the displayed translation or to obtain alternative translations in an automated fashion. An operator of the automated natural language translation system of the invention can be more productive because the system allows the operator to retain just the portion of the translation that he or she deems acceptable while causing the remaining portion to be retranslated automatically. Since this selective retranslation operation is precisely directed at portions that require retranslation, operators are saved the time and tedium of considering potentially large numbers of incorrect, but highly ranked translations. Furthermore, because the system allows for arbitrary granularity in translation adjustments, more of the final structure of the translation will usually have been generated by the system. The system thus reduces the potential for human (operator) error and saves time in edits that may involve structural, accord, and tense changes. The system efficiently gives operators the full benefit of its extensive and reliable knowledge of grammar and spelling.
The automated natural language translation system's versatile handling of ambiguous sentence boundaries in the source language, and its powerful semantic propagation, provide further accuracy and reduced operator editing of translations. Stored statistical information also improves the accuracy of translations by tailoring the preferred translation to the specific user site. The system's idiom handling method is advantageous in that it allows sentences that happen to include the sequence of words making up the idiom, without intending the meaning of the idiom, to be correctly translated. The system is efficient but still has versatile functions such as long distance feature matching. The system's structural balance expert and coordinate structure expert effectively distinguish between intended parses and unintended parses. A capitalization expert effectively obtains correct interpretations of capitalized words in sentences, and a capitalized sequence procedure effectively deals with multiple-word proper names, without completely ignoring common noun interpretations.
In one aspect, the invention is directed to an improvement of the automated natural language translation system, wherein the improvement relates to using an “automatic domain determiner” to aid translation. A domain can include any set of terms and/or patterns of usage attributed to a certain field of usage or to one or more particular people. For example, a domain can include business correspondence, marketing documents, computer-related writings, writings in a technical field such as physics, etc. A dictionary contains certain words which, when used in certain domains, have a translation in the target natural language (e.g., Japanese) that is different from what it would be if the words were used in another domain or were not used in any domain. A list of domain keywords is also used. The keywords are domain-specific words or terms that are associated with each domain and are used to determine whether a particular sentence in the source natural language (or alternatively the whole source document) falls into one of the possible domains. The “automatic domain determiner” feature determines whether there are enough keywords in the sentence (or whole document or some portion of the document) to warrant a finding that the sentence (or document) is within a particular domain. If it is within a domain, the translation of the sentence (or the entire document) is carried out using raised probability values of the words that are both listed in the dictionary and appear in the sentence (or entire document) being translated. The determination made by the “automatic domain determiner” is based solely on the source natural language and the keywords. The “automatic domain determiner” aspect of the invention excludes domain-misfit analyses (i.e., those that do not fit properly in a particular domain) from appearing in the resulting tree structure, and it helps to speed up the system's translation time.
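The keyword-counting idea can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the keyword lists, the `min_hits` threshold of 2, the `boost` factor of 2.0, the base probabilities, and the dictionary layout are hypothetical, and English glosses stand in for target-language translations. The text does not specify how "enough keywords" is measured or by how much probabilities are raised.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical keyword lists, one per domain.
DOMAIN_KEYWORDS: Dict[str, Set[str]] = {
    "computer": {"disk", "memory", "boot", "driver"},
    "business": {"invoice", "shipment", "contract"},
}

# Hypothetical dictionary: word -> list of (gloss, domain tag or None, base probability).
DICTIONARY: Dict[str, List[Tuple[str, Optional[str], float]]] = {
    "driver": [("person who drives", None, 0.6), ("device software", "computer", 0.4)],
}


def determine_domain(words: List[str], min_hits: int = 2) -> Optional[str]:
    """Pick the domain with the most keyword hits, if it reaches the assumed threshold."""
    counts = {
        domain: sum(1 for w in words if w.lower() in keywords)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    best = max(counts, key=counts.get, default=None)
    if best is not None and counts[best] >= min_hits:
        return best
    return None


def rank_senses(word: str, domain: Optional[str], boost: float = 2.0) -> List[Tuple[str, float]]:
    """Raise the score of senses tagged with the detected domain, then sort best-first."""
    senses = DICTIONARY.get(word.lower(), [])
    scored = [
        (gloss, p * boost if domain is not None and tag == domain else p)
        for gloss, tag, p in senses
    ]
    return sorted(scored, key=lambda s: s[1], reverse=True)


sentence = "Install the driver before you boot from the disk".split()
domain = determine_domain(sentence)
print(domain)                          # computer
print(rank_senses("driver", domain))   # domain-tagged sense ranked first
print(rank_senses("driver", None))     # general sense ranked first
```

Note that `determine_domain` looks only at source-language words and the keyword lists, consistent with the statement that the determination is based solely on the source natural language and the keywords.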
In another aspect, the invention is directed to another improvement of the automated natural language translation system, wherein the improvement relates to parsing sentences of the source natural language using grammar rules which can be marked as “almighty,” “POS-Exclusive,” or unmarked. A grammar rule marked as “almighty” is a rule that blocks all other rules by which the same fragment of the sentence could be analyzed. A “POS-Exclusive” rule (POS means “part(s) of speech”) blocks any other rule by which the same fragment of the sentence could be analyzed, but the POS-Exclusive rule only does so if the other rule has the same POS. An unmarked grammar rule has no priority over any other rules. Marking grammar rules as either “almighty” or “POS-Exclusive” is a very effective way to prune unnecessary parsing branches from the tree structure that would otherwise be created by the system's translation engine. This marking of grammar rules also makes parsing more efficient
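The blocking behavior of marked rules could be sketched in Python along the following lines. The `Analysis` record, the rule names, and the tie-handling are assumptions for illustration only: the sketch treats "the same fragment" as an identical word span, lets an "almighty" analysis block every non-almighty rival on that span, and lets a "POS-Exclusive" analysis block only unmarked rivals with the same part of speech. How marked rules interact with one another is not stated in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Analysis:
    span: Tuple[int, int]   # (start, end) word positions of the fragment
    pos: str                # part of speech assigned to the fragment
    rule: str               # hypothetical name of the rule that produced it
    marking: str = ""       # "almighty", "pos-exclusive", or "" for unmarked


def prune(analyses: List[Analysis]) -> List[Analysis]:
    """Drop analyses blocked by a marked rule covering the same span."""
    kept = []
    for a in analyses:
        rivals = [b for b in analyses if b is not a and b.span == a.span]
        # Assumption: an almighty rule blocks every rival that is not itself almighty.
        blocked_by_almighty = a.marking != "almighty" and any(
            b.marking == "almighty" for b in rivals
        )
        # Assumption: a POS-Exclusive rule blocks only unmarked rivals with the same POS.
        blocked_by_pos = a.marking == "" and any(
            b.marking == "pos-exclusive" and b.pos == a.pos for b in rivals
        )
        if not (blocked_by_almighty or blocked_by_pos):
            kept.append(a)
    return kept


chart = [
    Analysis((0, 2), "NP", "R1", "pos-exclusive"),
    Analysis((0, 2), "NP", "R2"),
    Analysis((0, 2), "VP", "R3"),
    Analysis((2, 4), "NP", "R4", "almighty"),
    Analysis((2, 4), "VP", "R5"),
    Analysis((2, 4), "NP", "R6", "pos-exclusive"),
]
print([a.rule for a in prune(chart)])  # ['R1', 'R3', 'R4']
```

On span (0, 2), the POS-Exclusive rule R1 removes the unmarked NP analysis R2 but leaves the VP analysis R3, while on span (2, 4) the almighty rule R4 removes both rivals. Fewer surviving analyses per span means fewer branches for later parsing stages to expand.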
