Elimination of left recursion from context-free grammars

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C704S257000

Reexamination Certificate

active

06449589

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to language modeling and parsing algorithms. More particularly, the present invention relates to eliminating left recursion from grammars or other similar models.
Accurate speech recognition by computer requires more than just an acoustic model to select the correct word spoken by the user. In other words, if a speech recognizer must choose or determine which word has been spoken, if all words have the same likelihood of being spoken, the speech recognizer will typically perform unsatisfactorily. A language model provides a method or means of specifying which sequences of words in the vocabulary are possible, or at least most likely.
Computer speech recognition is usually implemented using top-down language processing. Top-down language processing begins with the largest unit of language to be recognized, such as a sentence, and processes it by analyzing it into smaller units, such as phrases, which in turn, are analyzed into yet smaller units, such as words.
One common technique of classifying is to use a formal grammar. The formal grammar defines the sequence of words that the application will allow. One particular type of grammar is known as a “context-free grammar” (CFG), which allows complex linguistic patterns to be specified. However, topdown language processing systems that use a context free grammar do not permit “left recursion” within the grammar. “Left recursion” is present in a CFG when a definition of a category can begin with a smaller phrase of the same category. In the English language, “left recursion” can be illustrated by the following CFG:
S→NP VP [A sentence (S) can consist of a noun phrase (NP) followed by a verb phrase (VP).]
NP→Det N [A noun phrase can consist of a determiner (Det) followed by a noun (N).]
Det→NP's [A determiner can consist of a noun phrase followed by “'s”]
FIG. 1
is a pictorial representation of the rules or rule expressions above. “Left recursion” is present in this partial grammar because the definition of a noun phrase (NP) includes a determiner (DET), the definition of which includes a noun phrase in the left-most position on the right-hand side of the rule expression. Augmented with appropriate additional rule expressions and dictionary entries, this grammar will define such sentences as:
“John sleeps.”
“John's mother sleeps.”
“John's mother's dog sleeps.”
“Left recursion” cannot be directly coded in the grammar and used by a top-down language processing engine. However, it has been known how to transform a CFG having left recursion into Greibach normal form. (Transforming a grammar results in a different grammar that permits the same sequence of words.) Greibach normal form is non-left-recursive. Unfortunately, converting a CFG to Greibach normal form can realize a grammar that is far too large to be used, or sometimes, even completely generated. In other words, the resulting CFG contains or would contain too many rules to define the desired sentences. This situation creates two problems. First, storage capabilities may not exist for storing the complete set of rules of the transformed grammar. Second, processing or traversal of the grammar during speech recognition (or other language processing such as parsing) may take too long.
There thus is a need to improve context-free grammars used by top-down language processing systems such as speech recognizers or parsers. For instance, there is a need to transform a left-recursive context-free grammar into a non-left-recursive grammar without the latter becoming too large.
SUMMARY OF THE INVENTION
A method for transforming a first set of rule expressions forming a first grammar to a second set of rule expressions forming a second grammar includes identifying at least one left-recursive category of the first grammar; and applying a left-corner transform to substantially only the left-recursive rule expressions of the first grammar in forming the second grammar. The method can be executed on a suitable computer wherein instructions are provided on a computer readable medium.
A second broad aspect of the present invention is a method for building a language model by transforming a first set of rule expressions forming a first grammar to a second set of rule expressions forming a second grammar, the method including:
replacing a set of rule expressions of the form,
A→X
1
&bgr;
1
, . . . , A→X
n
&bgr;
n
 with
A→A
-non-left-rec
A
-non-left-rec→
X
1
&bgr;
1
. . . A
-non-left-rec→
X
n
&bgr;
n
where A is a left-recursive category, X
1
. . . X
n
are each any word or non-left-recursive category of the first grammar, &bgr;
1
. . . &bgr;
n
are each a sequence (possibly a null sequence) of words and/or categories of the first grammar, and A-non-left-rec is a newly defined category. The method can be executed on a suitable computer wherein instructions are provided on a computer readable medium.


REFERENCES:
patent: 4984178 (1991-01-01), Hemphill et al.
patent: 5384892 (1995-01-01), Strong
patent: 5475588 (1995-12-01), Schabes et al.
patent: 5907634 (1999-05-01), Brown et al.
patent: 6157912 (2000-12-01), Kneser et al.
“Introduction to Automata theory, Languages, and Computation”, for Addison-Wesley Publishing Company, 1979, pp. 94-99; p. 106.
“Finite-state Approximation of Constraint-based Grammars using Left-corner Grammar Transforms”, by Mark Johnson, presented at COLING-ACL '98, Montreal, Quebec, Canada, Aug. 10-14, 1998, pp. 619-623.
“An Optimal Tabular Parsing Algorithm”, by Mark-Jan Nederhof, presented at 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, Jun. 27-30, 1994, pp. 117-124.
“Recursion”, Nuance Speech Recognition System Version 5, Developer's Manual. for Nuance Communications, 1996, pp. 10-11.

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Elimination of left recursion from context-free grammars does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Elimination of left recursion from context-free grammars, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Elimination of left recursion from context-free grammars will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2865144

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.