Grammar fragment acquisition using syntactic and semantic...

: 1998-12-21
: 2001-01-09
: Šmits, Tālivaldis I. (Department: 2741)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Recognition

: C704S251000, C704S252000, C704S231000
: Reexamination Certificate
: active
: 06173261
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of Invention
This invention relates to the automated acquisition of grammar fragments for recognizing and understanding spoken language.
2. Description of Related Art
In speech-understanding systems, the language models for recognition and understanding are traditionally designed separately. Furthermore, while there is a large amount of literature on automatically learning language models for recognition, most understanding models are designed manually and require significant expertise to develop.
In general, a spoken language understanding task can have a very complex semantic representation. A useful example is a call-routing scenario, where the machine action transfers a caller to a person or machine that can address and solve problems based on the user's response to an open-ended prompt, such as “How may I help you?” These spoken language understanding tasks associated with call-routing are addressed in U.S. Pat. No. 5,794,193, “Automated Phrase Generation”, and U.S. Pat. No. 5,675,707, “Automated Call Routing System”, both filed on Sep. 15, 1995, which are incorporated herein by reference in their entireties. Furthermore, such methods can be embedded within a more complex task, as disclosed in U.S. patent application Ser. No. 08/943,944, filed Oct. 3, 1997, which is also hereby incorporated by reference in its entirety.
While there is a vast amount of literature on syntactic structure and parsing, much of that work involves a complete analysis of a sentence. It is well known that most natural language utterances cannot be completely analyzed by these methods due to lack of coverage. Thus, many approaches use grammar fragments in order to provide a localized analysis on portions of the utterance where possible, and to treat the remainder of the utterance as background. Typically, these grammar fragments are defined manually and involve a large amount of expertise.
In an attempt to solve some of these problems, U.S. patent application Ser. Nos. 08/960,289 and 08/960,291, both filed Oct. 29, 1997 and hereby incorporated by reference in their entireties, disclose how to advantageously and automatically acquire sequences of words, or “superwords”, and exploit them for both recognition and understanding. This is advantageous because longer units (e.g., area codes) are both easier to recognize and have sharper semantics.
While superwords (or phrases) have been shown to be very useful, many acquired phrases are merely mild variations of each other (e.g., “charge this call to” and “bill this to”). For example, U.S. patent application Ser. No. 08/893,888, filed Jul. 8, 1997 and incorporated herein by reference in its entirety, discloses how to automatically cluster such phrases by combining phrases with similar wordings and semantic associations. These meaningful phrase clusters were then represented as grammar fragments via traditional finite state machines. This clustering of phrases is advantageous for two reasons: first, statistics of similar phrases can be pooled, thereby providing more robust estimation; and second, the clusters provide robustness to non-salient recognition errors, such as “dialed a wrong number” versus “dialed the wrong number”.
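As an illustration of how a phrase cluster might be held as a finite state machine, the following minimal sketch builds a trie-shaped deterministic acceptor from the two example phrases above. The function names `build_fragment` and `accepts` are hypothetical and are not taken from the referenced application; a production system would likely minimize the machine so that the shared suffix "wrong number" uses common states.

```python
def build_fragment(phrases):
    """Build a trie-shaped acceptor: state -> {word: next_state}.

    State 0 is the start state. Returns (transitions, final_states).
    """
    transitions = {0: {}}
    finals = set()
    for phrase in phrases:
        state = 0
        for word in phrase.split():
            if word not in transitions[state]:
                new_state = len(transitions)  # next unused state id
                transitions[state][word] = new_state
                transitions[new_state] = {}
            state = transitions[state][word]
        finals.add(state)
    return transitions, finals


def accepts(fragment, utterance):
    """Return True if the whole word sequence is a path to a final state."""
    transitions, finals = fragment
    state = 0
    for word in utterance.split():
        state = transitions[state].get(word)
        if state is None:
            return False
    return state in finals


# One fragment covering both variants of the phrase.
fragment = build_fragment(["dialed a wrong number", "dialed the wrong number"])
print(accepts(fragment, "dialed the wrong number"))
print(accepts(fragment, "dialed wrong number"))
```

Because both variants lead to a final state of the same fragment, a recognizer that confuses "a" with "the" still yields the same fragment, which is the robustness to non-salient errors described above.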
However, in order to utilize these grammar fragments in language models for both speech recognition and understanding, they must be both syntactically and semantically coherent. To achieve this goal, an enhanced clustering mechanism exploiting both syntactic and semantic associations of phrases is required.
SUMMARY OF THE INVENTION
A method and apparatus for clustering phrases into grammar fragments is provided. The method and apparatus exploit succeeding words, preceding words and semantics associated with each utterance, in order to generate grammar fragments consisting of similar phrases. Distances between phrases may be calculated based on the distribution of preceding and succeeding words and call-types.
In at least one embodiment, the goal may be to generate a collection of grammar fragments each representing a set of syntactically and semantically similar phrases. First, phrases observed in the training set may be selected as candidate phrases. Each candidate phrase may have three associated probability distributions: of succeeding contexts, of preceding contexts, and of associated semantic actions. The similarity between candidate phrases may be measured by applying the Kullback-Leibler distance to these three probability distributions. Candidate phrases that are close in all three distances may then be clustered into a grammar fragment. Salient sequences of these fragments may then be automatically acquired and exploited by a spoken language understanding module to determine a call classification.
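The clustering idea summarized above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the patented procedure: the observations, the call-type labels, the additive smoothing constant `alpha`, the symmetrized form of the Kullback-Leibler distance, the single-link grouping, and the threshold value are all hypothetical choices made for the example.

```python
import math
from collections import Counter

# Hypothetical training observations:
# (preceding word, phrase, succeeding word, call-type)
OBSERVATIONS = [
    ("to", "charge this call to", "my", "calling_card"),
    ("to", "charge this call to", "my", "calling_card"),
    ("and", "charge this call to", "a", "third_number"),
    ("to", "bill this to", "my", "calling_card"),
    ("to", "bill this to", "my", "calling_card"),
    ("and", "bill this to", "a", "third_number"),
    ("a", "wrong number", "and", "billing_credit"),
    ("the", "wrong number", "and", "billing_credit"),
    ("the", "wrong number", "please", "billing_credit"),
]


def build_profiles(observations):
    """Count preceding words, succeeding words and call-types per phrase."""
    profiles = {}
    for prev, phrase, nxt, call_type in observations:
        prof = profiles.setdefault(
            phrase, {"prev": Counter(), "next": Counter(), "type": Counter()})
        prof["prev"][prev] += 1
        prof["next"][nxt] += 1
        prof["type"][call_type] += 1
    return profiles


def distribution(counts, vocabulary, alpha=0.5):
    """Additively smoothed distribution (assumed; avoids zero probabilities)."""
    total = sum(counts.get(v, 0) for v in vocabulary) + alpha * len(vocabulary)
    return {v: (counts.get(v, 0) + alpha) / total for v in vocabulary}


def symmetric_kl(p, q):
    """Symmetrized Kullback-Leibler distance: D(p||q) + D(q||p)."""
    return sum((p[v] - q[v]) * math.log(p[v] / q[v]) for v in p)


def distances(profiles, a, b):
    """Return the three distances between phrases a and b."""
    result = {}
    for key in ("prev", "next", "type"):
        vocab = sorted({v for prof in profiles.values() for v in prof[key]})
        p = distribution(profiles[a][key], vocab)
        q = distribution(profiles[b][key], vocab)
        result[key] = symmetric_kl(p, q)
    return result


def cluster(profiles, threshold):
    """Group phrases whose three distances all fall below the threshold."""
    phrases = sorted(profiles)
    parent = {p: p for p in phrases}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for i, a in enumerate(phrases):
        for b in phrases[i + 1:]:
            if all(d < threshold for d in distances(profiles, a, b).values()):
                parent[find(b)] = find(a)
    groups = {}
    for p in phrases:
        groups.setdefault(find(p), []).append(p)
    return sorted(groups.values())


print(cluster(build_profiles(OBSERVATIONS), threshold=0.5))
```

In this toy data, "charge this call to" and "bill this to" share identical context and call-type counts, so all three distances are zero and the two phrases fall into one candidate fragment, while "wrong number" stays separate. Requiring closeness in all three distances, rather than in a single combined score, mirrors the requirement that fragments be both syntactically and semantically coherent.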

REFERENCES:
patent: 4903305 (1990-02-01), Gillick et al.
patent: 5099425 (1992-03-01), Kanno et al.
patent: 5457768 (1995-10-01), Tsuboi et al.
patent: 5675707 (1997-10-01), Gorin et al.
patent: 5794193 (1998-08-01), Gorin
patent: 5839106 (1998-11-01), Bellegarda
patent: 5860063 (1999-01-01), Gorin et al.
patent: 6023673 (2000-02-01), Bakis et al.
Weinstein et al, “Sequential Algorithms Based on Kullback-Liebler Information Measure and their Application to FIR System Identification”.*

Affiliated with

Arai Kazuhiro

Inventor

Gorin Allen Louis

Inventor

Riccardi Giuseppe

Inventor

Wright Jeremy Huntley

Inventor

Also associated with

AT&T Corp

Corporate Assignee

Nolan Daniel A.

Examiner

Šmits Tālivaldis I.

Examiner
