Method and apparatus for generating deterministic...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate


Details

Classification: C704S255000, C706S017000, C706S020000
Type: Reexamination Certificate
Status: active
Patent number: 06266634

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of Invention
This invention relates generally to automatic speech recognition. In particular, this invention relates to determinizing non-deterministic weighted finite-state automata used in automatic speech recognition. More specifically, this invention relates to generating approximate weighted finite-state automata from non-deterministic weighted finite-state automata.
2. Description of Related Art
One of the central problems in automatic speech recognition (ASR) is to reduce the sizes of the lattices that represent hypotheses of spoken utterances. As shown in FIG. 1, automatic speech recognition can be viewed as a processing pipeline or cascade. In each step of the processing cascade, one or two lattices are input and composed to produce an output lattice. In automatic speech recognition, the term "lattice" denotes a directed and labeled graph, which is possibly weighted. In each lattice, there is typically a designated start node "s" and a designated final node "t". Each possible pathway through the lattice from the start node s to the final node t induces a hypothesis based on the arc labels between each pair of nodes in the path. For example, in a word lattice, the arc labels are words and the various paths between the start node s and the final node t form sentences.
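To make the lattice structure concrete, the following minimal Python sketch (illustrative only; the class and names are hypothetical, not from the patent) represents such a directed, labeled, weighted graph and enumerates the hypotheses induced by its s-to-t paths:

```python
from collections import defaultdict

class Lattice:
    """A directed, labeled, acyclic graph with start node s and final node t.

    Arcs may carry weights (e.g., negative log probabilities), so each
    s-to-t path induces a weighted hypothesis.
    """

    def __init__(self, start, final):
        self.start = start
        self.final = final
        self.arcs = defaultdict(list)  # node -> [(label, weight, next_node)]

    def add_arc(self, src, label, weight, dst):
        self.arcs[src].append((label, weight, dst))

    def hypotheses(self):
        """Yield (labels, total_weight) for every path from s to t."""
        stack = [(self.start, [], 0.0)]
        while stack:
            node, labels, weight = stack.pop()
            if node == self.final:
                yield labels, weight
            for label, w, nxt in self.arcs[node]:
                stack.append((nxt, labels + [label], weight + w))

# In a word lattice the labels are words, so each s-to-t path is a sentence.
lattice = Lattice("s", "t")
lattice.add_arc("s", "the", 0.1, "n1")
lattice.add_arc("n1", "cat", 0.4, "t")
lattice.add_arc("n1", "cap", 0.9, "t")
for words, weight in lattice.hypotheses():
    print(" ".join(words), weight)  # prints "the cap" 1.0 and "the cat" 0.5
```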
As shown in FIG. 1, an automatic speech recognition system 100 includes a signal processing subsystem 110, an acoustic model lattice 120, a phonetic recognition subsystem 130, a lexicon lattice 140, a word recognition subsystem 150, a grammar lattice 160, and a task recognition subsystem 170. In operation, uttered speech is input via a microphone (not shown), which converts the sound waves of the uttered speech into an electronic speech signal. The electronic speech signal is input to the signal processing subsystem on a speech signal input line 105. The signal processing subsystem 110 digitizes the electronic speech signal to generate a feature vector lattice 115. The feature vector lattice 115 is a lattice of acoustic feature vectors. The feature vector lattice 115 is input along with the acoustic model lattice 120 to the phonetic recognition subsystem 130. The acoustic model lattice 120 represents a set of acoustic models and is applied to transform the feature vector lattice 115 into a phone lattice. Each node of the phone lattice represents a spoken sound, such as, for example, the vowel /e/ in "bed".
The phone lattice 135 is input along with the lexicon lattice 140 into the word recognition subsystem 150. The lexicon lattice 140 describes different pronunciations of various words and transforms the phone lattice 135 into a word lattice 155. The word lattice 155 is then input, along with the grammar lattice 160, into the task recognition subsystem 170. The grammar lattice 160 represents task-specific information and is used to extract the most likely sequence of words from the word lattice 155. Thus, the task recognition subsystem 170 uses the grammar lattice 160 to extract the most likely sentence from the word lattice 155. The most likely sequence is output as the recognized text 175.
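The cascade just described can be summarized in a short sketch. Every function below is a hypothetical stand-in for the corresponding subsystem of FIG. 1; none of the names come from the patent:

```python
def signal_processing(speech_signal):
    # Hypothetical stand-in for subsystem 110: digitize the signal and
    # produce the feature vector lattice 115.
    return f"features({speech_signal})"

def compose(lattice, model):
    # Hypothetical stand-in for the generalized composition each
    # recognition subsystem performs on its two input lattices.
    return f"compose({lattice}, {model})"

def best_path(lattice):
    # Hypothetical stand-in for extracting the most likely hypothesis.
    return f"best_path({lattice})"

def recognize(speech_signal, acoustic_model, lexicon, grammar):
    features = signal_processing(speech_signal)  # feature vector lattice 115
    phones = compose(features, acoustic_model)   # phone lattice 135
    words = compose(phones, lexicon)             # word lattice 155
    sentences = compose(words, grammar)          # task-constrained lattice
    return best_path(sentences)                  # recognized text 175
```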
In particular, one conventional method of implementing automatic speech recognition forms each of the acoustic model lattice 120, the lexicon lattice 140 and the grammar lattice 160 as a finite-state transducer. Thus, each of the phonetic recognition subsystem 130, the word recognition subsystem 150, and the task recognition subsystem 170 performs a generalized composition operation between its input finite-state transducers. In addition, the signal processing subsystem 110 outputs the feature vector lattice 115 as a finite-state transducer.
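For illustration, a generalized composition of epsilon-free weighted finite-state transducers can be sketched as follows. This is the standard textbook construction over the tropical (min, +) semiring, not the patent's implementation; the representation (dicts of arc lists, a single final state per transducer) is an assumption:

```python
def compose(t1, t2):
    """Compose two epsilon-free weighted finite-state transducers.

    Each transducer is (arcs, start, final), where arcs maps a state to a
    list of (in_label, out_label, weight, next_state) tuples and a single
    final state is assumed. Output labels of t1 are matched against input
    labels of t2, and weights add (tropical semiring).
    """
    arcs1, start1, final1 = t1
    arcs2, start2, final2 = t2
    arcs = {}
    stack, seen = [(start1, start2)], {(start1, start2)}
    while stack:
        q1, q2 = stack.pop()
        out = arcs.setdefault((q1, q2), [])
        for i1, o1, w1, n1 in arcs1.get(q1, []):
            for i2, o2, w2, n2 in arcs2.get(q2, []):
                if o1 == i2:  # t1's output feeds t2's input
                    out.append((i1, o2, w1 + w2, (n1, n2)))
                    if (n1, n2) not in seen:
                        seen.add((n1, n2))
                        stack.append((n1, n2))
    return arcs, (start1, start2), (final1, final2)
```

Each state of the result pairs one state from each operand, which is why composed lattices can grow multiplicatively and why the size reductions discussed next matter.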
FIGS. 2A-2C show more generalized forms of the automatic speech recognition transduction cascade. In particular, in automatic speech recognition, the finite-state transducers are weighted finite-state automata, which represent and compute membership information on weighted languages such as, for example, sets of words or sentences. The size of a weighted finite-state automaton is critical to automatic speech recognition. Conventionally, analogs of classical automata determinization and minimization are used to reduce the size of weighted finite-state automata. However, not all weighted finite-state automata can be determinized. Furthermore, for those that can be determinized, the equivalent deterministic weighted finite-state automaton can be exponentially larger. The nature of automatic speech recognition, however, appears to constrain the weighted finite-state automata so that determinization is not only possible, but often produces a deterministic weighted finite-state automaton that is smaller than the equivalent non-deterministic weighted finite-state automaton.
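Weighted determinization generalizes the classical subset construction: a determinized state is a set of (state, residual weight) pairs, and the minimum weight for each label is pushed onto the outgoing arc. The following simplified Python sketch over the tropical semiring, in the spirit of the construction cited below, is illustrative only:

```python
from collections import defaultdict

def determinize(arcs, start, finals):
    """Weighted subset construction over the tropical (min, +) semiring.

    arcs maps state -> [(label, weight, next_state)]; finals is a set of
    final states. A determinized state is a frozenset of (state, residual)
    pairs: the residual is the weight left over after the minimum has been
    pushed onto the outgoing arc. The construction terminates only when
    the input automaton is in fact determinizable.
    """
    start_subset = frozenset([(start, 0.0)])
    det_arcs = defaultdict(list)
    det_finals = set()
    stack, seen = [start_subset], {start_subset}
    while stack:
        subset = stack.pop()
        if any(q in finals for q, _ in subset):
            det_finals.add(subset)
        by_label = defaultdict(dict)  # label -> {next_state: min weight}
        for q, residual in subset:
            for label, w, nxt in arcs.get(q, []):
                total = residual + w
                prev = by_label[label].get(nxt)
                if prev is None or total < prev:
                    by_label[label][nxt] = total
        for label, moves in by_label.items():
            w_min = min(moves.values())  # weight emitted on the new arc
            nxt_subset = frozenset((nxt, w - w_min) for nxt, w in moves.items())
            det_arcs[subset].append((label, w_min, nxt_subset))
            if nxt_subset not in seen:
                seen.add(nxt_subset)
                stack.append(nxt_subset)
    return det_arcs, start_subset, det_finals
```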
One method for determinizing and minimizing non-deterministic weighted finite-state automata is described in "Finite-State Transducers in Language and Speech Processing", M. Mohri, Computational Linguistics 23, 1997 (Mohri), herein incorporated by reference in its entirety. This method for determinizing and minimizing non-deterministic weighted finite-state automata generates significant average size reductions for the equivalent deterministic weighted finite-state automata. However, this method does not reduce all non-deterministic weighted finite-state automata, presenting a problem for real-time automatic speech recognition.
SUMMARY OF THE INVENTION
This invention provides a method and apparatus that generates deterministic weighted finite-state automata having improved size reductions.
This invention further provides a method and apparatus capable of generating reduced deterministic weighted finite-state automata that cannot be reduced using current techniques.
This invention also provides a method and apparatus that generate approximate weighted finite-state automata.
This invention additionally provides a method and apparatus for performing automatic speech recognition using the approximate weighted finite-state automata.
Approximate weighted finite-state automata are devices that represent and compute membership information on weighted languages, for example, sets of words or sentences. Approximate weighted finite-state automata are approximate in that the language represented by an approximate weighted finite-state automaton can differ from the language represented by the weighted finite-state automaton from which it is generated. Approximate weighted finite-state automata can be used to represent approximations to various automatic speech recognition models. The approximate weighted finite-state automata require less space, i.e., computer memory, than the corresponding weighted finite-state automata from which they are generated. At the same time, the approximations used to generate the approximate weighted finite-state automata do not degrade the accuracy or precision of the automatic speech recognition process.
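This excerpt does not specify how the approximation is computed. Purely as a hypothetical illustration of how a small change to the represented language can collapse states, one could quantize the residual weights in the determinization sketch above so that subsets whose residuals agree within a tolerance are identified as a single state:

```python
def approximate_key(subset, tol=0.1):
    """Hypothetical illustration only: quantize residual weights so that
    determinization subsets differing by less than tol per residual collapse
    to the same state. The patent's actual approximation scheme is not
    described in this excerpt."""
    return frozenset((q, round(residual / tol)) for q, residual in subset)
```

Using such a key as the state identity trades a (ideally negligible) perturbation of path weights for fewer distinct subset states, and hence a smaller automaton.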
It should also be appreciated that other paradigms for the automatic speech recognition process benefit from the size reduction due to finite-state automata determinization and minimization. These other automatic speech recognition paradigms include using hidden Markov models and n-gram language models to implement the various stages of the automatic speech recognition process. Thus, the approximate weighted finite-state automata will also be applicable to these models. In short, any bounded memory stochastic representation can be implemented using a weighted finite-state automaton, which can in turn be approximated with the approximate weighted finite-state automaton of this invention.
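For example, a bigram language model conditions only on the previous word, so it is a bounded-memory stochastic model and maps directly onto a weighted finite-state automaton: one state per conditioning word and one arc per bigram, weighted by its negative log probability. A hypothetical sketch:

```python
import math
from collections import defaultdict

def bigram_to_wfsa(bigram_probs):
    """Encode a bigram model as a weighted finite-state acceptor.

    bigram_probs maps (previous_word, word) -> probability. Each state is a
    conditioning word and each arc carries -log p, so the weights along a
    path sum to the negative log probability of the word sequence. Names
    here are hypothetical; this is a sketch, not the patent's construction.
    """
    arcs = defaultdict(list)  # state -> [(label, weight, next_state)]
    for (prev, word), p in bigram_probs.items():
        arcs[prev].append((word, -math.log(p), word))
    return arcs

# "<s>" is a toy start-of-sentence state.
lm = bigram_to_wfsa({("<s>", "the"): 0.5, ("the", "cat"): 0.2, ("the", "cap"): 0.1})
```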
These and other features and advantages of this invention are described in or are apparent from the following detailed description of the preferred embodiments.


