Method for generating candidate word strings in speech...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S251000, C704S252000, C704S257000

active

06760702

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the field of speech recognition and, more particularly, to a node-based method for generating candidate word strings in speech recognition without expanding the strings.
2. Description of Related Art
To achieve higher recognition accuracy, a speech recognition module in a current speech recognition system does not output only a single recognition result; instead, it provides a plurality of possible results so that a subsequent process may select the best one therefrom.
Therefore, a speech recognition module must provide many possible results to the subsequent process. Accordingly, the generation of a plurality of candidate word strings from a speech signal for the subsequent process is a major concern in developing a speech recognition system.
U.S. Pat. No. 5,241,619 discloses a method for searching candidate word strings in which N candidate word strings are maintained during a matching process of speech signals and words. The N candidate word strings are obtained after the matching process. In such a method, the N candidate word strings maintained previously have to be expanded and modified for each time frame. If there are M words in the vocabulary, as shown in FIG. 6, M new candidate word strings are generated each time a candidate word string is expanded. The best N candidate word strings are then selected from all expanded candidate word strings as the basis for the expansion in the next time frame. In this manner, a large memory space is required to store the expanded candidate word strings, and a sort must be performed for each time frame to maintain the N possible candidate word strings.
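By way of illustration only, the per-frame expansion cost described above can be sketched in Python. This is a simplified toy, not the procedure of U.S. Pat. No. 5,241,619: it assumes that every time frame appends exactly one word, and the function name `expand_n_best` and the `frame_scores` input (one dictionary of word scores per frame) are hypothetical.

```python
import heapq

def expand_n_best(vocab, frame_scores, n):
    """Toy frame-by-frame N-best search (illustrative assumption only).

    vocab:        list of the M words in the vocabulary
    frame_scores: one dict per time frame, mapping word -> score
    n:            number of candidate word strings to keep
    Returns the kept strings and the total number of strings generated.
    """
    kept = [(0.0, ())]          # (accumulated score, word string)
    generated = 0
    for scores in frame_scores:
        expanded = []
        for total, words in kept:
            # Each kept string is expanded by all M vocabulary words.
            for w in vocab:
                expanded.append((total + scores[w], words + (w,)))
        generated += len(expanded)
        # A selection (partial sort) is needed at every time frame.
        kept = heapq.nlargest(n, expanded)
    return kept, generated
```

With N strings kept and M words in the vocabulary, each frame after the first builds N x M strings before pruning back to N, which is the memory and sorting burden noted above.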
Another approach to searching for candidate word strings is implemented in a two-stage design. In the first stage, a modified Viterbi algorithm is employed to generate a word lattice from the input speech signal. In the second stage, a stack search is used to generate the candidate word strings by tracing back the word lattice generated in the first stage. A detailed description of such a method can be found in U.S. Pat. No. 5,805,772, entitled “Systems, Methods and Architecture of Manufacture for Performing High Resolution N-best String Hypothesization” and “A Tree-trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition” by F. K. Soong and E. F. Huang, ICASSP'91, pp. 705-708, 1991, which are hereby incorporated by reference into this patent application. As is known, this method must continuously perform stack operations, such as push and pop, to expand word strings in order to obtain possible candidate word strings. This method therefore inevitably spends much time expanding candidate word strings.
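The two-stage idea may be sketched as follows, purely as an illustration. The sketch is a generic best-first backward search over a small word lattice, written in the spirit of a tree-trellis search; it is not the procedure of U.S. Pat. No. 5,805,772. The node layout `(word, start, end, score)`, the function name `n_best_stack_search`, and the additive scoring are assumptions introduced here.

```python
import heapq

def n_best_stack_search(nodes, last_frame, n):
    """nodes: list of (word, start_frame, end_frame, score) tuples (assumed layout)."""
    # Stage 1 (stand-in for the lattice pass): best prefix score ending at each node.
    fwd = {}
    for node in sorted(nodes, key=lambda x: x[1]):
        _, start, _, score = node
        if start == 0:
            fwd[node] = score
        else:
            prev = [fwd[p] for p in fwd if p[2] + 1 == start]
            if prev:
                fwd[node] = max(prev) + score

    # Stage 2: backward search driven by a stack (here a priority queue).
    # Entry: (-estimate, rest, path); rest = summed scores of path[1:],
    # estimate = fwd[path[0]] + rest = best full-string score through this suffix.
    stack, pushes, results = [], 0, []
    for node in fwd:
        if node[2] == last_frame:
            heapq.heappush(stack, (-fwd[node], 0, (node,)))
            pushes += 1
    while stack and len(results) < n:
        neg_est, rest, path = heapq.heappop(stack)
        head = path[0]
        if head[1] == 0:                      # reached frame 0: a complete string
            results.append((-neg_est, [w for w, _, _, _ in path]))
            continue
        for p in fwd:                         # expand by every adjoining predecessor
            if p[2] + 1 == head[1]:
                new_rest = rest + head[3]
                heapq.heappush(stack, (-(fwd[p] + new_rest), new_rest, (p,) + path))
                pushes += 1
    return results, pushes
```

Every partial string is pushed and later popped before it can be extended, so the number of stack operations grows with the number of strings expanded, which is the time cost noted above.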
Still another method for searching candidate word strings is implemented in a two-stage design similar to the above method. In the first stage, 408 Mandarin syllables are used as recognition units for generating a syllable lattice. In the second stage, the N-best syllables are selected for a back-tracing operation using the stack search in order to generate a plurality of candidate word strings. A detailed description of such a method can be found in “An Efficient Algorithm for Syllable Hypothesization in Continuous Mandarin Speech Recognition” by E. F. Huang and H. C. Wang, IEEE Transactions on Speech and Audio Processing, pp. 446-449, 1994, which is incorporated herein by reference.
A further method for searching candidate word strings is also implemented in a two-stage design, in which a word graph algorithm is employed to generate the word graph and a best word string in the first stage. The detailed description can be found in “A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition” by S. Ortmanns, H. Ney, and X. Aubert, Computer Speech and Language, pp. 43-72, 1997, which is incorporated herein by reference. In the second stage, the searching for candidate word strings is performed on the nodes of the best word string. The output is recorded in a tree structure for saving memory space. A detailed description of this method can be found in U.S. Pat. No. 5,987,409, entitled “Method of and Apparatus for Deriving a Plurality of Sequences of Words From a Speech Signal”, which is incorporated herein by reference.
Basically, the above methods perform searching operations based on the expansion of word strings. Such an operation requires a large memory space to store the word strings and spends a lot of time expanding them. Therefore, it is desirable to improve the search for candidate word strings.
SUMMARY OF THE INVENTION
The object of the present invention is to provide a method for quickly generating candidate word strings without expanding the word strings. This method comprises the steps of: (A) determining an associated maximum string score for each node; (B) sorting all nodes by their associated maximum string scores to group the nodes with the same string score into the same node set; and (C) selecting the node sets with relatively high string scores generated in step (B), so as to connect the nodes by their starting time frame and ending time frame, thereby generating the candidate word strings.
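Steps (A) to (C) may be sketched as follows, as a minimal illustration under stated assumptions and not as the claimed implementation. The summary does not say how the maximum string score is computed; the sketch assumes additive node scores and obtains it as the best prefix score plus the best suffix score. Where a node set does not span the whole utterance, the sketch fills the remainder by following the best-scoring neighbours at the frame boundaries, which is likewise an assumption. The names `Node` and `candidate_strings` are hypothetical, and integer scores are used so that equal scores compare exactly.

```python
from collections import defaultdict
from typing import NamedTuple

class Node(NamedTuple):
    word: str
    start: int   # starting time frame
    end: int     # ending time frame
    score: int   # node's own score (hypothetical, additive)

def candidate_strings(nodes, last_frame, top_k):
    # Step (A): maximum score of any complete string passing through each node.
    fwd, back = {}, {}            # best prefix score incl. node, best predecessor
    for n in sorted(nodes, key=lambda x: x.start):
        if n.start == 0:
            fwd[n], back[n] = n.score, None
            continue
        preds = [p for p in fwd if p.end + 1 == n.start]
        if preds:
            best = max(preds, key=fwd.get)
            fwd[n], back[n] = fwd[best] + n.score, best
    bwd, nxt = {}, {}             # best suffix score after node, best successor
    for n in sorted(nodes, key=lambda x: x.end, reverse=True):
        if n.end == last_frame:
            bwd[n], nxt[n] = 0, None
            continue
        succs = [s for s in bwd if s.start == n.end + 1]
        if succs:
            best = max(succs, key=lambda s: s.score + bwd[s])
            bwd[n], nxt[n] = best.score + bwd[best], best
    total = {n: fwd[n] + bwd[n] for n in nodes if n in fwd and n in bwd}

    # Step (B): group nodes having the same maximum string score into one node set.
    node_sets = defaultdict(list)
    for n, s in total.items():
        node_sets[s].append(n)

    # Step (C): take the highest-scoring node sets and connect nodes whose
    # ending frame adjoins the next node's starting frame.
    results = []
    for s in sorted(node_sets, reverse=True)[:top_k]:
        path = [min(node_sets[s], key=lambda x: x.start)]
        while back[path[0]] is not None:
            path.insert(0, back[path[0]])
        while nxt[path[-1]] is not None:
            path.append(nxt[path[-1]])
        results.append((s, [n.word for n in path]))
    return results
```

In this sketch each node is scored once and each node set yields one string, so no pool of partial word strings is expanded or re-sorted per time frame.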
Other objects, advantages, and novel features of the invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.


REFERENCES:
patent: 5241619 (1993-08-01), Schwartz et al.
patent: 5719997 (1998-02-01), Brown et al.
patent: 5805772 (1998-09-01), Chou et al.
patent: 5884259 (1999-03-01), Bahl et al.
patent: 5987409 (1999-11-01), Tran et al.
patent: 6453315 (2002-09-01), Weissman et al.
Frank K. Soong and Eng-Fong Huang; “A Tree-Trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition”; 1991 IEEE; pp. 705-708.
Eng-Fong Huang and Hsiao-Chuan Wang; “An Efficient Algorithm for Syllable Hypothesization in Continuous Mandarin Speech Recognition”; 1994 IEEE; pp. 446-449.
Stefan Ortmanns, Hermann Ney and Xavier Aubert; “A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition”; 1997; Computer Speech and Language; pp. 43-72.
