Task-constrained connected speech recognition of propagation of

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Patent

Details

704243, 704254, 704252, 704255, G10L 500

Patent

active

058192220

DESCRIPTION:

BRIEF SUMMARY
RELATED APPLICATIONS

This application is related to commonly assigned copending applications: Ser. No. 08/094,268, filed Jul. 21, 1993, abandoned in favor of CIP 08/530,157, filed Nov. 21, 1995, and Ser. No. 08/525,730, filed Dec. 19, 1995.


BACKGROUND OF THE INVENTION

1. Field of the Invention
The present invention relates to connected speech recognition and in particular to a method and apparatus for applying grammar constraints to connected speech recognition. The present invention is of particular interest in the area of task-constrained connected word recognition where the task, for example, might be to recognise one of a set of account numbers or product codes.
2. Related Art
It is common in speech recognition processing to input speech data, typically in digital form, to a so-called front-end processor, which derives from the stream of input speech data a more compact, perceptually significant set of data referred to as a front-end feature set or vector. For example, speech is typically input via a microphone, sampled (e.g. at 8 kHz), digitised and segmented into frames of length 10-20 ms and, for each frame, a set of coefficients is calculated. In speech recognition, the speaker is normally assumed to be speaking one of a set of words or phrases. A stored representation of the word or phrase, known as a template or model, comprises a reference feature matrix of that word, previously derived, in the case of speaker independent recognition, from multiple speakers. The input feature vector is matched with the model and a measure of similarity between the two is produced.
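The front-end steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which coefficients are calculated, so log energy and zero-crossing rate stand in for them, and the `distance` function is a deliberately simple similarity measure that assumes both feature sequences have the same number of frames. All function names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 8000   # example sampling rate mentioned in the text
FRAME_MS = 20           # upper end of the 10-20 ms frame length range

def split_into_frames(samples, sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Cut a sample sequence into consecutive, non-overlapping frames."""
    frame_len = sample_rate_hz * frame_ms // 1000   # 160 samples at 8 kHz and 20 ms
    n_frames = len(samples) // frame_len            # a trailing partial frame is dropped
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

def frame_features(frame):
    """Toy per-frame coefficients (assumption: stand-ins for the real feature set)."""
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)           # small offset avoids log(0) on silence
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zero_crossing_rate = crossings / (len(frame) - 1)
    return [log_energy, zero_crossing_rate]

def distance(features_a, features_b):
    """Euclidean distance between two equal-length feature sequences.
    Smaller values mean the input is more similar to the template."""
    return math.sqrt(sum((x - y) ** 2
                         for fa, fb in zip(features_a, features_b)
                         for x, y in zip(fa, fb)))

# One second of a 440 Hz tone as stand-in "speech".
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ) for n in range(SAMPLE_RATE_HZ)]
frames = split_into_frames(signal)
features = [frame_features(f) for f in frames]
```

A practical recogniser would align input and template in time before comparing them; the fixed frame-by-frame comparison here only shows where the similarity measure fits.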
Speech recognition (whether human or machine) is susceptible to error and may result in the misrecognition of words. If a word or phrase is incorrectly recognised, the speech recogniser may then offer another attempt at recognition, which may or may not be correct.
Various ways have been suggested for processing speech to select the best alternative matches between input speech and stored speech templates or models. In isolated word recognition systems, the production of alternative matches is fairly straightforward: each word is a separate `path` in a transition network representing the words to be recognised and the independent word paths join only at the final point in the network. Ordering all the paths exiting the network in terms of their similarity to the stored templates or the like will give the best and alternative matches.
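For the isolated word case, ordering the exiting paths is all that is needed. The sketch below assumes each word path has already been given a distance-like score (lower means a closer match); the scores and the function name `rank_word_paths` are made up for illustration.

```python
def rank_word_paths(path_scores, n_best=3):
    """Order word paths by score, best first.
    Assumption: scores are distance-like, so lower means more similar."""
    ranked = sorted(path_scores.items(), key=lambda item: item[1])
    return ranked[:n_best]

# Hypothetical scores for four independent word paths at the network exit.
scores = {"one": 12.4, "nine": 9.1, "five": 10.7, "four": 15.0}
ranked = rank_word_paths(scores)
best_word = ranked[0][0]
alternatives = [word for word, _ in ranked[1:]]
```

Here `best_word` is the top match and `alternatives` holds the next candidates that could be offered if the first attempt is rejected.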
In most connected recognition systems, and in some isolated word recognition systems based on connected recognition techniques, however, it is not always possible to recombine all the paths at the final point of the network and thus neither the best nor alternative matches are directly obtainable from the information available at the exit point of the network. One solution to the problem of producing a best match is discussed in "Token Passing: a Simple Conceptual Model for Connected Speech Recognition Systems" by S. J. Young, N. H. Russell and J. H. S. Thornton, Cambridge University Engineering Department 1989, which relates to passing packets of information, known as tokens, through a transition network designed to represent the expected input speech. In general terms "network" includes directed acyclic graphs (DAGs) and trees. A DAG is a network with no cycles and a tree is a network in which the only meeting of paths occurs conceptually right at the end of the network. A token contains information relating to the partial path travelled as well as an accumulated score indicative of the degree of similarity between the input speech and the portion of the network processed thus far.
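The two ingredients named above, a token and an acyclic network, can be pictured with a minimal sketch. It is not the data layout of Young et al or of the present invention: the `Token` class, the `NETWORK` dictionary, the node labels and the score values are all assumptions chosen for illustration, and scores are treated as distance-like (lower is better).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    path: tuple = ()      # partial path: labels of the nodes travelled so far
    score: float = 0.0    # accumulated score for the portion of the network processed

    def extend(self, node, node_score):
        """Return the token issued after travelling through `node`."""
        return Token(self.path + (node,), self.score + node_score)

# Hypothetical transition network: each node maps to its successor nodes.
NETWORK = {
    "start": ["one", "two"],
    "one": ["three"],
    "two": ["three"],
    "three": [],
}

def is_acyclic(network):
    """Depth-first check that no path leads back to a node already on it."""
    done, on_path = set(), set()

    def visit(node):
        if node in on_path:       # reached a node on the current path: a cycle
            return False
        if node in done:          # already fully explored, known to be cycle-free
            return True
        on_path.add(node)
        ok = all(visit(nxt) for nxt in network[node])
        on_path.remove(node)
        done.add(node)
        return ok

    return all(visit(node) for node in network)

# A token that has travelled through "one" and then "three".
token = Token().extend("one", 2.5).extend("three", 1.0)
```

Making `Token` immutable means every node issues a fresh token, so competing partial paths never share mutable state.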
As described by Young et al, at each input of a frame of speech to a transition network, any tokens that are present at the input of a node are passed into the node and the current frame of speech matched within the word models associated with those nodes. At the output of each node, a token is issued with updated partial path information and score (the token having "travelled" through the model associated with the node). If

REFERENCES:
patent: Re31188 (1983-03-01), Pirz et al.
patent: 4348553 (1982-09-01), Baker et al.
patent: 4783804 (1988-11-01), Juang et al.
patent: 4803729 (1989-02-01), Baker
patent: 4829578 (1989-05-01), Roberts
patent: 4837831 (1989-06-01), Gillick et al.
patent: 4888823 (1989-12-01), Nitta et al.
patent: 4980918 (1990-12-01), Bahl et al.
patent: 5040127 (1991-08-01), Gerson
patent: 5228110 (1993-07-01), Steinbiss
patent: 5309547 (1994-05-01), Niyada et al.
patent: 5388183 (1995-02-01), Lynch
patent: 5390278 (1995-02-01), Gupta et al.
patent: 5524169 (1996-06-01), Cohen et al.
patent: 5583961 (1996-12-01), Pawlewski et al.
patent: 5621859 (1997-04-01), Schwartz et al.
Lamel et al, "An Improved Endpoint Detector for Isolated Word Recognition", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 29, No. 4, Aug. 1981, New York, US, pp. 777-785.
IBM Technical Disclosure Bulletin, vol. 34, No. 9, Feb. 1992, New York, US, pp. 267-269, "Method of Endpoint Detection".
Austin et al, "A Unified Syntax Direction Mechanism for Automatic Speech Recognition Systems Using Hidden Markov Models", ICASSP, vol. 1, May 23, 1989, Glasgow, pp. 667-670.
Young et al, "Token Passing: A Simple Conceptual Model for Connected Speech Recognition Systems", Cambridge University Engineering Department, Jul. 31, 1989, pp. 1-23.
Kitano, "An Experimental Speech-to-Speech Dialog Translation System", IEEE, June 1991, DM-Dialog, Carnegie Mellon University and NEC Corporation.
