Speech recognition over lossy transmission systems

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S270100

active

06775652

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to speech recognition methods. In particular, the invention relates to speech recognition where speech data is transmitted and received over a lossy or corrupted communication link.
BACKGROUND OF THE INVENTION
Speech recognition has traditionally been performed using systems in which the transmission of speech data within the system is free of errors. However, the emergence of the Internet and of digital wireless technology has given rise to situations where this is no longer the case. In applications where speech is sampled and partially processed on one device and then packetized and transmitted over a digital network for further analysis on another, packets of speech data may be delayed, lost or corrupted during transmission.
This is a serious problem for current speech recognition technologies, which require data to be present even if it has additive noise. Existing Internet protocols for error-free data transmission, such as TCP/IP, are not suitable for interactive ASR (“Automatic Speech Recognition”) systems, as the retry mechanisms introduce variable and unpredictably long delays into the system under poor network conditions. In another approach, real-time delivery of data packets is attempted, ignoring missing data in order to avoid introducing delays in transmission. This is catastrophic for current recognition algorithms, as stated above.
It would be desirable to have a class of recognition algorithms and transmission protocols, intermediate between the conventional protocols, which are able to operate robustly under poor network conditions, either with minimal delays or with incomplete speech data. Ideally, the protocol would have a mechanism by which loss and delay may be traded off, either in a fixed manner or dynamically, in order to optimize speech recognition over lossy digital networks, for example in a client-server environment.
SUMMARY OF THE INVENTION
A system and method according to the present invention provide speech recognition on speech vectors received in a plurality of packets over a lossy network or communications link. Typically, the recognition occurs at a server on speech vectors received from a client computer over the network or link. The system and method are able to operate robustly, despite packet loss or corruption during transmission. In addition, the system and method may dynamically adjust the manner in which packets are being transmitted over the lossy communications link to adjust for varying or degraded network conditions.
The method includes constructing for a speech recognizer multidimensional speech vectors which have features derived from a plurality of packets received over a lossy communications link. Some of the packets associated with each speech vector are missing or corrupted, resulting in potentially corrupted features within the speech vector. These potentially corrupted features are indicated to the speech recognizer when present. Speech recognition is then attempted by the speech recognizer on the speech vectors. If speech recognition is unsuccessful, a request for retransmission of a missing or corrupted packet is made over the lossy communications link when potentially corrupted features are present in the speech vectors.
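The flow described in the preceding paragraph can be illustrated with a minimal Python sketch. It is not the patented implementation. The names SpeechVector, build_vector and recognize_or_request are hypothetical, as are the assumptions that each packet carries a fixed number of features and that the recognizer returns None when recognition is unsuccessful.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SpeechVector:
    features: List[float]       # feature values; 0.0 placeholder where data was lost
    uncertain: List[bool]       # True where a feature came from a missing/corrupted packet
    missing_packets: List[int]  # sequence numbers of unusable packets


def build_vector(packets: Dict[int, Optional[List[float]]],
                 seqs: List[int],
                 feats_per_packet: int) -> SpeechVector:
    """Assemble one speech vector from the packets with the given sequence numbers.

    A packet that is absent from `packets`, or mapped to None (assumed here to
    mean "failed an integrity check"), yields flagged placeholder features.
    """
    features: List[float] = []
    uncertain: List[bool] = []
    missing: List[int] = []
    for seq in seqs:
        payload = packets.get(seq)
        if payload is None:
            features.extend([0.0] * feats_per_packet)
            uncertain.extend([True] * feats_per_packet)
            missing.append(seq)
        else:
            features.extend(payload)
            uncertain.extend([False] * feats_per_packet)
    return SpeechVector(features, uncertain, missing)


def recognize_or_request(vector: SpeechVector,
                         recognizer: Callable[[SpeechVector], Optional[str]],
                         request_retransmission: Callable[[int], None]) -> Optional[str]:
    """Attempt recognition; on failure, request retransmission only if the
    vector actually contains potentially corrupted features."""
    result = recognizer(vector)
    if result is None and any(vector.uncertain):
        for seq in vector.missing_packets:
            request_retransmission(seq)
    return result
```

In this sketch no retransmission is requested when recognition succeeds, which reflects the summary's point that retries are made only when they are needed rather than for every lost packet.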
The system for recognizing a stream of speech received as a plurality of speech vectors over a lossy communications link comprises a buffering and decoding unit coupled to the lossy communications link. The buffering and decoding unit receives a plurality of packets, identifies missing or corrupted packets, and constructs a series of speech vectors from the received packets. Each speech vector has a plurality of certain features and uncertain features. A speech recognizer is coupled to the buffering and decoding unit and classifies each speech vector as one of a plurality of stored recognition models based on only the certain features within the speech vector.
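Classification on only the certain features can be sketched as follows. The summary does not state the form of the stored recognition models, so this example assumes, purely for illustration, one diagonal-covariance Gaussian per model and scores each model by skipping the uncertain features (for a diagonal Gaussian this is equivalent to marginalizing them out). The names log_likelihood and classify are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Assumed model form: (per-feature means, per-feature variances).
Model = Tuple[List[float], List[float]]


def log_likelihood(features: Sequence[float],
                   uncertain: Sequence[bool],
                   model: Model) -> float:
    """Log-likelihood of the certain features under a diagonal Gaussian."""
    means, variances = model
    total = 0.0
    for x, bad, mu, var in zip(features, uncertain, means, variances):
        if bad:
            continue  # uncertain feature contributes nothing to the score
        total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total


def classify(features: Sequence[float],
             uncertain: Sequence[bool],
             models: Dict[str, Model]) -> str:
    """Return the name of the best-scoring stored model."""
    return max(models, key=lambda name: log_likelihood(features, uncertain, models[name]))


models = {"a": ([0.0, 0.0], [1.0, 1.0]), "b": ([5.0, 5.0], [1.0, 1.0])}
frame = [0.2, 100.0]  # second feature is garbage from a corrupted packet
print(classify(frame, [False, True], models))   # uses only the first feature
print(classify(frame, [False, False], models))  # trusts the corrupted feature
```

With the corrupted feature masked, the frame is assigned to model "a", whose mean is closest on the reliable feature. Without the mask, the corrupted value dominates the score and the frame is assigned to model "b".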
The system and method may include a capability to request retransmission of lost or corrupted packets or bandwidth renegotiation from a source of the packets over the lossy communications link. The renegotiation may include, for example, a request to include error correction or detection bits in the packets, a request to compress the packets prior to transmission, or a request to discard less salient components of the signal to reduce bandwidth requirements, for example, by performing principal component analysis on speech data prior to packetization.
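The last option, discarding less salient components before packetization, can be sketched in Python as below. This is only an illustration: it keeps a single component found by power iteration, whereas the summary does not say how many components are kept or how the analysis is computed. The function names are hypothetical, and the example frames are made-up values rather than real speech data.

```python
import math
from typing import List, Sequence, Tuple


def top_principal_component(frames: Sequence[Sequence[float]],
                            iters: int = 100) -> Tuple[List[float], List[float]]:
    """Return (mean, unit direction of greatest variance) via power iteration."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in frames]
    # Sample covariance matrix of the centered frames.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting direction
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: start vector carries no variance
        v = [x / norm for x in w]
    return mean, v


def project(frame: Sequence[float],
            mean: Sequence[float],
            direction: Sequence[float]) -> float:
    """Reduce a frame to its single coordinate along the kept component."""
    return sum((x - m) * u for x, m, u in zip(frame, mean, direction))


frames = [[0.0, 0.0], [1.0, 0.1], [2.0, -0.1], [3.0, 0.0]]
mean, direction = top_principal_component(frames)
reduced = [project(f, mean, direction) for f in frames]  # one value per frame
```

Here each two-dimensional frame is reduced to one number, halving the payload at the cost of the low-variance component. A receiver would need the same mean and direction to interpret the reduced values, which is one reason such a change would have to be negotiated between the two ends.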


REFERENCES:
patent: 4897878 (1990-01-01), Boll et al.
patent: 5010553 (1991-04-01), Scheller et al.
patent: 5390278 (1995-02-01), Gupta et al.
patent: 5425129 (1995-06-01), Garman et al.
patent: 5440584 (1995-08-01), Wiese
patent: 5471521 (1995-11-01), Minakami et al.
patent: 5481312 (1996-01-01), Cash et al.
patent: 5550543 (1996-08-01), Chen et al.
patent: 5555344 (1996-09-01), Zunkler
patent: 5574825 (1996-11-01), Chen et al.
patent: 5617423 (1997-04-01), Li et al.
patent: 5617541 (1997-04-01), Albanese et al.
patent: 5768527 (1998-06-01), Zhu et al.
Hynek Hermansky, Sangita Tibrewala, and Misha Pavel, “Towards ASR on Partially Corrupted Speech,” Proc. IEEE Int. Conf. of Speech and Language Proc. ICSLP 96, pp. 462-465, Oct. 1996.*
Morris, A. C., Cooke, M. P., and Green, P. D., “Some Solutions to the Missing Feature Problem in Data Classification, with Application to Noise Robust ASR,” Proc. 1998 IEEE Int. Conf. on Acoust., Speech, and Sig. Proc., vol. 2, pp. 737-74-, 12-15 May 1998.*
P. D. Green et al., “Auditory Scene Analysis and Hidden Markov Model Recognition of Speech in Noise,” Proceedings of ICASSP '95, pp. 401-404 (IEEE 1995).
