Selective merging of segments separated in response to a...

Filed: 2000-08-25
Issued: 2003-07-29
Examiner: McFadden, Susan (Department: 2654)
Classification: Data processing: speech signal processing, linguistics, language; Speech signal processing; Recognition

Type: Reexamination Certificate
Status: active
Patent number: 06601028
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field of the Invention
The present invention relates to speech recognition and, more particularly, to selectively merging of segments separated in response to a break in an utterance.
2. Background Art
One component in a speech recognizer is the language model. A popular way to capture the syntactic structure of a given language is to use conditional probabilities to capture the sequential information embedded in the word strings of sentences. For example, if the current word is W1, a language model can be constructed indicating the probabilities that certain other words W2, W3, . . . Wn will follow W1. The probabilities can be expressed such that P21 is the probability that word W2 will follow word W1, where P21 = P(W2|W1). In this notation, P31 is the probability that word W3 will follow word W1; P41 is the probability that word W4 will follow word W1; and so forth, with Pn1 being the probability that Wn will follow word W1. The maximum of P21, P31, . . . Pn1 can be identified and used in the language model. The preceding examples are bi-gram probabilities. Computation of tri-gram conditional probabilities is also well known.
Language models are often created by examining written literature (such as newspapers) and determining the conditional probabilities of vocabulary words with respect to other vocabulary words.
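As a rough illustration of the bi-gram idea described above, the following Python sketch estimates P(next word | current word) by relative frequency of adjacent word pairs. The function name `bigram_probabilities` and the three-sentence toy corpus are assumptions made for this example; they do not come from the patent, and a real language model would be trained on a far larger corpus with smoothing.

```python
from collections import Counter, defaultdict


def bigram_probabilities(sentences):
    """Estimate P(following | current) from counts of adjacent word pairs."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Count every (current word, following word) pair in the sentence.
        for current, following in zip(words, words[1:]):
            pair_counts[current][following] += 1

    probabilities = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        # Relative frequency: count(current, following) / count(current, *).
        probabilities[current] = {w: c / total for w, c in followers.items()}
    return probabilities


# Toy corpus (hypothetical stand-in for newspaper text).
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = bigram_probabilities(corpus)

# The follower of "the" with the maximum conditional probability.
most_likely = max(model["the"], key=model["the"].get)
print(most_likely, model["the"][most_likely])
```

Here `model["the"]` plays the role of the set P21, P31, . . . Pn1 for W1 = "the", and `most_likely` is the word whose probability is the maximum of that set.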
In speech recognition systems, complex recognition tasks, such as long utterances, are typically handled in stages. Usually these stages include a segmentation stage, which separates a long utterance into shorter segments. A first-pass within-word decoding then generates hypotheses for the segments, and a final-pass cross-word decoding generates the final recognition results with detailed acoustic and language models.
In the segmentation stage, long segments are typically chopped first at sentence boundaries and then at word boundaries (detected by a fast word recognizer). A typical way to detect sentence beginnings and endings is to look for boundaries of silence (a pause in speaking) detected by, for example, a mono-phone decoder. The assumption is that people momentarily stop speaking at the end of a sentence. The resulting segments are short enough (about 4 to 8 seconds) to ensure that they can be handled by the decoder given the constraints of the real-time pipeline and memory size. In the traditional decoding procedure, each short segment, which can be any part of a sentence, is decoded, and the transcriptions are merged to give the complete recognition result.
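The pause-based splitting described above can be sketched in a simplified form. The text mentions a mono-phone decoder as one way to detect silence; the sketch below substitutes a plain per-frame energy threshold, which is an assumption made only to keep the example self-contained. The function name `split_on_silence`, the threshold, and the minimum pause length are all hypothetical.

```python
def split_on_silence(energies, threshold=0.1, min_silence_frames=3):
    """Return (start, end) frame ranges of speech separated by long pauses.

    `end` is exclusive. A pause shorter than `min_silence_frames` does not
    split a segment. Energy thresholding is a stand-in for a real silence
    detector and is an assumption of this sketch.
    """
    segments = []
    start = None      # first frame of the segment being built, if any
    silence_run = 0   # consecutive low-energy frames seen inside a segment
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # Close the segment just after its last speech frame.
                segments.append((start, i - silence_run + 1))
                start = None
                silence_run = 0
    if start is not None:
        # Input ended mid-segment; drop any trailing silence frames.
        segments.append((start, len(energies) - silence_run))
    return segments


# Hypothetical frame energies: speech, a one-frame dip, a long pause, speech.
frames = [0.0, 0.5, 0.6, 0.05, 0.7, 0.0, 0.0, 0.0, 0.4, 0.3, 0.0]
print(split_on_silence(frames))
```

Note that the one-frame dip at index 3 does not cause a split, while the three-frame pause does. An unintended pause of sufficient length would split a sentence in exactly the same way, which is the situation the following paragraphs are concerned with.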
Another way in which segment (e.g., sentence) boundaries can be created is in response to unrecognizable non-speech noise, such as background noise.
The problem noticed by the inventor is that, in existing systems, the language model is not applied across segment boundaries. Accordingly, if a sentence is ended because of an unintended break (e.g., a pause or noise), the language model will not be applied between the last word of the ending sentence and the first word of the beginning sentence. When those two words are intended to be part of a continuous word stream, present recognition systems do not enjoy the benefits the language model could provide.
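To make the gap concrete, the following sketch compares scoring two segments separately with scoring them as one merged word stream. It only illustrates the problem stated above and is not the method claimed in the patent. The bi-gram table `BIGRAMS`, the `FLOOR` value for unseen pairs, and the example words are all invented for this illustration.

```python
import math

# Hypothetical bi-gram probabilities P(second | first); values are made up.
BIGRAMS = {
    ("please", "recognize"): 0.4,
    ("recognize", "speech"): 0.5,
    ("recognize", "beach"): 0.01,
    ("speech", "now"): 0.2,
    ("beach", "now"): 0.2,
}
FLOOR = 1e-4  # assumed probability for word pairs missing from the table


def log_score(words):
    """Sum of log bi-gram probabilities over adjacent word pairs."""
    return sum(math.log(BIGRAMS.get(pair, FLOOR)) for pair in zip(words, words[1:]))


# An unintended pause splits the utterance after "recognize".
segment_a = ["please", "recognize"]
candidates_b = [["speech", "now"], ["beach", "now"]]

# Scored separately: the pair (last word of A, first word of B) is never used.
separate = [log_score(segment_a) + log_score(b) for b in candidates_b]

# Scored as one stream: the boundary pair contributes to the score.
merged = [log_score(segment_a + b) for b in candidates_b]

print(separate)
print(merged)
```

Scored separately, the two candidates for the second segment tie, because the language model never sees "recognize" next to "speech" or "beach". Scored as a merged stream, the boundary bi-gram separates them.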

Affiliated with

Yan Yonghong (Inventor)

Also associated with

Aldous Alan K. (Attorney)
Intel Corporation (Corporate Assignee)
McFadden Susan (Examiner)
Skabrat Steven P. (Attorney)
