Reexamination Certificate
1998-01-05
2001-10-09
Korzuch, William (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S275000
active
06301560
ABSTRACT:
TECHNICAL FIELD
This invention relates to discrete speech recognition systems. More particularly, this invention relates to discrete speech recognition systems with ballooning grammars. This invention further relates to vehicle computer systems that implement such discrete speech recognition systems.
BACKGROUND OF THE INVENTION
Two common types of speech recognition systems are continuous and discrete. Continuous speech recognition systems detect and discern useful information from continuous speech patterns. In use, an operator may speak phrases and sentences without pausing and the continuous speech recognition system will determine the words being spoken. Continuous speech recognition systems are used, for example, in voice-input word processors that enable operators to dictate letters directly to the computer.
In contrast, discrete speech recognition systems are designed to detect individual words and phrases that are interrupted by intentional pauses, resulting in an absence of speech between the words and phrases. Discrete speech recognition systems are often used in “command and control” applications in which an operator speaks individual commands to initiate corresponding predefined control functions. In a typical use, the operator speaks a command, pauses while the system processes and responds to the command, and then speaks another command. The system detects each command and performs the associated function.
This invention is directed to the discrete class of speech recognition systems.
A discrete speech recognition system employs a complete list of recognized words or phrases, referred to as the “vocabulary.” A subset of the vocabulary that the recognition system is attempting to detect at any one time is known as the “grammar.” In general, the smaller the active grammar, the more reliable the recognition because the system is only focusing on a few words or phrases. Conversely, the larger the active grammar, the less reliable the recognition because the system is attempting to discern a word or phrase from many words or phrases.
Accordingly, one design consideration for discrete speech recognition systems is to devise grammars that present useful command options, while being reliably detectable.
One conventional approach is to construct a large grammar that encompasses each command option.
FIG. 1 shows how this conventional approach might be applied to control an automobile radio. In this example, suppose the system is designed to allow the user to control the radio and access his/her favorite radio stations using voice commands. Using a large active grammar, a default radio grammar 20 might include the radio control words “AM”, “FM”, “Seek”, and “Scan”, together with all of the preset radio stations. A corresponding command function is associated with each grammar word, as represented in Table 1.
TABLE 1
Default Grammar

Word/Phrase    Command Function
AM             Sets the radio to AM band.
FM             Sets the radio to FM band.
Seek           Directs the radio to seek to a new station.
Scan           Directs the radio to scan for a new station.
One            Sets the radio to preset station 1.
Two            Sets the radio to preset station 2.
Three          Sets the radio to preset station 3.
Four           Sets the radio to preset station 4.
Five           Sets the radio to preset station 5.
Six            Sets the radio to preset station 6.
Seven          Sets the radio to preset station 7.
Eight          Sets the radio to preset station 8.
Nine           Sets the radio to preset station 9.
Ten            Sets the radio to preset station 10.
The speech recognition system actively tries to recognize one of these words when the operator speaks. When a grammar word is detected, the speech recognition system performs the appropriate function. Suppose the operator says the word “AM”. The discrete speech recognition system detects the active word 22 and performs the corresponding function 24 to set the radio to the AM band.
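The single-grammar approach of Table 1 can be sketched as a lookup from each active word to its command. This is a minimal illustration only: it assumes the acoustic front end has already turned the utterance into a text token, and the command strings and the names `build_large_grammar` and `recognize` are hypothetical, not taken from the patent.

```python
# Hypothetical sketch of the large, all-encompassing grammar of Table 1.
# Assumption: speech has already been converted to a word token upstream.

PRESET_WORDS = ["One", "Two", "Three", "Four", "Five",
                "Six", "Seven", "Eight", "Nine", "Ten"]


def build_large_grammar():
    """Return one active grammar mapping every word to a command label."""
    grammar = {
        "AM": "set_band:AM",
        "FM": "set_band:FM",
        "Seek": "seek",
        "Scan": "scan",
    }
    # Every preset station word is active at the same time as the controls.
    for number, word in enumerate(PRESET_WORDS, start=1):
        grammar[word] = f"preset:{number}"
    return grammar


def recognize(word, grammar):
    """Return the command for a word, or None if it is not in the grammar."""
    return grammar.get(word)


large_grammar = build_large_grammar()
print(recognize("AM", large_grammar))
print(len(large_grammar), "words are active at once")
```

All fourteen words compete on every utterance, which is the property the text identifies as the source of reduced reliability.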
As noted above, a drawback with presenting a large all-encompassing grammar is that there is a greater likelihood of false recognition by the speech system. For instance, the system may experience trouble distinguishing between the words “FM” and “Seven” when both are spoken rapidly and/or not clearly enunciated.
Another conventional approach is to construct a small default grammar and to switch to a new grammar upon detection of one or more keywords.
FIG. 2 shows how this conventional approach might be applied to control an automobile radio. With this approach, a default radio grammar 30 might include only the radio control words “AM”, “FM”, “Seek”, “Scan”, and “Preset”. A corresponding command function is associated with each grammar word, as represented in Table 2.
TABLE 2
Default Grammar

Word/Phrase    Command Function
AM             Sets the radio to AM band.
FM             Sets the radio to FM band.
Seek           Directs the radio to seek to a new station.
Scan           Directs the radio to scan for a new station.
Preset         Keyword to bring up preset station grammar.
Upon recognition of the keyword “Preset”, the speech recognition system changes to a new grammar 32 for detecting the preset station numbers. Table 3 lists the new preset station grammar.
TABLE 3
Preset Station Grammar

Word/Phrase    Command Function
One            Sets the radio to preset station 1.
Two            Sets the radio to preset station 2.
Three          Sets the radio to preset station 3.
Four           Sets the radio to preset station 4.
Five           Sets the radio to preset station 5.
Six            Sets the radio to preset station 6.
Seven          Sets the radio to preset station 7.
Eight          Sets the radio to preset station 8.
Nine           Sets the radio to preset station 9.
Ten            Sets the radio to preset station 10.
The speech recognition system actively tries to recognize one of these words from the preset station grammar. Suppose the operator says the word “One”. The discrete speech recognition system detects the active word 34 and performs the corresponding function 36 to set the radio to preset station 1.
A drawback with this system is navigating the grammars. An operator may call out a keyword in one grammar, causing the system to switch to a different grammar, then be interrupted (e.g., by driving in traffic) and forget which grammar is active upon returning his/her attention to the radio. For instance, suppose the operator called out “Preset” to get the preset station grammar of Table 3 and was then interrupted. The operator may then wish to seek or scan, but the system will not recognize these commands because the active grammar is listening only for a preset station number.
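The navigation problem can be made concrete with a small sketch of the grammar-switching approach. The class name `SwitchingRecognizer`, the command strings, and the choice to return to the default grammar after a preset is selected are all assumptions made for illustration; the text does not specify how the conventional system leaves the preset grammar.

```python
# Hypothetical sketch of the conventional grammar-switching approach.
# Assumption: utterances arrive as already-recognized word tokens.

PRESET_WORDS = ["One", "Two", "Three", "Four", "Five",
                "Six", "Seven", "Eight", "Nine", "Ten"]


class SwitchingRecognizer:
    def __init__(self):
        # Default grammar of Table 2 ("Preset" is the keyword).
        self.default_grammar = {
            "AM": "set_band:AM",
            "FM": "set_band:FM",
            "Seek": "seek",
            "Scan": "scan",
            "Preset": "switch_to_preset_grammar",
        }
        # Preset station grammar of Table 3.
        self.preset_grammar = {
            word: f"preset:{number}"
            for number, word in enumerate(PRESET_WORDS, start=1)
        }
        self.active = self.default_grammar

    def hear(self, word):
        """Return the command for a word, or None if it is not active."""
        if word not in self.active:
            return None
        command = self.active[word]
        if self.active is self.default_grammar and word == "Preset":
            # The keyword REPLACES the active grammar with the preset grammar.
            self.active = self.preset_grammar
        else:
            # Assumption: any completed command returns to the default grammar.
            self.active = self.default_grammar
        return command


radio = SwitchingRecognizer()
radio.hear("Preset")          # operator is then interrupted
print(radio.hear("Seek"))     # None: "Seek" is no longer in the active grammar
```

In this sketch the call to `hear("Seek")` fails after the keyword because the default words were swapped out rather than kept alongside the preset words.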
Accordingly, there is a need to improve techniques for presenting grammars in discrete speech recognition systems for such applications as operating a vehicle radio.
SUMMARY OF THE INVENTION
This invention concerns a discrete speech recognition system with a ballooning grammar. The system begins with a default grammar that has both keywords and non-keywords. Upon detecting a word that is not a keyword in the default grammar, the speech recognition system simply performs the function associated with the detected word. Upon detecting a keyword in the default grammar, the speech recognition system expands its active grammar list from the default grammar to a ballooned grammar that includes both the words in the default grammar and the additional words triggered by detection of the keyword. In this manner, the operator still has the option to select a word from the original grammar, or choose a word from the additional list.
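A minimal sketch of the ballooning idea follows, reusing the radio example from the background. It is not the patented implementation: the class name `BallooningRecognizer`, the command strings, and the `balloon:` return value are hypothetical, and the reset to the default grammar after a selection models only the bi-level case that the summary goes on to describe.

```python
# Hypothetical sketch of a bi-level ballooning grammar.
# Assumption: utterances arrive as already-recognized word tokens.

PRESET_WORDS = ["One", "Two", "Three", "Four", "Five",
                "Six", "Seven", "Eight", "Nine", "Ten"]

# Non-keywords in the default grammar, each with a command label.
DEFAULT_COMMANDS = {
    "AM": "set_band:AM",
    "FM": "set_band:FM",
    "Seek": "seek",
    "Scan": "scan",
}

# Keywords in the default grammar, each with the words it adds.
KEYWORD_EXPANSIONS = {
    "Preset": {word: f"preset:{number}"
               for number, word in enumerate(PRESET_WORDS, start=1)},
}


class BallooningRecognizer:
    def __init__(self, default_commands, keyword_expansions):
        self.default_commands = default_commands
        self.keyword_expansions = keyword_expansions
        self.added = {}  # words added by the most recent keyword

    def active_grammar(self):
        """Default words and keywords, plus any ballooned-in words."""
        return (set(self.default_commands)
                | set(self.keyword_expansions)
                | set(self.added))

    def hear(self, word):
        """Return a command label, a balloon notice, or None if inactive."""
        if word not in self.active_grammar():
            return None
        if word in self.keyword_expansions:
            # Keyword: ADD words to the active grammar; keep the defaults.
            self.added = dict(self.keyword_expansions[word])
            return f"balloon:{word}"
        command = self.added.get(word, self.default_commands.get(word))
        # Bi-level assumption: a selection returns to the default grammar.
        self.added = {}
        return command


radio = BallooningRecognizer(DEFAULT_COMMANDS, KEYWORD_EXPANSIONS)
radio.hear("Preset")
print(radio.hear("Seek"))  # "Seek" is still active after the keyword
```

Because the keyword only adds words, a default command such as “Seek” remains available after “Preset” is spoken, at the cost of a temporarily larger active grammar.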
The ballooned grammar remains active until the operator makes a new selection. With a bi-level system (meaning the system balloons the grammar only once), the speech recognition system returns to the default grammar. In a higher level system in which multiple ballooned grammars may be used, the operator can work his/her way through the grammars, with each level including the words from the default grammar and words that are added along the way. In one implementation of a higher level system, words that are effectively rendered useless by subsequent selections can be removed from the active ballooned vocabulary. At the end of the traversal through the levels, the sp
Korzuch William
Lee & Hayes PLLC
Lerner Martin
Microsoft Corporation