Method and apparatus for providing an event-based...
Type: Reexamination Certificate
Filed: 1999-06-08
Issued: 2001-10-23
Examiner: Korzuch, William (Department: 2641)
Classification: Data processing: speech signal processing, linguistics, language – Speech signal processing – Application
Other classes: C704S270000, C704S276000, C704S240000
Status: Active
Patent number: 06308157
ABSTRACT:
CROSS REFERENCE TO RELATED APPLICATIONS
(Not Applicable)
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
(Not Applicable)
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates to the field of computer speech recognition and more particularly to an efficient method and system for informing a system user of available voice commands.
2. Description of the Related Art
Speech recognition is the process by which an acoustic signal received by a microphone is converted to a set of text words by a computer. These recognized words may then be used in a variety of computer software applications for purposes such as document preparation, data entry, and command and control. Speech recognition is generally a difficult problem due to the wide variety of pronunciations, accents, and other speech characteristics of individual speakers.
One of the difficult aspects of speech recognition systems relates to a user's ability to control and navigate through speech-enabled applications using various commands. In the simplest possible command and control grammar, each function that the system can perform has no more than one speech phrase associated with it. At the other extreme is a command and control system based on natural language understanding (NLU). In an NLU system, the user can express commands using natural language, thereby providing total linguistic flexibility in command expression. Current command and control systems fall between these extremes: they go beyond the simple one-function, one-phrase grammar, but are not yet at NLU.
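For illustration only (the functions and phrases below are invented for this example, not drawn from the patent), the two ends of this spectrum short of NLU might look like the following mappings:

# Simplest grammar: each function has exactly one associated speech phrase.
one_phrase_grammar = {
    "open_file_menu": ["open the file menu"],
    "save_document": ["save the document"],
}

# Richer, but still finite, grammar: several phrasings per function.
multi_phrase_grammar = {
    "open_file_menu": ["open the file menu", "show the file menu", "file menu"],
    "save_document": ["save the document", "save this file", "save it"],
}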
Much like the Disk Operating System (DOS), speech recognition systems that approach but do not achieve the flexibility of NLU recognize only a finite set of voice commands. These systems have little utility for users who do not know the commands available for performing desired functions. Thus, initially, users must be made aware of possible commands simply to perform any voice-activated functions at all. On the other hand, more advanced users may wish to know whether a particular speech command will be recognized, or a user who knows one way of issuing a speech command may want to know other speech commands that achieve the same system function or operation. In this way, the user may ascertain a speech command that is more efficient, or that has better recognition accuracy for that user, than the one he or she has been using.
Conventional speech recognition systems offer various means to present the user with a list of all valid speech commands, typically filtered in some way to facilitate a search for a particular command. This approach works reasonably well given fairly simple command and control grammars. However, as command and control grammars begin to approach NLU, the number of available ways to state commands increases to the point of making such approaches cumbersome. The problem is exacerbated when the speech recognition system is deployed in embedded systems, which have minimal display and memory capacities.
Some systems display a list of all possible commands based on the current state of the system. In these systems, the content and quantity of the commands displayed at a first state is typically different from that displayed at a second state. If there are fewer possible commands at the second state, the displayed list will be shortened; if there are more possible commands, it will be lengthened. It is also possible that the same commands will be displayed at different states if the possible commands have not changed from the prior state(s). Thus, these systems do not necessarily reduce the quantity of commands displayed to the user.
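As a hypothetical sketch of such a conventional approach (not taken from any particular system), the display logic can amount to a table mapping each system state to every command valid in that state, however long that list turns out to be:

# Hypothetical state-to-commands table of a conventional system: every
# command valid in the current state is shown, regardless of list length.
commands_by_state = {
    "desktop": ["open mail", "open browser", "shut down", "lock screen"],
    "mail_open": ["compose message", "reply", "reply to all", "forward",
                  "delete message", "next message", "previous message"],
}

def commands_to_display(current_state):
    # Return every command valid in the current state.
    return commands_by_state.get(current_state, [])

print(commands_to_display("mail_open"))  # the full list for that state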
Accordingly, there is a need for a more effective system and method for informing a system user of available voice commands.
SUMMARY OF THE INVENTION
The present invention provides a method and system for efficiently and intelligently selecting probable commands according to system events, thereby reducing the number of commands displayed to the user.
Specifically, the present invention operates on a computer system that is adapted for speech recognition so as to identify voice commands for controlling a speech-enabled application running on the system. The method and system operate by receiving input from a user and monitoring the system so as to log system events and ascertain a current system state. The current system state and the logged events are analyzed to predict a probable next event. Acceptable voice commands that can perform the predicted next event are then identified, and the user is notified of them, preferably in a displayed dialog window.
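As a rough illustration of this flow (a minimal sketch only; the class, method, and event names below are assumptions made for the example, not identifiers from the patent), a Python implementation might log each event together with the state in which it occurred, predict the most frequent follower of the current state and event, and return the voice commands that would perform the predicted event:

# Minimal, hypothetical sketch of the described flow; names are illustrative.
from collections import Counter, defaultdict

class EventBasedCommandWindow:
    def __init__(self, commands_for_event):
        # commands_for_event: event name -> voice commands that perform it.
        self.commands_for_event = commands_for_event
        self.event_log = []                    # logged (state, event) pairs
        self.followers = defaultdict(Counter)  # (state, event) -> next-event counts

    def on_event(self, state, event):
        # Monitor the system: log the event and update the follower counts.
        if self.event_log:
            self.followers[self.event_log[-1]][event] += 1
        self.event_log.append((state, event))

    def probable_commands(self):
        # Predict the probable next event from the current state and log,
        # then return the acceptable voice commands that would perform it.
        if not self.event_log:
            return []
        counts = self.followers[self.event_log[-1]]
        if not counts:
            return []
        next_event, _ = counts.most_common(1)[0]
        return self.commands_for_event.get(next_event, [])

# Example: after dictation this user usually saves, so "save"-related
# commands would be offered in the dialog window.
window = EventBasedCommandWindow({
    "save_document": ["save the document", "save this file"],
})
window.on_event("editor", "dictate_text")
window.on_event("editor", "save_document")
window.on_event("editor", "dictate_text")
print(window.probable_commands())  # -> ['save the document', 'save this file']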
The present invention thus provides the object and advantage of intelligently selecting only those commands that the user is likely to execute next. Because the selection is based on a predicted next event, far fewer commands are selected than in conventional speech recognition systems, and the resulting short list, ordered from most to least likely, can be easily displayed and viewed by the user. Thus, the present invention provides the further object and advantage of being operable in embedded systems, which have limited system resources.
The events used to predict the next event can include commands, system control activities, timed activities, and application activation, so multiple event-based parameters are analyzed during the prediction process. The present invention therefore provides the additional object and advantage of accurately determining the commands of interest to the user. Prediction accuracy is enhanced by statistically modeling the prior events in light of the current system state. Additionally, because prior events can be used to modify the statistical model, the speech recognition system can be tailored to the voice characteristics and command choices of a given speaker or set of speakers using the system.
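One simple way to realize such a statistical model (again only a hypothetical sketch; the event kinds and names are assumptions for this example) is to keep, for each system state, frequency counts of which event followed each prior event, whether that prior event was a spoken command, a system control activity, a timed activity, or an application activation, and to keep updating those counts so the model adapts to the current speaker:

# Hypothetical frequency model: per current state, count which event followed
# each prior event; event kinds and names below are illustrative only.
from collections import Counter, defaultdict

class NextEventModel:
    def __init__(self):
        # state -> prior event -> Counter of following events
        self.counts = defaultdict(lambda: defaultdict(Counter))

    def update(self, state, prior_event, next_event):
        # Called for every logged transition, so the model keeps adapting
        # to the command choices of the current speaker(s).
        self.counts[state][prior_event][next_event] += 1

    def most_likely(self, state, prior_event, top_n=3):
        # Return the top-n probable next events for this state and prior event.
        return [e for e, _ in self.counts[state][prior_event].most_common(top_n)]

model = NextEventModel()
# Prior events may be of different kinds: a command, a timer firing,
# or an application becoming active.
model.update("email_open", ("app_activated", "email"), "compose_message")
model.update("email_open", ("app_activated", "email"), "compose_message")
model.update("email_open", ("app_activated", "email"), "check_calendar")
print(model.most_likely("email_open", ("app_activated", "email")))
# -> ['compose_message', 'check_calendar']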
REFERENCES:
patent: 5664061 (1997-09-01), Andreshak et al.
patent: 5832439 (1998-11-01), Cox, Jr. et al.
patent: 5842161 (1998-11-01), Cohrs et al.
patent: 5855002 (1998-12-01), Armstrong
patent: 5890122 (1999-03-01), Van Kleeck et al.
patent: 6075534 (2000-06-01), VanBuskirk et al.
patent: 6076061 (2000-06-01), Kawasaki et al.
patent: 6085159 (2000-07-01), Ortega et al.
patent: 6101472 (2000-08-01), Giangarra et al.
patent: 6182046 (2001-01-01), Ortega et al.
IBM Technical Disclosure Bulletin, Integrated Audio-Graphics User Interface, vol. 33, No. 11, pp. 368-371, Apr. 1991.
Lewis James R.
Nassiff Amado
Ortega Kerry A.
Vanbuskirk Ronald E.
Wang Huifang
Azad Abul K.
International Business Machines Corp.
Korzuch William