Reexamination Certificate
2000-09-20
2004-09-21
Knepper, David D. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Synthesis
C704S270000
active
06795806
ABSTRACT:
CROSS REFERENCE TO RELATED APPLICATIONS
(Not Applicable)
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
(Not Applicable)
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates to the field of speech recognition, and more particularly, to a method for enhancing discrimination between and among user dictation, user voice commands, and text.
2. Description of the Related Art
Speech recognition is the process by which an acoustic signal received by a microphone is converted to text by a computer. The recognized text may then be used in a variety of computer software applications for purposes such as document preparation, data entry, and command and control. Speech dictation systems further offer users a hands-free method of operating computer systems.
In regard to electronic document preparation, presently available speech dictation systems provide user voice commands enabling a user to select a portion of text in an electronic document. Such user voice commands typically employ a syntax such as “SELECT <text>”, where the user voice command “SELECT” signals that the text following the command should be selected or highlighted. After a portion of text has been selected, the user can perform any of a series of subsequent operations upon the selected text.
Thus, if a user says, “SELECT how are you”, the speech dictation system will search for the text phrase “how are you” within a body of text in the electronic document. Once located in the body of text, the phrase can be selected or highlighted. Subsequently, the user can perform an operation on the selected text such as a delete operation, a bold/italic/underline operation, or a correction operation. In further illustration, once the text “how are you” is highlighted, that user selected portion of text can be replaced with different text derived from a subsequent user utterance. In this manner, users can perform hands-free correction of an electronic document.
Presently, known implementations of the “SELECT” command, or other similar user voice commands for selecting text, suffer from several disadvantages. One such disadvantage is that there may be multiple occurrences of the phrase or word that the user would like to select within a body of text. For example, within a body of text, there are likely to be many occurrences of the word “the”. Thus, if the user says “SELECT the”, the speech dictation system may not be able to determine which occurrence of the word “the” the user would like to select.
In addressing this problem, conventional speech dictation systems rely upon a system of rules for determining which occurrence of the user desired word or phrase the user would like to select. For example, a speech dictation system can begin at the top of the active window and select the first occurrence of the word or phrase. However, if the user did not want to select the first occurrence of the word or phrase, a conventional speech dictation system can provide the user with the ability to select another occurrence of the word. In particular, some conventional speech dictation systems provide navigational voice commands such as “NEXT” or “PREVIOUS”.
By uttering the voice command “NEXT” the user instructs the speech dictation system to locate and select the next occurrence of the desired word or phrase. Similarly, the command “PREVIOUS” instructs the speech dictation system to locate and select the previous occurrence of the desired word or phrase. Although such conventional systems allow the user to navigate to the desired occurrence of a particular word or phrase, users must develop strategies for navigating to the desired occurrence. This can result in wasted time and user frustration, especially in cases where the user perceives the speech dictation system to be inaccurate or inefficient.
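The conventional navigation scheme described above can be illustrated with a minimal sketch. It assumes the body of text is available as a list of words; the names `find_occurrences` and `OccurrenceNavigator`, and the choice to stay on the last or first occurrence when there is nothing further to move to, are hypothetical and are not taken from any particular speech dictation system.

```python
def find_occurrences(words, phrase):
    """Return the start index of every occurrence of phrase in words (case-insensitive)."""
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return []
    lowered = [w.lower() for w in words]
    return [i for i in range(len(words) - n + 1) if lowered[i:i + n] == target]


class OccurrenceNavigator:
    """Selects the first occurrence, then moves on "NEXT" / "PREVIOUS" (illustrative only)."""

    def __init__(self, words, phrase):
        self.hits = find_occurrences(words, phrase)
        # Rule assumed here: start from the top and select the first occurrence.
        self.pos = 0 if self.hits else None

    def current(self):
        """Start index of the currently selected occurrence, or None if there is none."""
        return None if self.pos is None else self.hits[self.pos]

    def next(self):
        """Handle "NEXT": move to the following occurrence, if any."""
        if self.pos is not None and self.pos < len(self.hits) - 1:
            self.pos += 1
        return self.current()

    def previous(self):
        """Handle "PREVIOUS": move to the preceding occurrence, if any."""
        if self.pos is not None and self.pos > 0:
            self.pos -= 1
        return self.current()
```

In such a scheme, reaching the third occurrence of a word requires one "SELECT" command followed by two "NEXT" commands, which is the kind of navigation burden described above.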
Another disadvantage of conventional text selection methods within conventional speech dictation systems is that when searching for the user specified word or phrase, such speech dictation systems typically search the entire portion of a body of text appearing on the user's screen. Each word appearing on the user's screen is activated within the speech dictation system grammar and appears to the speech dictation system as an equally likely candidate. Because the user desires only a single word or phrase, enabling and searching the entire portion of the body of text appearing on the user's screen can be inefficient. Moreover, the technique can increase the likelihood that a misrecognition will occur.
Yet another disadvantage of conventional text selection methods within conventional speech dictation systems is that often it is not readily apparent to the speech dictation system whether a user has uttered a word during speech dictation or a voice command, for example a voice command that activates a drop-down menu. For instance, if a user utters the word “File”, depending upon the circumstance, the user could either intend to activate the File menu in the menu bar or insert the word “file” in the electronic document. Accordingly, it is not always apparent to the conventional speech dictation system whether a user utterance is a voice command or speech dictation.
Consequently, although presently available speech dictation systems offer methods of interacting with a computer to audibly command an application, to provide speech dictation in an electronic document and to select text within the electronic document, there remains a need for an improved method of discriminating between user voice commands, user dictations, text, and combinations thereof.
SUMMARY OF THE INVENTION
The invention disclosed herein provides a method and apparatus for discriminating between different occurrences of text in an electronic document and between an instance of a voice command and an instance of speech dictation through the utilization of an eye-tracking system in conjunction with a speech dictation system. The method and apparatus of the invention advantageously can include an eye-tracking system (ETS) for cooperative use with a speech dictation system in order to determine the focus point of a user's gaze during a speech dictation session. In particular, the cooperative use of the ETS with the speech dictation system can improve accuracy of the “SELECT” user voice command functionality, or any other user voice command for selecting a portion of text within a body of text in a speech dictation system. The use of the ETS in the invention also can improve system performance by facilitating discrimination between user dictation and a voice command.
In accordance with the inventive arrangements, a method for searching for matching text in an electronic document can include identifying a focus point in a user interface and defining a surrounding region about the focus point. Notably, the surrounding region can include a body of text within a user interface object configured to receive speech dictated text. Additionally, the method can include receiving a voice command for selecting specified text within the electronic document and searching the body of text included in the surrounding region for a match to the specified text. Significantly, the search can be limited to the body of text in the surrounding region.
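By way of illustration only, the region-limited search can be sketched as follows. The sketch assumes that each word of the body of text is known together with the screen coordinates of its center, and that the surrounding region is a circle of a given radius about the focus point; the `Word` type, the circular region, and the function names are hypothetical choices made for this example and are not specified by the text above.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Word:
    text: str
    x: float  # assumed screen coordinate of the word's center
    y: float


def words_in_region(words, focus, radius):
    """Return indices of words whose center lies within radius of the focus point."""
    fx, fy = focus
    return [i for i, w in enumerate(words) if hypot(w.x - fx, w.y - fy) <= radius]


def find_in_region(words, phrase, focus, radius):
    """Search only the words inside the surrounding region for phrase.

    Returns the start index of each match lying wholly inside the region.
    """
    target = phrase.lower().split()
    if not target:
        return []
    inside = set(words_in_region(words, focus, radius))
    hits = []
    for start in sorted(inside):
        span = range(start, start + len(target))
        # Every word of the candidate match must itself be inside the region.
        if all(i in inside for i in span) and \
                [words[i].text.lower() for i in span] == target:
            hits.append(start)
    return hits
```

With the words of "the cat sat on the mat" laid out left to right and a small region about a focus point near the end of the line, a search for "the" in this sketch returns only the second occurrence, because the first occurrence lies outside the region and is never considered.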
A method for searching for matching text in an electronic document can further include expanding the surrounding region to include an additional area of the user interface if a match to the specified text is not found in the body of text in the searching step. Notably, the additional area included by the expansion can include additional text. Accordingly, the additional text can be searched for a match to the specified text. Finally, as before, the search can be limited to the body of text and the additional text.
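The expansion of the surrounding region when no match is found can likewise be sketched under stated assumptions. Here each word is a hypothetical `(text, x, y)` tuple, the region is assumed to be a circle about the focus point, the region grows by a fixed `step`, and a `max_radius` cap is added only so that the loop in this example terminates; none of these names or parameters comes from the text above.

```python
from math import hypot


def search_region(words, target, focus, radius):
    """Return start indices of target (a list of lowercase words) found wholly
    inside the circular region. words is a list of (text, x, y) tuples."""
    inside = {i for i, (_, x, y) in enumerate(words)
              if hypot(x - focus[0], y - focus[1]) <= radius}
    n = len(target)
    return [s for s in sorted(inside)
            if all(i in inside for i in range(s, s + n))
            and [words[i][0].lower() for i in range(s, s + n)] == target]


def select_with_expansion(words, phrase, focus, radius, step, max_radius):
    """Search the region; if nothing matches, expand it and search again.

    Returns (match_start_indices, radius_used). max_radius is an assumed
    stopping condition for this sketch.
    """
    target = phrase.lower().split()
    if not target:
        return [], radius
    while True:
        hits = search_region(words, target, focus, radius)
        if hits or radius >= max_radius:
            return hits, radius
        # No match yet: grow the region outward from the focus point.
        radius = min(radius + step, max_radius)
```

At each stage the search remains limited to the text inside the current region, so occurrences far from the focus point are considered only after nearer text has failed to produce a match.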
In a representative embodiment of the present invention, the expanding step can include expanding the surrounding region outwardly from the focus point by a fixed increment. Alternatively, the expanding step can include expanding the surrounding region by a fixed quantity of text adjacent to the bo
Lewis James R.
Ortega Kerry A.
Akerman & Senterfitt
International Business Machines Corporation
Knepper David D.