Method and apparatus for evaluating the accuracy of a speech...

Filing date: 1999-05-04
Issue date: 2003-01-14
Examiner: Chawan, Vijay (Department: 3654)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Recognition

US classification: C704S244000, C704S243000, C704S251000, C704S256000
Type: Reexamination Certificate
Status: active
Patent number: 06507816
ABSTRACT:

CROSS REFERENCE TO RELATED APPLICATIONS
(Not Applicable)
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
(Not Applicable)
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates to the field of speech recognition computer applications and more specifically to a system for evaluating how accurately dictated words are discerned by a speech recognition system.
2. Description of the Related Art
Speech recognition is the process by which acoustic signals, received via a microphone, are “recognized” and converted into words by a computer. These recognized words may then be used in a variety of computer software applications. For example, speech recognition may be used to input data, prepare documents and control the operation of software applications. Speech recognition systems programmed or trained to the diction and inflection of a single person can successfully recognize the vast majority of words spoken by that person.
When such a system is to be used by a large number of speakers, however, it is very difficult for the system to accurately recognize all of the spoken words because of the wide variety of pronunciations, accents and divergent speech characteristics of the individual speakers. Due to these variations, the speech recognition system may not recognize some of the speech and some words may be converted erroneously. This may result in spoken words being converted into different words (“hold” recognized as “old”), improperly conjoined spoken words (“to the” recognized as “tooth”), and spoken words recognized as homonyms (“boar” instead of “bore”).
The erroneous words may also result from improper technique of the speaker. For example, the speaker may be speaking too rapidly or softly, slurring words or positioned at an improper distance from the microphone. In this case, the recognition software will likely generate a large number of mis-recognized words.
Conventional speech recognition systems often include a means for the user to retroactively rectify these errors following the dictation. Typically, this is accomplished by providing a correction “window” for interfacing with the user. To simplify the correction process, most such correction windows provide a list of suggested or alternate words that in some way resemble the dictated words. This is accomplished by executing an algorithm as is known in the art, one much like a spell checking program in word processing applications, to search a system database for words with characteristics similar to those of the incorrectly recognized words. The algorithm outputs a list of one or more alternate words from which the user may select the intended word. If the intended word is not in the alternate list, the user may type it in. After the intended word is selected or keyed in, the algorithm substitutes the corrected word for the erroneous word.
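The correction flow described above can be sketched roughly as follows. This is only an illustration: spelling similarity from Python's standard `difflib` module stands in for whatever resemblance measure a real recognizer uses, and the names `suggest_alternates` and `apply_correction`, the sample vocabulary, and the cutoff value are all assumptions not taken from the text.

```python
import difflib


def suggest_alternates(recognized, vocabulary, limit=3):
    """Return up to `limit` vocabulary words that resemble `recognized`.

    Assumption: resemblance is approximated by spelling similarity.
    """
    candidates = [w for w in vocabulary if w != recognized]
    return difflib.get_close_matches(recognized, candidates, n=limit, cutoff=0.6)


def apply_correction(words, position, intended):
    """Substitute the intended text for the erroneous word at `position`."""
    return words[:position] + intended.split() + words[position + 1:]


vocabulary = ["hold", "old", "bold", "cold", "table"]  # hypothetical database
text = ["please", "old", "the", "line"]

alternates = suggest_alternates("old", vocabulary)
# The user picks "hold" from the list, or types it in if it is absent.
corrected = apply_correction(text, 1, "hold")

print(alternates)
print(" ".join(corrected))
```

Because the intended text is split into words before substitution, the same helper also covers a correction such as “tooth” back to “to the”.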
Although the alternate list simplifies the correction process, it does not aid in preventing the occurrence of mis-recognized text. For this, conventional speech recognition systems typically utilize “help screens” or online tutorials that the user may search to find information on a specific query or topic. Although the help files may provide information regarding possible solutions to the mis-recognition, they typically do not provide feedback specific to the speaker or dictation session.
Principally, this is because typical speech recognition systems do not track the frequency with which words are mis-recognized for each dictation session. Thus, it would be desirable to provide a simple correction-tracking system for evaluating the recognition accuracy of speech recognition systems, which can be employed to provide users with specific solutions to mis-recognition problems.
SUMMARY OF THE INVENTION
The present invention provides a simple method and system for evaluating the accuracy of a speech recognition system. The invention indexes one or more parameters for each dictation session and uses the parameters to calculate one or more accuracy ratios.
Specifically, the present invention provides a method and system for evaluating how accurately dictated words are recognized during a dictation session by a speech recognition system. The present invention counts the number of dictated words so as to create a total word index and counts the number of mis-recognized words to create a correction index. Then, the correction index is subtracted from the total word index to create a recognition index. An accuracy value is calculated as the ratio of the recognition index to the total word index.
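The calculation in the preceding paragraph amounts to two counters and one division. A minimal sketch, assuming a hypothetical `SessionCounts` container and sample counts chosen only for illustration:

```python
from dataclasses import dataclass


@dataclass
class SessionCounts:
    """Hypothetical container for the two counts kept per dictation session."""
    total_words: int   # total word index: number of dictated words
    corrections: int   # correction index: number of mis-recognized words


def accuracy(counts: SessionCounts) -> float:
    """Return recognition index / total word index for one session."""
    if counts.total_words <= 0:
        raise ValueError("total word index must be positive")
    recognition_index = counts.total_words - counts.corrections
    return recognition_index / counts.total_words


session = SessionCounts(total_words=200, corrections=14)
print(f"{accuracy(session):.2%}")  # 186 / 200 -> 93.00%
```

The guard against an empty session is an addition of this sketch; the text does not say how a session with no dictated words should be treated.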
The present invention tracks the total number of words dictated as well as the number of corrections for each dictation session. These values are used to estimate the accuracy of the speech recognition system for a specific dictation session. One object and advantage of the present invention is that the calculated accuracy ratio can be used to initiate problem solving applications or procedures based on the performance of each dictation session.
Another object and advantage of the present invention is that it does not require a great deal of computer memory or processing power. The present invention requires minimal mathematical manipulation and data storage. Simple counting and calculation processes are all that is needed to perform the present invention.
In a preferred embodiment of the present invention, the accuracy of the speech recognition system may also be evaluated according to the number of times corrected words are within a word database as well as the number of times the intended words are suggested in a list of one or more alternate terms. Specifically, each mis-recognized word is compared to at least one alternate word. An alternate index counts each time one of the alternate words is the word intended by the speaker. If the intended word is not within the alternate list, the user inputs one or more corrected words. Then, the number of corrected words not contained in the word database is counted to create an out-of-vocabulary index. The total word index is adjusted if the intended term that was suggested as an alternate or input by the user contained more than one word. In this embodiment, the correction index counts only the number of corrected words in either the alternate list or the word database, and the recognition index is this correction index subtracted from the total word index. The accuracy value is again calculated as the ratio of the recognition index to the total word index.
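One way to read this embodiment is sketched below. Several details are assumptions of the sketch rather than statements from the text: each mis-recognized unit is taken to have been counted once in the total word index, so a multi-word intended term adds its extra words; the correction index is incremented per word; and the names `Indices` and `evaluate`, the vocabulary, and the sample corrections are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Indices:
    total_words: int = 0        # total word index
    alternate: int = 0          # intended term was offered as an alternate
    out_of_vocabulary: int = 0  # typed words missing from the word database
    correction: int = 0         # corrected words found in list or database

    @property
    def accuracy(self) -> float:
        if self.total_words <= 0:
            raise ValueError("total word index must be positive")
        recognition = self.total_words - self.correction
        return recognition / self.total_words


def evaluate(recognized_word_count, corrections, vocabulary):
    """corrections: iterable of (intended_text, alternate_list) pairs."""
    idx = Indices(total_words=recognized_word_count)
    for intended, alternates in corrections:
        words = intended.split()
        # Assumption: the erroneous unit was counted as one word.
        idx.total_words += len(words) - 1
        if intended in alternates:
            idx.alternate += 1
            idx.correction += len(words)
            continue
        for word in words:
            if word.lower() in vocabulary:
                idx.correction += 1
            else:
                idx.out_of_vocabulary += 1
    return idx


vocabulary = {"hold", "old", "to", "the", "tooth"}
corrections = [
    ("hold", ["hold", "fold"]),      # picked from the alternate list
    ("to the", ["truth", "booth"]),  # typed in, both words in the database
    ("Ortega", ["or", "order"]),     # typed in, not in the database
]
result = evaluate(100, corrections, vocabulary)
print(result, f"{result.accuracy:.4f}")
```

Under these assumptions an out-of-vocabulary word does not lower the accuracy value, since the recognizer could not have produced a word missing from its database; it is tracked separately in its own index.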
Thus, an additional object and advantage of the present invention is that it provides an accuracy value according to one or more parameters. This affords a more thorough evaluation of the accuracy of the speech recognition system.
The system and method of the present invention can also sum each respective index for each dictation session and calculate one or more overall accuracy ratios. Thus, the invention provides yet another object and advantage in that it can also approximate the accuracy of the speech recognition system independent of specific users or dictation sessions.
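Summing the indices across sessions before dividing can be sketched as below. The function name `overall_accuracy` and the sample session counts are illustrative assumptions; each session is represented simply as a pair of total word index and correction index.

```python
def overall_accuracy(sessions):
    """Sum each index over all sessions, then take a single ratio.

    sessions: iterable of (total_word_index, correction_index) pairs.
    """
    sessions = list(sessions)
    total = sum(t for t, _ in sessions)
    corrections = sum(c for _, c in sessions)
    if total <= 0:
        raise ValueError("summed total word index must be positive")
    return (total - corrections) / total


sessions = [(200, 14), (150, 30), (50, 1)]  # hypothetical dictation sessions
print(f"{overall_accuracy(sessions):.4f}")  # 355 / 400 -> 0.8875
```

Summing first weights each session by its length. Averaging the three per-session ratios (0.93, 0.80, 0.98) would instead give about 0.9033, letting the short 50-word session count as much as the 200-word one.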

REFERENCES:
patent: 5712957 (1998-01-01), Waibel et al.
patent: 5799213 (1998-08-01), Mitchell et al.
patent: 5829000 (1998-10-01), Huang et al.
patent: 5855000 (1998-12-01), Waibel et al.
patent: 5864805 (1999-01-01), Chen et al.
patent: 5909667 (1999-06-01), Leontiades et al.
patent: 5960447 (1999-09-01), Holt et al.
patent: 5970460 (1999-10-01), Bunce et al.
patent: 6064959 (2000-05-01), Young et al.
patent: 6138099 (2000-10-01), Lewis et al.
patent: 6185530 (2001-02-01), Ittycheriah et al.
patent: 6195637 (2001-02-01), Ballard et al.

Affiliated with

Ortega Kerry A.

Inventor

Also associated with

Chawan Vijay

Examiner

International Business Machines - Corporation

Corporate Assignee

Senterfitt Akerman

Attorney
