Testing speech recognition systems using test data generated...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S260000, C704S251000

active

06622121

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to speech recognition systems and, more particularly, to methods and systems for testing speech recognition systems.
BACKGROUND OF THE INVENTION
Speech recognition is an important aspect of furthering man-machine interaction. The end goal in developing speech recognition systems is to replace the keyboard interface to computers with voice input. This may make computers more user friendly and enable them to provide broader services to users. To this end, several systems have been developed. The development effort aims at reducing the transcription error rate on real speech in real-life applications. In the course of developing these systems, one needs to compare different approaches by running tests over standardized test data, which is generally recorded speech of a reference script.
The reason for this is that, for fair comparisons and reproducible results, it is essential that all experiments be carried out with exactly the same speech input. Therefore, all systems are tested with the same speakers reading the same script (text or voice commands). Since it is impossible for a speaker to utter the words twice in exactly the same way, and since the background noise also changes from utterance to utterance, the test speech data is recorded once and for all, and then reused for all the tests.
In particular, when the objective is to test the resilience of the system to dictation of widely varied texts, obtaining statistically significant results requires recording very large text corpora spoken by the test speaker(s).
This large amount of text is commonly recorded by a human speaker (or a set of human speakers) who reads reference texts into a microphone in a controlled fashion. The main drawback of human dictation is its cost: recording a massive amount of test material in a controlled fashion is very labor-intensive.
As a consequence of the foregoing difficulties in the prior art, it is an object of the present invention to provide speech recognition systems and methods wherein the test speech material is provided independently of a human speaker.
SUMMARY OF THE INVENTION
The present invention solves the foregoing need by providing systems and associated methods for testing speech recognition systems in which the speech recognition device to be tested is directly monitored in accordance with a text-to-speech device. The reference material to be read by the speech recognition device is supplied, in spoken form, by a text-to-speech device that is preferably implemented within the same computer system.
In one embodiment of the invention, the method comprises the steps of:
a) generating a digital audio file from a reference text using a text-to-speech device, the digital audio file being stored within a storage area of a computer system; and
b) reading the digital audio file using a speech recognition device to generate a decoded text representative of the reference text.
Note that the phrase “decoded text” is used interchangeably with the phrase “recognized text.”
In a further step, the reference text and the decoded text are aligned, and an error report representative of the recognition rate of the speech recognition device is generated.
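Steps a) and b) can be sketched as a small test harness. This is only an illustration under stated assumptions: the function name `run_recognition_test`, the file name `reference.audio`, and the two stand-in engines are hypothetical and do not come from the patent. A real setup would pass in an actual text-to-speech engine and the speech recognition device under test.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_recognition_test(
    reference_text: str,
    synthesize: Callable[[str], bytes],   # text-to-speech device (assumed interface)
    recognize: Callable[[bytes], str],    # speech recognition device (assumed interface)
    audio_path: Path,
) -> str:
    """Return the decoded text produced from a synthesized reference text."""
    # Step a): generate the digital audio file and store it.
    audio_path.write_bytes(synthesize(reference_text))
    # Step b): read the stored audio file with the recognizer.
    return recognize(audio_path.read_bytes())


# Stand-in engines so the sketch runs without any speech software.
# They simply round-trip the text through bytes; they are not real engines.
def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    decoded_text = run_recognition_test(
        "open the file", fake_synthesize, fake_recognize,
        Path(tmp) / "reference.audio",
    )
```

Because the audio is stored once and then read back, the same file can be reused for every recognizer being compared, which is the reproducibility property the background section asks for.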
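The alignment and error report might look like the following sketch. The text does not say which alignment algorithm is used; a word-level edit-distance (Levenshtein) alignment is assumed here because it is a common choice for scoring recognizers. The function name `error_report` and the report fields are hypothetical.

```python
from typing import Dict, Union


def error_report(reference: str, decoded: str) -> Dict[str, Union[int, float]]:
    """Align two texts word by word and count the recognition errors."""
    ref, hyp = reference.split(), decoded.split()
    # dp[i][j] = minimum number of edits turning ref[:i] into hyp[:j].
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = i
    for j in range(1, len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match or substitution
                           dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1)         # insertion

    # Walk back through the table to classify each aligned position.
    report = {"correct": 0, "substitutions": 0, "deletions": 0, "insertions": 0}
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (0 if same else 1):
            report["correct" if same else "substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            report["deletions"] += 1
            i -= 1
        else:
            report["insertions"] += 1
            j -= 1

    errors = report["substitutions"] + report["deletions"] + report["insertions"]
    report["word_error_rate"] = errors / len(ref) if ref else 0.0
    return report


report = error_report("the cat sat on the mat", "the cat sat on mat")
```

In this example the recognizer dropped one of six reference words, so the report records one deletion and a word error rate of one sixth.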
Preferably, the step of generating a digital audio file is realized by:
a1) tokenizing an initial text stored on a storage area of the computer system to generate a tokenized text;
a2) marking up the tokenized text to generate a marked text; and
a3) synthesizing the marked text to generate the digital audio file.
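Sub-steps a1) to a3) can be pictured as a three-stage pipeline. The sketch below is an assumption-laden illustration: the tokenization rule, the `<pause/>` tag, and the `engine` callable are all hypothetical, since the text does not specify the tokenizer, the markup format, or the synthesizer interface.

```python
import re
from typing import Callable, List

PUNCTUATION = {".", ",", "?", "!", ";", ":"}


def tokenize(text: str) -> List[str]:
    """a1) Split the initial text into word and punctuation tokens."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)


def mark_up(tokens: List[str]) -> List[str]:
    """a2) Annotate the tokens; here a hypothetical pause tag follows punctuation."""
    marked = []
    for token in tokens:
        marked.append(token)
        if token in PUNCTUATION:
            marked.append("<pause/>")
    return marked


def synthesize(marked: List[str], engine: Callable[[str], bytes]) -> bytes:
    """a3) Hand the marked text to a synthesizer and return the audio data."""
    return engine(" ".join(marked))


# Stand-in engine so the sketch runs; it is not a real synthesizer.
def fake_engine(marked_text: str) -> bytes:
    return marked_text.encode("utf-8")


tokens = tokenize("Hello, world.")
marked = mark_up(tokens)
audio = synthesize(marked, fake_engine)
```

Keeping the three stages separate means the same tokenized or marked text can be inspected, or fed to a different synthesizer, without repeating the earlier stages.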
In an alternate embodiment, the text-to-speech device is implemented within a first computer system while the speech recognition device is implemented within a second computer system. The method then comprises the steps of:
a) generating synthetic speech from a reference text using the text-to-speech device; and
b) processing the synthetic speech to generate a decoded text representative of the reference text using the speech recognition device.
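The two-system embodiment can be modeled, very loosely, as a producer and a consumer joined by a channel. In this sketch two threads stand in for the two computer systems and a `queue.Queue` stands in for whatever carries the synthetic speech between them; that transport, the sentence splitting, and all function names are assumptions, not details from the text.

```python
import queue
import threading
from typing import Callable, List

END_OF_SPEECH = None  # sentinel marking the end of the synthetic speech


def tts_system(text: str, channel: queue.Queue,
               synthesize: Callable[[str], bytes]) -> None:
    """First system: a) generate synthetic speech, one utterance at a time."""
    for utterance in text.split(". "):
        channel.put(synthesize(utterance))
    channel.put(END_OF_SPEECH)


def asr_system(channel: queue.Queue, recognize: Callable[[bytes], str],
               decoded: List[str]) -> None:
    """Second system: b) process the incoming speech into decoded text."""
    while True:
        audio = channel.get()
        if audio is END_OF_SPEECH:
            break
        decoded.append(recognize(audio))


def run_two_system_test(text: str, synthesize: Callable[[str], bytes],
                        recognize: Callable[[bytes], str]) -> List[str]:
    channel: queue.Queue = queue.Queue()
    decoded: List[str] = []
    producer = threading.Thread(target=tts_system, args=(text, channel, synthesize))
    consumer = threading.Thread(target=asr_system, args=(channel, recognize, decoded))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return decoded


# Stand-in engines that round-trip text through bytes; not real speech engines.
decoded_utterances = run_two_system_test(
    "Open the file. Close the window",
    lambda text: text.encode("utf-8"),
    lambda audio: audio.decode("utf-8"),
)
```

The decoded utterances can then be compared with the reference text in the same way as in the single-system embodiment.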


