Speech to text conversion


Filed: 1998-03-27
Issued: 2001-01-09
Examiner: Hudspeth, David R. (Department: 2741)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Recognition

Classification code: C704S270000
Type: Reexamination Certificate
Status: active
Patent number: 06173259
: ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to apparatus and methods for speech to text conversion using automatic speech recognition, and has various aspects.
BACKGROUND OF THE INVENTION
Automatic speech recognition, as such, is known from, for example, “Automatic Speech Recognition” by Kai-Fu Lee, Kluwer Academic Publishers 1989.
Conventional known systems for converting speech to text using automatic speech recognition are stand-alone desktop systems, in which each user needs his or her own system. Such speech to text conversion systems have been produced by companies such as International Business Machines, Kurzweil Applied Intelligence Inc and Dragon Systems.
These known systems are able to transcribe human speech to text, albeit imperfectly. The text results are presented to the user after a small delay whilst he or she is still dictating. This has a number of disadvantages. Firstly, the near-instantaneous text presentation can confuse the user who is speaking and alter his or her behaviour. Secondly, it requires the user to correct errors personally, usually using a text editor. Accordingly, the user must switch between the tasks of speaking and correcting, resulting in inefficiency.
IBM and Dragon have produced desktop speech to text conversion systems which are adapted to understand the speech of a particular user.
A method of sending text data together with speech data in a single file over a computer network is known from U.S. Pat. No. 55769.
BRIEF SUMMARY OF THE INVENTION
In a first aspect, the present invention relates to a speech to text convertor comprising at least one user terminal for recording speech, at least one automatic speech recognition processor, and communication means operative to return the resulting text to a user, in which said at least one user terminal is remote from said at least one automatic speech recognition processor, the speech to text convertor including a server remote from said at least one user terminal, the server being operative to control transfer of recorded speech files to a selected automatic speech recognition processor.
Preferably, the or each user terminal communicates the recorded speech files to the remote server by electronic mail.
The use of electronic mail enables relaying information from one terminal or machine to another, and preferably allows different operations (including entry to a dictation terminal, application of automatic speech recognition, and operation of a correction terminal) to occur on isolated computer networks. The machines which perform these separate operations need not be connected to any of the same equipment, or a common network, other than that loose (and usually global) network defined by an e-mail system. Furthermore, the respective machines and terminals need not be operated at the same time. All operations can be conducted in a manner that is off-line, i.e. involving batch rather than real time processing. A correction terminal preferably must receive the data from the said at least one automatic recognition processor prior to or simultaneously with the initiation of the correction process at the correction terminal (and, likewise, the automatic speech recognition process cannot proceed before receiving data from a user terminal).
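By way of illustration only, relaying a recorded speech file by electronic mail might be sketched in Python as follows, using the standard library email package. The function name build_dictation_email, the addresses, the subject line and the WAV audio type are assumptions made for this example; the source does not specify a message layout or an audio format. Actual sending (for example over SMTP) is omitted.

```python
from email.message import EmailMessage


def build_dictation_email(sender, server_address, audio_bytes,
                          filename="dictation.wav"):
    """Wrap a recorded speech file in an e-mail addressed to the remote server.

    All names and the audio/wav type are illustrative assumptions.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server_address
    msg["Subject"] = "Dictation job"
    msg.set_content("Recorded speech attached for off-line transcription.")
    # The audio travels as a binary attachment; the mail system is the only
    # link needed between the user terminal and the server.
    msg.add_attachment(audio_bytes, maintype="audio", subtype="wav",
                       filename=filename)
    return msg


# Placeholder bytes stand in for a real recording.
example = build_dictation_email("author@example.com", "jobs@example.com",
                                b"RIFF-placeholder")
attachment = next(example.iter_attachments())
print(attachment.get_filename(), attachment.get_content_type())
```

Because the message is self-contained, the receiving machine can process it at any later time, which matches the batch (off-line) character of the scheme.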
The term “electronic mail” is intended to include Internet “File Transfer Protocol” and “World Wide Web”, the latter being based on the Hypertext Transfer Protocol (HTTP).
The automatic speech recognition processors are preferably distributed remote from the server. The server preferably communicates with at least one speech recognition processor by electronic mail.
The text files resulting from automatic speech recognition are preferably sent to correction units. The correction units are preferably remote from the automatic speech recognition processors. Communications from the automatic speech recognition processors to each correction unit are preferably undertaken under the control of the server, and preferably by electronic mail. The correctors are preferably remotely distributed.
The corrector units can preferably communicate to said at least one user terminal by electronic mail.
In a second aspect, the invention relates to a speech to text convertor comprising at least one user terminal for recording speech, at least one automatic speech recognition processor, and communication means operative to return the resulting text to a user, in which said at least one user terminal is remote from said at least one automatic speech recognition processor, in which electronic mail is used to send text data resulting from automatic speech recognition together with the recorded speech data to a correction unit for manual correction. The text data and speech data are preferably sent together in a single file. The file preferably also includes timing data for relating text to speech. Preferably each word of text has an associated start and end time recorded as part of the timing data. The text data can include text alternatives corresponding to a spoken word.
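As a rough sketch of the single file described above, the following Python example bundles text data, speech data and per-word timing data into one JSON document. The JSON layout, the base64 encoding of the audio, and the names Word, pack, unpack and word_at are all assumptions made for illustration; the source does not define a file format.

```python
import base64
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Word:
    text: str
    start_s: float          # start time of the word in the recording, seconds
    end_s: float            # end time of the word in the recording, seconds
    alternatives: list = field(default_factory=list)  # other candidate words


def pack(words, audio_bytes):
    """Serialise text, timing data and speech data into one string."""
    return json.dumps({
        "audio_b64": base64.b64encode(audio_bytes).decode("ascii"),
        "words": [asdict(w) for w in words],
    })


def unpack(blob):
    """Recover the word list and the raw speech data from a packed string."""
    data = json.loads(blob)
    words = [Word(**w) for w in data["words"]]
    return words, base64.b64decode(data["audio_b64"])


def word_at(words, t):
    """Return the word being spoken at time t (seconds), or None."""
    for w in words:
        if w.start_s <= t < w.end_s:
            return w
    return None


words = [Word("speech", 0.0, 0.4, ["peach"]), Word("to", 0.4, 0.5)]
blob = pack(words, b"\x00\x01")
restored_words, restored_audio = unpack(blob)
print(word_at(restored_words, 0.45).text)
```

With start and end times stored per word, a correction unit could look up which word corresponds to any playback position, or play back only the audio span for a selected word.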
Preferably said at least one user terminal and said at least one automatic speech recognition processor communicate using electronic mail.
Electronic mail can be used for communications between each of said at least one user terminal and a remote server which is operative to control assignment of the speech files to the automatic speech recognition processors. The processors can be distributed remote from each other and the server. Electronic mail can also be used to send text files to output terminals.
As regards the invention in both first and second aspects:
The recorded speech is preferably continuous speech.
The server acts to control assignment of recorded speech files for processing to automatic speech processors by queuing the received speech files and submitting them according to predetermined rules. This allows more efficient use of the available automatic speech recognition resources, according to an off-line or batch processing scheme.
Speech to text conversion can be done as a single fully automatic operation, or as a part-automatic and part-manual operation using the automatic speech recognition processor and corrector unit respectively.
Undertaking the speech to text conversion on a non-interactive, off-line basis saves the user from switching repeatedly between speech recording and speech correction tasks. This results in improved efficiency.
The predetermined rule or rules by which the server queues jobs can be according to urgency or user priority ratings.
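The queuing of received speech files according to urgency or user priority ratings could, for instance, be sketched with a priority queue. The class name JobQueue, the numeric rating scales, the ordering of urgency ahead of user priority, and the first-come tie-break are assumptions for this example; the source states only that jobs are queued according to predetermined rules.

```python
import heapq
import itertools


class JobQueue:
    """Queue of speech files awaiting a recognition processor (illustrative)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # arrival order, used as tie-break

    def submit(self, speech_file, urgency=0, user_priority=0):
        # heapq is a min-heap, so ratings are negated to pop the highest first.
        key = (-urgency, -user_priority, next(self._counter))
        heapq.heappush(self._heap, (key, speech_file))

    def next_job(self):
        """Return the next speech file to process, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[1]


queue = JobQueue()
queue.submit("a.wav")
queue.submit("b.wav", urgency=2)
queue.submit("c.wav", user_priority=1)
queue.submit("d.wav")
order = [queue.next_job() for _ in range(4)]
print(order)  # ['b.wav', 'c.wav', 'a.wav', 'd.wav']
```

In this sketch an urgent job overtakes everything already waiting, while jobs with equal ratings are served in order of arrival.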
The corrector unit preferably includes a visual display unit for display of the text and a manual interface, such as a keyboard and/or mouse and/or a foot pedal control, usable to select text portions.
Correction is effected by the manual operator. The correction can be recorded and transmitted back to the automatic speech recognition processor which undertook the automatic speech recognition for adaption of the operation of the automatic speech recognition processor. These corrections are preferably sent by electronic mail. The adaption has the effect of making the automatic speech recognition more accurate in future processing.
The recorded speech can be sent to the selected correction unit for correction of the text file resulting from automatic speech recognition. The server can control this selection. The choice of correction unit can depend on the accent of the speaker of the recorded speech; in particular, the files can be sent to a correction unit in an area where that accent is familiar, or to a correction unit where the particular human corrector is familiar with that accent.
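A minimal sketch of accent-based selection of a correction unit is given below. The function name choose_correction_unit, the dictionary fields, the accent labels and the addresses are hypothetical; the source does not say how accents are represented or how the server records which correctors are familiar with them.

```python
def choose_correction_unit(speaker_accent, units, default=None):
    """Return the address of the first unit familiar with the speaker's accent.

    `units` is a list of dicts with hypothetical keys "address" and
    "familiar_accents"; `default` is returned when no unit matches.
    """
    for unit in units:
        if speaker_accent in unit["familiar_accents"]:
            return unit["address"]
    return default


units = [
    {"address": "corrector1@example.com", "familiar_accents": {"scottish"}},
    {"address": "corrector2@example.com", "familiar_accents": {"texan", "scottish"}},
]
print(choose_correction_unit("texan", units))
```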
The present invention relates in its various aspects both to apparatus and to corresponding methods.
In a third aspect, the present invention relates to a method of operating apparatus, the apparatus comprising a plurality of connected nodes, the method comprising the steps at a first node of automatically reading an instruction from a sequential series of instructions, executing the instruction which provides resultant variable values, and storing the result

Affiliated with

Bijl David, Inventor
Hyde-Thomson Henry, Inventor

Also associated with

Azad Abul K., Examiner
Hudspeth David R., Examiner
Kilpatrick & Stockton, Law Firm
Speech Machines PLC, Corporate Assignee
