Reexamination Certificate
1999-06-30
2001-10-23
Dorvil, Richemond (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
C704S235000, C707S793000
active
06308158
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to document creation systems and, more particularly, to such systems in which speech recognition is performed to convert speech signals into text documents.
BACKGROUND OF THE INVENTION
Recent years have seen significant advances in practical applications of continuous speech recognition (CSR). For example, it is now possible to purchase commercially available CSR application software packages suitable for installation and use in a conventional personal computer for home or office.
It has also been proposed to provide CSR as an additional feature of conventional central dictation systems.
Known central dictation systems take a number of forms. In one typical variety of central dictation system, which is frequently used in hospitals, the hard disk drive or drives of a server computer are used as the central voice recording device. Dictate stations are provided at a number of locations in the hospital to permit physicians to dictate directly onto the central recorder. The dictate stations may be in the form of hand microphones or telephone-style handsets, and are connected by analog or digital signal paths to the central recorder. The dictate stations customarily include control switches which allow the authors to control conventional dictation functions, such as record, stop, rewind, fast forward, play, etc. In addition, the dictate stations typically include a keypad and/or bar code reader to permit the author to enter data to identify himself or herself as well as the patient to whom the dictated material is related.
Typical central dictation systems also include a number of transcription stations connected to the central recorder. The transcription stations commonly include a personal computer which runs a word processing software package, as well as listening and playback-control devices which allow the transcriptionist listening access to voice files stored in the central recorder and control over playback functions. Dictation jobs awaiting transcription in the central recorder are assigned to transcriptionists according to conventional practices.
It has been proposed to incorporate CSR functions in the central recorder/server of a central dictation system. The CSR function is applied at the server to a dictation file to generate a text document, and then the text document and voice file are made available to transcriptionists who edit and correct the text while reviewing the voice files. The preprocessing of the voice files by CSR can be expected to produce significant improvements in productivity for the transcription function.
In a prior co-pending patent application Ser. No. 09/099,501, which is commonly assigned with the present application and entitled “Dictation System Employing Computer-To-Computer Transmission of Voice Files Controlled by Hand Microphone”, it was proposed to provide a central dictation system utilizing networked computers. According to this proposal, some of the networked computers have hand microphones interfaced thereto and constitute dictation stations, whereas others of the networked personal computers have headsets and foot pedals interfaced thereto and constitute transcription stations. The e-mail system of the computer network is used to transport voice files from the dictation stations to the transcription stations. In addition, the e-mail system may be used to forward dictation files into a central dictation recorder, from which transcriptionists can play back the dictation files.
It would be desirable to implement CSR functions in a central dictation system in a manner which increases capacity of the system and lessens burdens on the central recorder/server. It would also be desirable to provide an all-digital system in which the transmission bandwidth of the system is not unduly burdened.
OBJECTS AND SUMMARY OF THE INVENTION
It is an object of the invention to provide a central dictation system in which continuous speech recognition processing is employed.
It is a further object of the invention to provide a central dictation system which makes speaker-dependent CSR available at all input stations, while minimizing burdens upon transmission facilities and the central server computer of the system.
According to an aspect of the invention, there is provided a method of operating a document creation system, the system including a plurality of voice input stations and a server computer connected to exchange data signals with the voice input stations, the method including the steps of logging on to one of the voice input stations; placing the one of the voice input stations in a training mode for training a speech recognition algorithm; dictating into the one of the voice input stations to generate speech signals; analyzing the speech signals to generate acoustic reference files; and uploading the acoustic reference files to the server computer.
According to further aspects of the invention, the logging-on step includes inputting ID data for identifying a person who is performing the logging-on step, and the method further includes the step of uploading the ID data to the server together with the acoustic reference files.
According to further aspects of the invention, the method further includes second logging-on to a second one of the voice input stations, the second logging-on step including inputting author ID data for identifying an author who is performing the second logging-on step, the author being the person who performed the logging-on step previously referred to in connection with the training mode; transmitting to the server computer the author ID data inputted in the second logging-on step; in response to the transmitting step, downloading from the server computer to the second one of the voice input stations the acoustic reference files uploaded to the server computer in the uploading step; dictating into the second one of the voice input stations to generate second speech signals; and applying a speech recognition algorithm to the second speech signals at the second one of the voice input stations by using the downloaded acoustic reference files, to generate text document data from the second speech signals.
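The train-once, dictate-anywhere flow described above can be illustrated with a minimal Python sketch. Every name in it (`ReferenceServer`, `VoiceInputStation`, `analyze`, `recognize`) is hypothetical and does not come from the patent; the analysis and recognition steps are stand-in placeholders, since the specification excerpt does not describe the algorithm itself.

```python
from dataclasses import dataclass, field
from typing import Optional


def analyze(speech_signals: bytes) -> dict:
    """Placeholder for training analysis; returns a stand-in reference file."""
    return {"trained_on_bytes": len(speech_signals)}


def recognize(speech_signals: bytes, refs: dict) -> str:
    """Placeholder for recognition; a real system would return dictated text."""
    return f"<draft text for {len(speech_signals)} bytes of speech>"


@dataclass
class ReferenceServer:
    # Maps author ID -> acoustic reference files uploaded after training.
    reference_files: dict = field(default_factory=dict)

    def upload_references(self, author_id: str, refs: dict) -> None:
        self.reference_files[author_id] = refs

    def download_references(self, author_id: str) -> dict:
        return self.reference_files[author_id]


@dataclass
class VoiceInputStation:
    server: ReferenceServer
    author_id: Optional[str] = None
    refs: Optional[dict] = None

    def log_on(self, author_id: str) -> None:
        # Logging on supplies the ID data; cached references are cleared.
        self.author_id = author_id
        self.refs = None

    def train(self, speech_signals: bytes) -> None:
        # Training mode: analyze locally, then upload the result with the ID.
        self.server.upload_references(self.author_id, analyze(speech_signals))

    def dictate(self, speech_signals: bytes) -> str:
        # Fetch the author's references from the server, recognize locally.
        if self.refs is None:
            self.refs = self.server.download_references(self.author_id)
        return recognize(speech_signals, self.refs)


server = ReferenceServer()
station_a = VoiceInputStation(server)
station_b = VoiceInputStation(server)

station_a.log_on("author-17")
station_a.train(b"\x01\x02\x03\x04")      # training at the first station

station_b.log_on("author-17")
draft = station_b.dictate(b"\x05\x06")    # recognition at a second station
```

In this sketch the server only stores and serves reference files, while analysis and recognition run at the stations, which mirrors the stated aim of lessening the load on the central server.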
According to still further aspects of the invention, the second speech signals are digital signals generated at a first data rate, and the method of the present invention further includes transcoding the second speech signals to form transcoded speech signals which have a second data rate which is lower than the first data rate. For example, the first data rate is preferably on the order of 22 kilobytes per second, which is high enough to support satisfactory performance of the speech recognition algorithm. After or in parallel with the speech recognition processing, the speech signals are transcoded down to, say, one kilobyte per second. At the lower data rate, although some fidelity is lost, the sound quality is still adequate for the purposes of audibly reviewing the transcoded voice file. The transcoded voice file is uploaded to the server computer along with the text document created by applying the speech recognition algorithm to the high-bandwidth speech signals.
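As a rough illustration of what the two example data rates mean for upload size, the following Python sketch computes the size of a voice file at each rate. The five-minute dictation length and the convention that one kilobyte equals 1000 bytes are assumptions made for the arithmetic, not figures from the specification.

```python
# Example rates from the text; 1 kilobyte = 1000 bytes is an assumption.
HIGH_RATE_BYTES_PER_S = 22_000   # rate used for speech recognition
LOW_RATE_BYTES_PER_S = 1_000     # rate after transcoding, used for review


def voice_file_size_bytes(duration_s: int, rate_bytes_per_s: int) -> int:
    """Size of an uncompressed stream of the given duration and data rate."""
    return duration_s * rate_bytes_per_s


duration_s = 5 * 60  # hypothetical five-minute dictation

high_size = voice_file_size_bytes(duration_s, HIGH_RATE_BYTES_PER_S)
low_size = voice_file_size_bytes(duration_s, LOW_RATE_BYTES_PER_S)
reduction = high_size // low_size

print(high_size, low_size, reduction)
```

Under these assumptions the five-minute job occupies 6,600,000 bytes at the recognition rate and 300,000 bytes after transcoding, a 22-fold reduction in what must be uploaded and later downloaded for review.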
According to other aspects of the invention, the document creation system includes a plurality of document review stations (which may also be regarded as transcription stations). The method of the invention preferably includes downloading the transcoded speech signals, the text document data and author ID data from the server computer to one of the document review stations to which a particular dictation job has been assigned. The text document is then edited and corrected at the document review station by the transcriptionist, who audibly plays back and reviews the transcoded speech signals and compares the text document which resulted from the speech recognition algorithm with the transcoded speech signals.
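The review step can be pictured as a job record that bundles the three downloaded items. The Python sketch below is illustrative only: `DictationJob`, `JobServer`, and `apply_corrections` are hypothetical names, and the find-and-replace correction stands in for the manual editing a transcriptionist performs while listening to the audio.

```python
from dataclasses import dataclass


@dataclass
class DictationJob:
    author_id: str
    transcoded_speech: bytes  # low-rate audio kept for audible review
    text: str                 # draft produced by speech recognition


class JobServer:
    def __init__(self):
        self._assigned = {}  # reviewer ID -> list of jobs awaiting review

    def assign(self, reviewer_id: str, job: DictationJob) -> None:
        self._assigned.setdefault(reviewer_id, []).append(job)

    def download_next(self, reviewer_id: str) -> DictationJob:
        # Oldest assigned job first.
        return self._assigned[reviewer_id].pop(0)


def apply_corrections(job: DictationJob, corrections) -> DictationJob:
    """Return a copy of the job with each (wrong, right) pair replaced."""
    text = job.text
    for wrong, right in corrections:
        text = text.replace(wrong, right)
    return DictationJob(job.author_id, job.transcoded_speech, text)


job_server = JobServer()
job_server.assign(
    "reviewer-1",
    DictationJob("author-17", b"\x00\x01", "the patient has new monia"),
)

job = job_server.download_next("reviewer-1")
corrected = apply_corrections(job, [("new monia", "pneumonia")])
```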
According to another aspect of the invention, there is provided a central dictation system, including a server computer, a plurality of voice input stations, and a data communication network connecti
Howes Simon L.
Kuhnen Regina
Larossa-Greene Channell
Dictaphone Corporation
Dorvil Richemond
Kramer Levin Naftalis & Frankel LLP
McFadden Susan
Neff, Esq. Gregor N.