Transcription system for multiple speakers, using and...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Application

Reexamination Certificate

Details

C704S231000, C704S246000, C704S251000, C704S273000, C704S235000

Reexamination Certificate

active

06332122

ABSTRACT:

TECHNICAL FIELD
This invention relates to the field of speech recognition software and more particularly to transcribing speech from multiple speakers input through a single channel using speech-to-text conversion techniques.
DESCRIPTION OF THE RELATED ART
Transcription is an old art that, up until the relatively recent past, has been performed by manually typing a recorded message into an electronic or physical document. More recently, speech-to-text conversion techniques have been employed to automatically convert recorded speech into text.
A difficulty arises with manual or automatic transcription techniques when multiple speakers are recorded onto a single recording (e.g., as in a recorded meeting or court proceeding). In most cases, it is desirable to identify which of the multiple speakers uttered the various phrases being transcribed. This is particularly true in court proceedings, for example, where an attorney may utter some phrases, a witness may utter others, and a judge may utter still others.
In order to automatically associate an individual with a phrase, it would be necessary to couple speaker recognition technology with the speech-to-text conversion software. Currently available speech recognition systems often require a speaker to enroll in the system prior to use. A speaker-dependent speech recognition model is developed for the enrolled speaker to optimize the quality of the text transcribed from the enrolled speaker's speech. In some cases, new people who have not enrolled in the speech system will join a meeting, and in many cases these users will also participate in future meetings. Enrollment is not always feasible, and the necessity for enrollment would limit the usefulness of the transcription system.
Therefore, other methods of separating each speaker's uttered phrases are desirable. In some prior art techniques, each speaker is provided with a separate microphone, and the signals are combined into a single recording. A transcriber would then listen to the recording and attempt to create a document by typing the speakers' statements in sequential order. However, this solution is non-optimal, because it requires the transcriber to differentiate between multiple speakers. Furthermore, any method, whether manual or automated, that requires a separate channel (such as a microphone) for each speaker to input speech into a recording increases the cost of the system.
What is needed is a method and apparatus for transcribing a recording of multiple speakers (enrolled and unenrolled) input through a single channel. What is further needed is a method and apparatus for reprocessing the transcribed text of unenrolled speakers using the input speech, to optimize the quality of both the transcribed text and future transcriptions from those speakers.
SUMMARY OF THE INVENTION
The invention provides a method of transcribing text from multiple speakers in a computer system having a speech recognition application. The system receives speech from one of a plurality of speakers through a single channel and assigns a speaker ID to the speaker. The system then processes the speech into text using a speech recognition model, creates a document containing the text, and associates the processed speech and the text with the speaker ID assigned to the speaker. To detect a speaker change, the system monitors the speech input through the channel.
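The summary does not state how the monitoring step decides that the speaker has changed. As a minimal sketch only, assume each short frame of speech has already been reduced to a numeric voice feature vector, and that a change is flagged when consecutive frames differ by more than a threshold. The function names, the feature vectors, and the threshold value are illustrative assumptions, not part of the patent.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def detect_speaker_changes(frames, threshold=0.3):
    """Return the indices of frames whose voice differs from the previous frame.

    frames: list of voice feature vectors, in the order received on the channel.
    threshold: hypothetical cutoff; a real system would tune or replace it.
    """
    changes = []
    for i in range(1, len(frames)):
        if cosine_distance(frames[i - 1], frames[i]) > threshold:
            changes.append(i)
    return changes
```

For example, `detect_speaker_changes([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]])` flags a single change at index 2, where the vectors swing from one direction to the other.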
In another aspect of the present invention, the system transcribes the speech input into the system using a speech recognition model, and associates the transcribed speech and the text with the speaker ID assigned to the speaker. When there is a speaker change, the system assigns a different speaker ID to the different speaker. If the current speaker is an unenrolled speaker (i.e., the system does not have a speaker-dependent speech recognition model associated with the assigned speaker ID), speech and text from the unenrolled speaker can be used to enroll the speaker.
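The enrollment idea in this aspect can be illustrated with a small bookkeeping sketch: speech and text are accumulated per speaker ID until enough material exists to enroll the speaker. The class names, the `min_samples` cutoff, and the use of a boolean flag in place of actually training a speaker-dependent model are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SpeakerRecord:
    speaker_id: str
    enrolled: bool = False  # True once a speaker-dependent model would exist
    samples: list = field(default_factory=list)  # (speech, text) pairs kept for enrollment


class SpeakerRegistry:
    def __init__(self, min_samples=3):
        # Hypothetical amount of data needed before a speaker can be enrolled.
        self.min_samples = min_samples
        self.records = {}

    def new_speaker(self):
        """Assign a fresh speaker ID to a previously unseen speaker."""
        speaker_id = f"speaker-{len(self.records) + 1}"
        self.records[speaker_id] = SpeakerRecord(speaker_id)
        return speaker_id

    def add_sample(self, speaker_id, speech, text):
        """Store speech and text for an unenrolled speaker; return enrollment status."""
        record = self.records[speaker_id]
        if not record.enrolled:
            record.samples.append((speech, text))
            if len(record.samples) >= self.min_samples:
                # Stand-in for building a speaker-dependent recognition model.
                record.enrolled = True
        return record.enrolled
```

Keeping the speech alongside the text is what would allow the stored utterances to be reprocessed once a speaker-dependent model becomes available.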
According to yet another aspect, the invention may be embodied in a computer system having a text independent speech recognition application adapted for transcribing text from multiple speakers. In that case, the system includes application programming responsive to speech from a speaker of a plurality of speakers through a single channel. The system has additional programming for recognizing the voice of a speaker and assigning a speaker ID to the speaker, and for monitoring the speech for a speaker change to a different speaker.
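One possible reading of "recognizing the voice of a speaker and assigning a speaker ID" is a nearest-profile lookup: an incoming voice is compared with stored voice profiles, and a new ID is created when nothing is close enough. The sketch below assumes voices are represented as numeric feature vectors; the function name, the Euclidean distance measure, and the `max_distance` cutoff are illustrative assumptions rather than details from the patent.

```python
import math


def assign_speaker_id(voice, profiles, max_distance=0.5):
    """Return the ID of the closest stored profile, or register a new speaker.

    voice: feature vector for the current speech.
    profiles: dict mapping speaker ID to a stored feature vector (mutated in place).
    max_distance: hypothetical cutoff beyond which the voice counts as new.
    """
    best_id, best_dist = None, float("inf")
    for speaker_id, profile in profiles.items():
        d = math.dist(voice, profile)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_id, best_dist = speaker_id, d
    if best_id is not None and best_dist <= max_distance:
        return best_id
    new_id = f"speaker-{len(profiles) + 1}"
    profiles[new_id] = list(voice)
    return new_id
```

Because no spoken passphrase is involved, a lookup of this kind does not depend on what the speaker says, which is in keeping with the text-independent character described above.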
Finally, the invention can take the form of a machine readable storage having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform a set of steps including:
receiving speech from one of a plurality of speakers through a single channel;
assigning a speaker ID to said speaker providing speech through said channel;
processing said speech into text using a speech recognition model;
creating a document containing said text;
associating said processed speech and said text with said speaker ID assigned to said speaker; and
monitoring said speech for a speaker change to a different speaker of said plurality of speakers.
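The steps listed above can be sketched end to end as follows. This is a simplified illustration, not the claimed implementation: it assumes the single-channel input has already been split into segments carrying a voice label, and it takes the recognition model as a plain callable. The names `transcribe`, `render`, and `recognize` are hypothetical.

```python
def transcribe(segments, recognize):
    """Build a speaker-attributed document from single-channel speech.

    segments: iterable of (voice, speech) pairs in the order received.
    recognize: callable that turns speech into text (stand-in for a recognition model).
    """
    speaker_ids = {}   # voice -> assigned speaker ID
    document = []      # (speaker_id, speech, text) entries
    current_voice = None
    for voice, speech in segments:
        if voice != current_voice:  # monitoring detected a speaker change
            current_voice = voice
            speaker_ids.setdefault(voice, f"Speaker {len(speaker_ids) + 1}")
        text = recognize(speech)
        # Associate both the processed speech and the text with the speaker ID.
        document.append((speaker_ids[voice], speech, text))
    return document


def render(document):
    """Format the document as one line per utterance, prefixed by speaker ID."""
    return "\n".join(f"{sid}: {text}" for sid, _speech, text in document)
```

A speaker who returns after an interruption keeps the ID assigned earlier, so a court transcript, for instance, would attribute every utterance by the same witness to the same label.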
These and still other objects and advantages of the present invention will be apparent from the description which follows. In the detailed description below, preferred embodiments of the invention will be described in reference to the accompanying drawings. These embodiments do not represent the full scope of the invention. Rather the invention can be employed in other embodiments. Reference should therefore be made to the claims herein for interpreting the breadth of the invention.


REFERENCES:
patent: 5526407 (1996-06-01), Russell et al.
patent: 5572624 (1996-11-01), Sejnoha
patent: 5895447 (1999-04-01), Ittycheriah et al.
patent: 6023675 (2000-02-01), Bennett et al.
patent: 6067517 (2000-05-01), Bahl et al.
patent: 6073101 (2000-06-01), Maes
patent: 6088669 (2000-07-01), Maes
patent: 6094632 (2000-07-01), Hattori
