Voice recognition of telephone conversations

Telephonic communications – Audio message storage, retrieval, or synthesis – Multimedia system

Reexamination Certificate


Details

C379S265080, C379S068000, C379S088010, C379S088270, C379S267000

active

06278772

ABSTRACT:

FIELD OF INVENTION
This invention relates to voice recognition of telephone conversations.
BACKGROUND OF INVENTION
Some institutions, particularly in the finance and insurance industries, record their conversations with their clients for evidence in case of a dispute. For example, an insurance company may record half a million conversations a year, mainly the details of insurance claims, to ensure that the details remain consistent. Increasingly, such conversations are stored as voice data, usually on optical storage media. They can then be associated with the customer's records and retrieved if necessary.
One problem with voice data is that it requires large storage resources even when compressed and therefore takes a long time to retrieve. One second of uncompressed voice data typically requires 64k bytes of memory, or 12.8k bytes of memory with 80% compression of the voice data. Based on these assumptions, one minute of compressed conversation requires 770k bytes of memory, and a medium sized company with 20 operators each averaging 4 hours of conversation per day would require about 3700M bytes of storage, or 6 CDs per day at current storage capacities (about 700M bytes per CD at present). Normally the required conversation will be on a CD ROM in an archive and would need to be manually retrieved and loaded into a CD ROM drive to access the data. A large number of CD ROMs are required, which can take time to physically load into and out of a CD ROM read/write device even when using a jukebox machine. A manual search for a particular conversation can take considerable time even if the approximate date and time are known. Once one has found the particular conversation, one still needs to locate the relevant part, and this takes further time.
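The storage figures in the preceding paragraph can be checked with a short calculation. The following is a minimal sketch in Python; the constant and function names are illustrative and not part of the patent, and "k" and "M" are assumed to be decimal multiples.

```python
import math

# Figures taken from the paragraph above; all names are illustrative only.
UNCOMPRESSED_BYTES_PER_SECOND = 64_000   # "64k bytes" per second of voice
PERCENT_KEPT_AFTER_COMPRESSION = 20      # 80% compression leaves 20%
CD_CAPACITY_BYTES = 700_000_000          # "about 700M bytes per CD"


def compressed_bytes_per_second() -> int:
    """Bytes needed for one second of compressed voice."""
    return UNCOMPRESSED_BYTES_PER_SECOND * PERCENT_KEPT_AFTER_COMPRESSION // 100


def daily_voice_bytes(operators: int, hours_each: int) -> int:
    """Bytes of compressed voice recorded per day by the whole company."""
    seconds = operators * hours_each * 3600
    return seconds * compressed_bytes_per_second()


per_minute = compressed_bytes_per_second() * 60
per_day = daily_voice_bytes(operators=20, hours_each=4)
cds_per_day = math.ceil(per_day / CD_CAPACITY_BYTES)

print(f"{per_minute // 1000}k bytes per minute")
print(f"{per_day // 1_000_000}M bytes per day, {cds_per_day} CDs")
```

Under these assumptions the calculation gives 768k bytes per minute and 3686M bytes per day, which the text rounds to 770k and about 3700M, and 6 CDs per day once the partly filled last disc is counted.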
A further problem is that the required information cannot be directly located. The whole conversation needs to be listened to before the relevant information can be located.
Furthermore it is difficult to analyze the voice data for statistical information, for example for marketing and management purposes.
European Patent publication 0633642 discloses audio data processing apparatus connected to a telephone network for recording voice data from a caller, for instance, in placing a catalog order in response to computer generated prompts. The apparatus records the voice data associated with the computer generated prompts, performs voice recognition on the voice data and uses the text data to place the catalog order.
European Patent publication 0664636 discloses an audio conferencing system comprising a network of workstations having voice conference and speech recognition software. Speech from one workstation is converted into text by the recognition software on that workstation and both the speech and text are transmitted to the other workstations in the conference by the conferencing software. The received text, together with locally generated text is stored in a text file so as to produce a set of minutes for the audio conference.
SUMMARY OF INVENTION
According to an aspect of the invention there is provided a method for processing a telephone conversation in a voice processing system, said method comprising the steps of: receiving voice data representing the telephone conversation comprising a first series of speech data from an agent interspersed with a second series of speech data from a client; storing the first and second series of speech data as a single body of voice data for later retrieval; performing a voice recognition function on the voice data to convert them into text data representing the telephone conversation; and storing the text data for later retrieval.
Such a solution allows entire days, weeks or even months of conversation to be stored as text and accessed on-line for quick and easy retrieval. One minute of conversation may comprise about 300 spoken words, which when converted into text require on the order of 3k bytes of memory, or about 0.4% of the memory that compressed voice uses. For instance, it is possible to search for conversations on a particular topic using key words over several months of stored conversation. The conversation details are fully available to the searcher at a glance, whereas previously one would have to listen to the entire conversation to find the relevant parts. This is much more convenient for agents to handle, resulting in increased staff efficiency. The medium sized company referred to above could store 50 days' worth of conversation on a single CD ROM.
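The comparison between text and compressed voice can be reproduced in the same way. This is a rough sketch using the figures quoted in the background and in the paragraph above; the variable names are illustrative, and decimal multiples are assumed for "k" and "M".

```python
# Figures quoted in the text; names are illustrative only.
WORDS_PER_MINUTE = 300
TEXT_BYTES_PER_MINUTE = 3_000        # "of order 3k bytes" per minute as text
VOICE_BYTES_PER_MINUTE = 770_000     # compressed voice, from the background
CD_CAPACITY_BYTES = 700_000_000      # "about 700M bytes per CD"
MINUTES_PER_DAY = 20 * 4 * 60        # 20 operators, 4 hours each

bytes_per_word = TEXT_BYTES_PER_MINUTE / WORDS_PER_MINUTE
text_share = TEXT_BYTES_PER_MINUTE / VOICE_BYTES_PER_MINUTE
daily_text_bytes = MINUTES_PER_DAY * TEXT_BYTES_PER_MINUTE
days_per_cd = CD_CAPACITY_BYTES / daily_text_bytes

print(f"{bytes_per_word:.0f} bytes per word")
print(f"text uses {text_share:.1%} of the compressed voice storage")
print(f"{daily_text_bytes / 1e6:.1f}M bytes of text per day")
print(f"{days_per_cd:.1f} days of text per CD")
```

With these inputs a day of conversation occupies 14.4M bytes as text, so one CD holds a little under 49 days, which the text rounds to 50.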
Since the memory space required is considerably smaller for text storage it is possible to keep many days of conversation in directly accessible memory that may be searched by a computer. Furthermore it is possible to search for keywords typed from the keyboard and it is not necessary to manually scan the entire conversation for the desired topic.
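A keyword search over stored text of the kind described above might look like the following minimal sketch. The `Transcript` record, the `search` function and the sample conversations are hypothetical; the patent does not specify a search method, and plain case-insensitive substring matching is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    call_id: str   # hypothetical identifier for a stored conversation
    text: str      # recognized text of the whole conversation


def search(transcripts, keywords):
    """Return the transcripts whose text contains every keyword.

    Matching is a case-insensitive substring test (an assumption).
    """
    wanted = [k.lower() for k in keywords]
    return [t for t in transcripts
            if all(k in t.text.lower() for k in wanted)]


# Invented sample data for illustration.
transcripts = [
    Transcript("call-001", "I would like to report a claim for storm damage to my roof."),
    Transcript("call-002", "Please update the address on my motor policy."),
    Transcript("call-003", "The roof claim was settled last month."),
]

matches = search(transcripts, ["claim", "roof"])
print([t.call_id for t in matches])
```

Because the whole text is held in directly accessible memory, the search visits every stored conversation without any disc being loaded.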
Because the conversation is searchable in text form, it is now possible to apply statistical tools to the data to extract business information.
In one embodiment of the invention the first series of speech data from the agent and second series of speech data from the client are received as a single sequence of voice data whereby the single sequence of voice data is a direct digital representation of the telephone conversation. One way of achieving this could be for the voice processing system to act as a third party in a conference call between said agent and said client.
In another embodiment of the invention the first series of speech data from the agent is received and sent to the client and the second series of speech data from the client is received and sent to the agent. The advantage with receiving the speech data separately from the agent and the client is that the voice processing system can distinguish with ease which data belongs to which party. The method may then advantageously comprise combining the first and second series of speech data to form a single sequence of voice data which represents the whole telephone conversation. Preferably the method further comprises the step of storing an identifier associated with each message indicating the originator of the message and/or indicating the start time of the message.
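The combining step and the per-message identifier can be illustrated with a small data structure. This is a sketch under stated assumptions: the `Message` record, its field names and the use of seconds from the start of the call are hypothetical, and each series is assumed to arrive already ordered by start time.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    start_time: float   # seconds from the start of the call (assumed unit)
    originator: str     # "agent" or "client"
    audio: bytes        # speech data for this message


def combine(agent_msgs, client_msgs):
    """Merge two time-ordered series into one sequence ordered by start time."""
    return list(heapq.merge(agent_msgs, client_msgs,
                            key=lambda m: m.start_time))


# Invented sample data for illustration.
agent = [Message(0.0, "agent", b"a1"), Message(7.5, "agent", b"a2")]
client = [Message(3.2, "client", b"c1"), Message(11.0, "client", b"c2")]

conversation = combine(agent, client)
print([(m.start_time, m.originator) for m in conversation])
```

Keeping the originator and start time on each message means the single combined sequence still records which party spoke and when.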
An advantageous use of the invention is to perform the voice recognition function at a time when performance of other functions in the voice processing system is not at a maximum, that is, when there is spare capacity in the voice processing system. Typically the voice recognition function would be performed at a time when performance of other functions in the voice processing system is substantially at a minimum, that is, when the system is off-line after the call centre closes for the night.
Preferably the voice data is stored before the conversion process takes place so that the conversion process can be performed off-line. This allows system resources to be used for tasks that can be deferred to later times, during off-peak hours or at night when no conversations need to be recorded.
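Storing the voice data first and converting it later can be sketched as a simple queue that is drained only during off-peak hours. Everything named here is hypothetical: the `DeferredRecognizer` class, the off-peak window of 20:00 to 06:00, and the `recognize` callable, which stands in for whatever voice recognition function the system provides.

```python
from collections import deque
from datetime import time

# Assumed off-peak window; the patent gives no specific hours.
OFF_PEAK_START = time(20, 0)
OFF_PEAK_END = time(6, 0)


def is_off_peak(now: time) -> bool:
    """True if `now` falls in the window, which wraps past midnight."""
    return now >= OFF_PEAK_START or now < OFF_PEAK_END


class DeferredRecognizer:
    def __init__(self, recognize):
        self._recognize = recognize   # hypothetical callable: bytes -> str
        self._pending = deque()       # stored voice awaiting conversion
        self.texts = {}               # call_id -> recognized text

    def store_voice(self, call_id: str, voice: bytes) -> None:
        """Record the voice data now; conversion happens later."""
        self._pending.append((call_id, voice))

    def run_if_off_peak(self, now: time) -> int:
        """Convert all pending voice data if it is off-peak; return the count."""
        if not is_off_peak(now):
            return 0
        converted = 0
        while self._pending:
            call_id, voice = self._pending.popleft()
            self.texts[call_id] = self._recognize(voice)
            converted += 1
        return converted


# Stand-in recognizer for illustration only: decodes bytes as ASCII text.
recognizer = DeferredRecognizer(lambda voice: voice.decode("ascii"))
recognizer.store_voice("call-001", b"hello")
print(recognizer.run_if_off_peak(time(14, 0)))    # daytime: nothing converted
print(recognizer.run_if_off_peak(time(23, 30)))   # night: queue is drained
```

Because conversion is decoupled from recording, the recognizer is free to run slower than real time without affecting live calls.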
With present technology the conversion process is performed at a rate slower than real time, so that the most accurate conversion process can be used and recognition errors minimized. However, for the majority of purposes the recognition accuracy need not be 100%.
In another embodiment the conversion step is performed before the storing step and the voice and text data are stored together on non-volatile recording medium. This keeps both forms of data together for easy reference. Advantageously the voice and text data are stored on a CD ROM.
The text data by itself is additionally or alternatively stored on non-volatile recording medium for easy reference. The text version alone would suffice for many uses, whereas a voice version would be necessary for legal purposes, such as evidence in a court of law, and to check the accuracy of the voice recognition.
In the preferred embodiment both the voice and text data are stored on a CD ROM and in another embodiment the text data is also stored in a volatile recording medium.
It should be appreciated that the invention is independent of the precise archite
