Reexamination Certificate
1999-10-05
2002-04-02
Korzuch, William (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S270000, C379S088060, C379S090010
active
06366879
ABSTRACT:
TECHNICAL FIELD OF THE INVENTION
This invention relates to performance monitoring of an interactive voice response system and in particular to calculating metrics for the real time responsiveness of interactive voice response systems.
BACKGROUND OF THE INVENTION
Interactive Voice Response (IVR) systems are an important tool in the Customer Relationship Management (CRM) arena. The IVR system plays a critical role in providing information and features to existing and potential business customers who use the telephone as an interface to an enterprise. The usability, responsiveness and performance of solutions using IVR applications are paramount, since these customers will experience the IVR as a “front-office” of a business.
The telephone is the most basic of I/O devices, as the only output is the delivery of speech segments played to the caller. A poorly performing IVR platform or voice application will be very apparent to the telephone caller, and will cause frustration, with the potential for an unexpected transfer to a human agent, or worse, a lost call and possibly lost business. Measuring the end-user responsiveness of an IVR application in a realistic, high-volume situation is difficult, whether during acceptance testing or afterwards, when in service.
All IVR applications play voice segments, or prompts, to the caller. In fact, a large percentage of time in a typical telephone call to an IVR is taken up with the playing of voice segments. High capacity IVR systems (with many hundreds of telephony ports) or clusters of IVR systems grouped together in a Single System Image complex (where voice data resides on a separate database server) have the potential to become I/O bound because of the higher volume of voice data being played. This could lead to a degradation in responsiveness.
A particular problem in IVR systems is ‘underrun’, where there is a delay between the play of consecutive chunks of a voice segment; delays of over 200 msec are quite noticeable to the human ear. Previously, external devices have been used to listen to telephony channels to measure expected tones and underrun delays. These devices look for the absence of sound or the gap between the chunks of voice but cannot measure when underrun is about to occur. When underrun does occur and is detected by the ‘listener’, it is too late to do anything about it.
This disclosure describes how new IVR metrics can be used to characterise an IVR application, to facilitate performance evaluation, to monitor an IVR system when in production and to use the evaluation to improve the performance.
DISCLOSURE OF THE INVENTION
A method of controlling system performance in an interactive voice response system, said voice response system including a voice device driver, a voice segment stored in a file in a directory in a standard operating system format; a buffer for storing the voice segment prior to sending to the voice device driver, a plurality of voice channels for output of the voice segment, said method comprising the steps of: requesting a sequence of voice blocks be sent to a buffer, said sequence being one of a plurality of sequences making up a voice segment; determining the number of voice blocks sent from the file to the buffer; requesting the device driver to initiate play of the sequence; calculating the play period of the sequence; calculating a next underrun time when the sequence will finish playing based on the initial request time and the play period; calculating the margin period between the calculated next underrun time and the actual time after a further sequence of voice data blocks is sent to the buffer in response to a device driver request for said further sequence of voice data blocks; and shutting down a telephony channel when the margin period exceeds a defined threshold, to attempt a reduction in the application load, and to bring back to within acceptable limits.
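The timing steps of the method can be sketched in a few lines of Python. This is only an illustrative sketch: the class name, field names and the use of seconds as the unit are assumptions made here and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass
class SequenceTiming:
    """Timing data for one sequence of voice blocks (hypothetical structure)."""
    request_time: float     # seconds; when the device driver was asked to start play
    blocks_sent: int        # number of voice blocks sent from the file to the buffer
    block_play_time: float  # seconds of audio in one voice block (a stored constant)

    def play_period(self) -> float:
        # Play period = number of blocks x time to play one block.
        return self.blocks_sent * self.block_play_time

    def next_underrun_time(self) -> float:
        # The moment the sequence finishes playing; if no further data is
        # in the buffer by then, the caller hears a gap.
        return self.request_time + self.play_period()

    def margin(self, refill_time: float) -> float:
        # Margin between the calculated next underrun time and the actual
        # time at which the further sequence has been sent to the buffer.
        return self.next_underrun_time() - refill_time


timing = SequenceTiming(request_time=10.0, blocks_sent=8, block_play_time=0.5)
print(timing.next_underrun_time())  # 14.0
print(timing.margin(13.25))         # 0.75 (buffer refilled before play ran out)
```

A negative result from `margin` would mean the further sequence arrived after the previous one had already finished playing.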
The margin between the next underrun time and the time when the device driver requests a further chunk of voice is called the ‘Underrun Margin Time’ or UMT and is an important performance evaluation metric. As one increases the number of channels available on an IVR system, the overall performance of the IVR system degrades because the processor must share its resources between an increased number of channels. The UMT is shown against the number of channels in FIG. 1A, and it can be seen that the UMT decreases as the number of channels increases. While the UMT is positive there is no underrun and the system is operating effectively. When the UMT is zero there is no underrun but the system has no margin. When the UMT is negative, underrun is occurring and there are regular delays when the voice segments are played. So that the IVR system is not overloaded, it is best to choose the number of channels so that the UMT is positive and not close to zero.
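The three UMT cases described above map directly onto a small classification function. The sketch below is illustrative only; the function name and the state labels are assumptions, not terms from the specification.

```python
def classify_umt(umt_seconds: float) -> str:
    """Map an Underrun Margin Time to one of the three cases in the text."""
    if umt_seconds > 0:
        return "ok"         # no underrun; system is operating with some margin
    if umt_seconds == 0:
        return "no margin"  # no underrun yet, but nothing to spare
    return "underrun"       # voice data arrived late; caller hears a delay


for umt in (0.75, 0.0, -0.5):
    print(umt, classify_umt(umt))
```

In practice a monitor would probably also flag small positive values, since the text recommends keeping the UMT "positive and not close to zero"; what counts as "close" is left to the operator.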
It is also important to see how the performance of the system varies as a function of the number of channels in use. FIG. 1B illustrates a case where the function is not a simple one, and consideration should be given to the shape of the graph; otherwise there could be a tendency to set the proposed number of operational channels to too high a value. The UMT can be seen to ‘plateau’ over a range of channels so that underrun could occur with the same regularity when the channel number was on the ‘plateau’. For instance, an ‘unsafe’ number of channels is marked on the figure, as is an ‘optimum’ number of channels.
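One simple way to turn a measured UMT-versus-channels curve into a channel limit is to apply a minimum-margin floor. The sketch below assumes this approach; it is not the selection procedure of the specification, and the function name, the `min_margin` parameter and the sample measurements are all hypothetical. It does not analyse the shape of the curve, so a plateau like the one in FIG. 1B would still need to be inspected by eye.

```python
from typing import Dict, Optional


def largest_safe_channel_count(umt_by_channels: Dict[int, float],
                               min_margin: float) -> Optional[int]:
    """Return the largest channel count whose UMT, and the UMT of every
    smaller measured count, stays at or above min_margin (seconds)."""
    safe = None
    for channels in sorted(umt_by_channels):
        if umt_by_channels[channels] < min_margin:
            break  # margin too thin from here on; stop at the previous count
        safe = channels
    return safe


# Made-up measurements for illustration only: channels -> UMT in seconds.
measured = {30: 0.9, 60: 0.5, 90: 0.45, 120: 0.44, 150: 0.1, 180: -0.2}
print(largest_safe_channel_count(measured, min_margin=0.3))  # 120
```

With these sample values the floor selects the last point of the flat region, which is exactly the situation where the text advises looking at the graph before committing to a number.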
The time taken to play a single voice block is stored as a constant and the play time of a chunk is calculated by multiplying the number of voice blocks by the time taken to play a single voice block.
Advantageously, the method further comprises the step of calculating the time taken to fetch the sequence of voice blocks. This time is called the Play Latency Time (PLT) in the specification, and an increase in its value indicates a degradation in performance of the IVR system. A suitable way of calculating the time to fetch the sequence comprises: before requesting the voice blocks to be sent to the buffer, storing the current time; and, after the voice blocks have been sent to the buffer and before requesting the device driver to play the voice sequence, subtracting the stored time from the new current time. In this way a consistent measurement can be taken, free of complication from the other processes of the evaluation. The PLT is another useful evaluation metric: in FIG. 1C it can be seen that PLT increases with the number of channels in use. With too many channels, a high PLT would indicate a poorly performing IVR that takes too much time to fetch the voice sequences. The number of channels should be set below the ‘unsafe’ number for a lower PLT and a better performing IVR.
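The PLT measurement amounts to reading a clock immediately before and immediately after the fetch. A minimal sketch follows, assuming the fetch is wrapped in a callable; `timed_fetch`, `fetch` and `clock` are names invented here, and a real IVR would read the blocks through its own I/O path.

```python
import time
from typing import Callable, Tuple


def timed_fetch(fetch: Callable[[], bytes],
                clock: Callable[[], float] = time.monotonic) -> Tuple[bytes, float]:
    """Fetch voice blocks and return them with the Play Latency Time in seconds."""
    start = clock()            # store the current time before requesting the blocks
    data = fetch()             # blocks are copied into the buffer here
    latency = clock() - start  # read the clock again before any play request
    return data, latency


# Usage with a stand-in fetch; a real one would read from the voice segment file.
blocks, plt_seconds = timed_fetch(lambda: b"\x00" * 4096)
print(len(blocks), plt_seconds >= 0.0)  # 4096 True
```

A monotonic clock is used so that wall-clock adjustments cannot produce a negative or inflated latency; the clock is injectable so the calculation can be checked with fixed timestamps.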
The buffer is suitably the operating system buffer, which allows the measurements to be taken at an operating system level. Alternatively, the buffer could be the device driver buffer, which allows a further level of performance evaluation but necessitates some device driver interaction not described in this specification.
The number of voice blocks sent to the buffer is included in the voice block header which accompanies the voice block data; this allows fast acquisition of the length of the voice block. The voice block data is sent to the buffer using operating system I/O routines, as the voice segment is stored as a file in a directory in a standard operating system format. This allows simple implementation of the embodiment.
The play time of the sequence is calculated by multiplying the number of data blocks sent to the buffer by a constant: the play time of a single compressed data block or of a single uncompressed data block, as appropriate.
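As a worked illustration of this multiplication, the sketch below selects one of two per-block constants. The specification does not give the actual per-block durations, so both constant values here are placeholders, and the function and constant names are assumptions.

```python
# Placeholder durations in seconds; the real constants depend on the
# block size and the voice encoding used by the IVR.
COMPRESSED_BLOCK_SECONDS = 0.5
UNCOMPRESSED_BLOCK_SECONDS = 0.125


def sequence_play_time(blocks_sent: int, compressed: bool) -> float:
    """Play time of a sequence = blocks sent x play time of one block."""
    per_block = (COMPRESSED_BLOCK_SECONDS if compressed
                 else UNCOMPRESSED_BLOCK_SECONDS)
    return blocks_sent * per_block


print(sequence_play_time(8, compressed=True))   # 4.0
print(sequence_play_time(8, compressed=False))  # 1.0
```

The same number of blocks plays for longer when compressed, since each compressed block holds more audio; that relationship holds whatever the real constants are.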
REFERENCES:
patent: 5493608 (1996-02-01), O'Sullivan
patent: 5771276 (1998-06-01), Wolf
patent: 0 697 780 (1996-02-01), None
patent: 0 774 853 (1997-05-01), None
Coxhead Philip Randall
Jones Nigel Lewis
Martin David G
Nagchowdhury Sanjay
Herndon Jerry W.
International Business Machines Corp.
Korzuch William
McFadden Susan