Training process for the classification of a perceptual signal
Patent
1998-03-19
2000-09-12
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
704/232, 704/202, 706/25, 348/742, G01L 15/06
active
061190837
DESCRIPTION:
BRIEF SUMMARY
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the classification of data which can be used to train a trainable process. It is of application to the assessment of signals carried by a telecommunications system, for example to assess the condition of telecommunications systems whilst in use. Embodiments will be described of application to audio signals carrying speech, and to video signals.
2. Related Art
Signals carried over telecommunications links can undergo considerable transformations, such as digitisation, data compression, data reduction, amplification, and so on. All of these processes can distort the signals. For example, in digitising a waveform whose amplitude is greater than the maximum digitisation value, the peaks of the waveform will be converted to a flat-topped form (a process known as peak clipping). This adds unwanted harmonics to the signal. Distortions can also be caused by electromagnetic interference from external sources.
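The effect of peak clipping on the spectrum can be illustrated with a short numerical sketch. The following is only an illustration under stated assumptions, not part of the patent: the sample count, the tone frequency, the amplitude of 1.5 and the clipping limit of 1.0 are arbitrary choices, and the function names `clip` and `harmonic_magnitude` are hypothetical. A pure sine has no third harmonic; after symmetric clipping a clearly measurable third harmonic appears.

```python
import math

def clip(samples, limit):
    """Peak-clip each sample to the range [-limit, +limit]."""
    return [max(-limit, min(limit, s)) for s in samples]

def harmonic_magnitude(samples, k):
    """Amplitude of the sinusoidal component in DFT bin k.

    Normalised so that a unit-amplitude sine lying exactly in bin k gives 1.0.
    """
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

N = 1024        # number of samples (assumed)
CYCLES = 8      # whole cycles of the tone within the N samples (assumed)

# A sine whose amplitude (1.5) exceeds the maximum digitisation value (1.0).
sine = [1.5 * math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
clipped = clip(sine, 1.0)

third_before = harmonic_magnitude(sine, 3 * CYCLES)
third_after = harmonic_magnitude(clipped, 3 * CYCLES)

print(f"third harmonic before clipping: {third_before:.6f}")
print(f"third harmonic after clipping:  {third_after:.6f}")
```

Because the clipping here is symmetric about zero, the added energy falls on odd harmonics; asymmetric clipping would also add even harmonics.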
Many of the distortions introduced by the processes described above are non-linear, so that a simple test signal may not be distorted in the same way as a complex waveform such as speech, or may not be distorted at all. For a telecommunications link carrying data it is possible to test the link using all possible data characters; e.g. the two characters 1 and 0 for a binary link, the twelve tone-pairs used in DTMF (dual tone multi-frequency) systems, or the range of "constellation points" used in a QAM (Quadrature Amplitude Modulation) system. However, an analogue signal does not consist of a limited number of well-defined signal elements, but is a continuously varying signal. For example, a speech signal's elements vary according not only to the content of the speech (and the language used) but also to the physiological and psychological characteristics of the individual talker, which affect characteristics such as pitch, volume, characteristic vowel sounds etc.
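To make the contrast concrete, the finite signal set of a DTMF link can be enumerated exhaustively, which is not possible for speech. The sketch below is illustrative only: the row and column frequencies are the standard DTMF keypad values, while the sample rate, tone duration, amplitudes and the names `dtmf_tone_pairs` and `synthesise` are assumptions introduced for this example.

```python
import math

# Standard DTMF low-group (row) and high-group (column) frequencies in Hz.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")

def dtmf_tone_pairs():
    """Return {key: (row_hz, col_hz)} for the twelve keypad tone-pairs."""
    return {
        key: (ROW_HZ[r], COL_HZ[c])
        for r, row in enumerate(KEYPAD)
        for c, key in enumerate(row)
    }

def synthesise(key, duration_s=0.05, sample_rate=8000):
    """Generate samples of one tone-pair as the sum of two sinusoids.

    duration_s and sample_rate are assumed values chosen for illustration.
    """
    low, high = dtmf_tone_pairs()[key]
    n = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(n)
    ]

# An exhaustive test set: every signal element such a link can carry.
test_sequence = {key: synthesise(key) for key in dtmf_tone_pairs()}
print(len(test_sequence), "tone-pairs in the test set")
```

No comparable finite table exists for speech, which is why a link carrying speech cannot be characterised by stepping through a list of signal elements.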
It is known to test telecommunications equipment by running test sequences using samples of the type of signal to be carried. Comparison between the test sequence as modified by the equipment under test and the original test sequence can be used to identify distortion introduced by the equipment under test. However, these arrangements require the use of a pre-arranged test sequence, which means they cannot be used on live telecommunications links--that is, links currently in use--because the test sequence would interfere with the traffic being carried and be perceptible to the users, and also because the live traffic itself (whose content cannot be predetermined) would be detected by the test equipment as distortion of the test signal.
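The comparison step in such an intrusive test can be sketched as follows. The text does not say which comparison measure the known arrangements use, so this example assumes a simple signal-to-distortion ratio in decibels, and it assumes the two sequences are already time-aligned and of equal length. The function name `signal_to_distortion_db` is hypothetical.

```python
import math

def signal_to_distortion_db(reference, degraded):
    """Ratio of reference power to error power, in dB.

    Assumes `degraded` is the reference test sequence after passing through
    the equipment under test, sample-aligned with `reference`.
    """
    if len(reference) != len(degraded):
        raise ValueError("sequences must have the same length")
    signal_power = sum(r * r for r in reference)
    if signal_power == 0:
        raise ValueError("reference sequence carries no energy")
    error_power = sum((d - r) ** 2 for r, d in zip(reference, degraded))
    if error_power == 0:
        return math.inf  # no measurable distortion
    return 10 * math.log10(signal_power / error_power)

# Hypothetical test sequence and a version attenuated to 90% by the equipment.
reference = [math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
degraded = [0.9 * r for r in reference]

print(f"{signal_to_distortion_db(reference, degraded):.2f} dB")
```

The function needs the original sequence as an argument, which is exactly the dependency that rules this approach out on a live link: the measurement point has no copy of what the talker actually sent.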
In order to carry out tests on equipment in use, without interfering with the signals being carried by the equipment (so-called non-intrusive testing), it is desirable to carry out the tests using the live signals themselves as the test signals. However, a problem with using a live signal as the test signal is that there is no instantaneous way of obtaining, at the point of measurement, a sample of the original signal. Any means by which the original signal might be transmitted to the measurement location would itself be subject to distortions similar to those affecting the link under test.
The present Applicant's co-pending International Patent applications WO96/06495 and WO96/06496 (both published on Feb. 29th 1996) propose two possible solutions to this problem. WO96/06495 describes the analysis of certain characteristics of speech which are talker-independent in order to determine how the signal has been modified by the telecommunications link. It also describes the analysis of certain characteristics of speech which vary in relation to other characteristics, not themselves directly measurable, in a way which is consistent between individual talkers, and which may therefore be used to derive information about these other characteristics. For example, the spectral content of an unvoiced fricative varies with volume (amplitude), but in a manner independent of the individual talker.
REFERENCES:
patent: 4860360 (1989-08-01), Boggs
patent: 4972484 (1990-11-01), Theile et al.
patent: 5301019 (1994-04-01), Citta
patent: 5621854 (1997-04-01), Hollier
patent: 5630019 (1997-05-01), Kochi
Yogeshwar et al. (A New Perceptual Model for Video) Rutgers University, NJ. pp. 188-193, 1990.
Bellini et al. (Analog Fuzzy Implementation of a Perceptual Classifier for Videophone Sequences) Universita di Bologna, Italy, pp. 787-794, Jul. 1996.
IEEE Int Conf on Communications--Session 33.3, vol. 2, Jun. 7-10, 1987, Seattle, US, pp. 1164-1171, Quincy, "Prolog-Based Expert Pattern Recognition System Shell for Technology Independent, User-Oriented Classification of Voice Transmission Quality".
IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Jun. 1-2, 1989, Victoria, CA, Kubichek et al, "Speech Quality Assessment Using Expert Pattern Recognition Techniques".
Patent Abstracts of Japan, vol. 17, No. 202 (E-1353), Apr. 20, 1993 & JP-A-04 345327 (Nippon Telegr&Teleph Corp), Dec. 1, 1992.
Beerends, "A Perceptual Audio Quality Measure Based on a Psychoacoustic Sound Representation", J. Audio Eng. Soc., vol. 40, No. 12, 1992, pp. 963-978.
Brandenburg et al, "NMR and Masking Flag", Evaluation of Quality Using Perceptual Criteria, AES 11th International Conference, pp. 169-179, 1992.
Zwicker et al, "Audio Engineering and Psychoacoustics: Matching Signals to the Final Receiver, the Human Auditory System", J. Audio Eng. Soc., vol. 39, No. 3, 1991, pp. 115-126.
Irii et al, "Objective Measurement Method for Estimating Speech Quality of Low-Bit-Rate Speech Coding", NTT Review, vol. 3, No. 5, Sep. 1991, pp. 79-87.
Dimolitsas et al, "Objective Speech Distortion Measures and Their Relevance to Speech Quality Assessments", IEE Proceedings, vol. 136, Pt. 1, No. 5, Oct. 1989, pp. 317-324.
Herre et al, "Analysis Tool for Realtime Measurements Using Perceptual Criteria", AES 11th International Conference, 1992.
Kalittsev, "Estimate of the Information Content of Speech Signals", 1298 Telecommunications and Radio Engineering 47 (1992), Jan., No. 1, New York, US, pp. 11-15.
Moore et al, "Suggested Formulae For Calculating Auditory-Filter Bandwidths and Excitation Patterns", J. Acoust. Soc. Am, 74 (3), Sep. 1983, pp. 750-753.
Gierlich, "New Measurement Methods for Determining the Transfer Characteristics of Telephone Terminal Equipment", Proceedings of 1992, IEEE International Symposium on Circuits and Systems, May 10-13, 1992, San Diego (US), New York (US), vol. 4, pp. 2069-2072.
Sobolev, "Estimation of Speech Signal Transmission Quality from Measurements of Its Spectral Dynamics", Telecommunications and Radio Engineering, vol. 47, No. 1, Jan. 1992, Washington, US, pp. 16-21, XP000316414.
Gray Philip
Hollier Michael P
Azad Abul K.
British Telecommunications public limited company
Hudspeth David R.