Patent
1997-08-13
1999-11-02
Hudspeth, David R.
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
704/226, 379/406, G10L 9/18
active
059787635
DESCRIPTION:
BRIEF SUMMARY
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to voice activity detection.
2. Related Art
There are many automated systems that depend on the detection of speech for operation, for instance, automated speech systems and cellular radio coding systems. Such systems monitor transmission paths from users' equipment for the occurrence of speech and, when speech occurs, take appropriate action. Unfortunately, transmission paths are rarely free from noise. Systems which are arranged simply to detect activity on the path may therefore incorrectly take action if there is noise present.
The usual noise that is present is line noise (i.e. noise that is present irrespective of whether or not a signal is being transmitted) and background noise from a telephone conversation, such as a dog barking, the sound of the television, the noise of a car's engine etc.
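The problem described above can be illustrated with a minimal sketch of a detector that reacts to any activity on the path. This is not the patent's method: the function names, the energy threshold and all sample values are hypothetical, chosen only to show that a loud background sound is flagged in the same way as speech.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def naive_activity(frames, threshold=0.01):
    """Flag every frame whose energy exceeds a fixed threshold (assumed value)."""
    return [frame_energy(f) > threshold for f in frames]

# Hypothetical frames; the amplitudes are illustrative only.
silence = [0.0] * 8
line_noise = [0.02, -0.02] * 4                            # low-level line noise
dog_bark = [0.6, -0.5, 0.7, -0.6, 0.5, -0.4, 0.3, -0.2]   # background noise, not speech
speech = [0.3, -0.4, 0.5, -0.3, 0.4, -0.5, 0.3, -0.2]

flags = naive_activity([silence, line_noise, dog_bark, speech])
print(flags)  # the background-noise frame is flagged just like the speech frame
```

A fixed threshold rejects the low-level line noise here, but it cannot tell the bark from the speech, which is the false-trigger case the text describes.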
Another source of noise in communications systems is echo. For instance, echoes in a public switched telephone network (PSTN) are essentially caused by electrical and/or acoustic coupling, e.g. at the four-wire to two-wire interface of a conventional exchange box, or by acoustic coupling in a telephone handset from earpiece to microphone. The acoustic echo is time variant during a call due to variation of the air path, i.e. the talker altering the position of their head between the microphone and the loudspeaker. Similarly, in telephone kiosks, the interior of the kiosk has a limited damping characteristic and is reverberant, which results in resonant behaviour. Again, this causes the acoustic echo path to vary if the talker moves around the kiosk, or indeed with any air movement. Acoustic echo is becoming a more important issue due to the increased use of hands-free telephones. The effect of the overall echo or reflection path is to attenuate, delay and filter a signal.
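The statement that an echo path attenuates, delays and filters a signal can be sketched as a simple discrete-time model. The function name, the delay, the gain and the filter taps below are all assumptions made for illustration; a real echo path would have its own, generally time-varying, response.

```python
def apply_echo_path(signal, delay, gain, taps):
    """Return a modelled echo of `signal`.

    The signal is filtered by the FIR coefficients `taps`, scaled by `gain`
    (attenuation when gain < 1) and delayed by `delay` samples. The output
    is truncated to the input length.
    """
    n = len(signal)
    # Filter: direct-form FIR convolution, truncated to n samples.
    filtered = [
        sum(taps[k] * signal[i - k] for k in range(len(taps)) if i - k >= 0)
        for i in range(n)
    ]
    # Attenuate and delay.
    echo = [0.0] * n
    for i in range(delay, n):
        echo[i] = gain * filtered[i - delay]
    return echo

# A unit impulse as the outgoing signal makes the echo path response visible.
outgoing = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
echo = apply_echo_path(outgoing, delay=2, gain=0.5, taps=[0.75, 0.25])
print(echo)  # [0.0, 0.0, 0.375, 0.125, 0.0, 0.0]
```

With an impulse as input, the output is the impulse response of the modelled path: nothing for two samples (the delay), then the filter taps scaled by the gain.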
The echo path is dependent on the line, switching route and phone type. This means that the transfer function of the reflection path can vary between calls since any of the line, switching route and the handset may change from call to call as different switch gear will be selected to make the connection.
Various techniques are known to improve the echo control in human-to-human speech communications systems. There are three main techniques. Firstly, insertion losses may be added into the talker's transmission path to reduce the level of the outgoing signal. However, the insertion losses may cause the received signal to become intolerably low for the listener. Secondly, echo suppressors operate on the principle of detecting signal levels in the transmitting and receiving paths and then comparing the levels to determine how to operate switchable insertion loss pads. A high attenuation is placed in the transmit path when speech is detected on the receive path. Echo suppressors are usually used on longer delay connections, such as international telephony links, where suitable fixed insertion losses would be insufficient.
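The echo suppressor principle described above can be sketched per frame as follows. This is only an illustration under stated assumptions: the decision rule (receive level above a threshold and at least as high as the transmit level), the threshold of -40 dB and the 30 dB pad loss are hypothetical values, not figures from the source.

```python
import math

def level_db(frame):
    """Mean power of a frame in dB, floored to avoid the log of zero."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))

def suppress_frame(tx_frame, rx_frame, rx_threshold_db=-40.0, pad_loss_db=30.0):
    """Switch a loss pad into the transmit path when the receive path is active.

    Assumed rule: insert the pad when the receive level exceeds the threshold
    and is at least as high as the transmit level.
    """
    rx_level = level_db(rx_frame)
    tx_level = level_db(tx_frame)
    if rx_level > rx_threshold_db and rx_level >= tx_level:
        gain = 10.0 ** (-pad_loss_db / 20.0)  # convert dB loss to linear gain
        return [gain * x for x in tx_frame]
    return list(tx_frame)

far_end_speech = [0.5, -0.5, 0.5, -0.5]   # activity on the receive path
weak_echo = [0.01, -0.01, 0.01, -0.01]    # low-level signal on the transmit path
quiet_line = [0.0, 0.0, 0.0, 0.0]

padded = suppress_frame(weak_echo, far_end_speech)   # pad switched in
untouched = suppress_frame(weak_echo, quiet_line)    # pad left out
```

Because the pad attenuates everything in the transmit path, a rule like this also attenuates the near-end talker whenever the far end is louder, which is a known limitation of switched-loss approaches.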
Thirdly, echo cancellers are voice-operated devices which use adaptive signal processing to reduce or eliminate echoes by estimating an echo path transfer function. An outgoing signal is fed into the device and the resulting output signal is subtracted from the received signal. Provided that the model is representative of the real echo path, the echo should theoretically be cancelled. However, echo cancellers suffer from stability problems and are computationally expensive. Echo cancellers are also very sensitive to noise bursts during training.
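As an illustration of the echo canceller principle, the sketch below uses a normalised least-mean-squares (NLMS) adaptive filter. The source does not name an adaptation algorithm, so NLMS is an assumption here, as are the function name, the tap count, the step size and the synthetic echo path (a gain of 0.5 delayed by two samples).

```python
import random

def nlms_echo_cancel(outgoing, received, num_taps=4, step=0.5, eps=1e-8):
    """Estimate the echo path from `outgoing` and subtract the estimate from `received`.

    Returns the residual signal and the final filter weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent outgoing samples, newest first
    residual = []
    for x, d in zip(outgoing, received):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # received signal minus modelled echo
        norm = sum(h * h for h in history) + eps
        # NLMS update: move the weights along the input, scaled by the error.
        weights = [w + step * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights

# Synthetic test: the received signal is a pure echo of the outgoing signal.
rng = random.Random(0)
outgoing = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
received = [0.0, 0.0] + [0.5 * x for x in outgoing[:-2]]

residual, weights = nlms_echo_cancel(outgoing, received)
```

In this noise-free case the weight at index 2 should settle near 0.5 and the residual should fall towards zero. Adding a noise burst to `received` during adaptation would perturb the weights, which is one way to see the sensitivity to noise during training mentioned above.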
One example of an automated speech system is the telephone answering machine, which records messages left by a caller. Generally, when a user calls up an automated speech system, a prompt is played to the user, and the prompt usually requires a reply. Thus an outgoing signal from the speech system is passed along a transmission line to the loudspeaker of a user's telephone. The user then provides a response to the prompt, which is passed to the speech system, which then takes appropriate action.
It has been prop
REFERENCES:
patent: 4192979 (1980-03-01), Jankowski
patent: 4410763 (1983-10-01), Strawczynski et al.
patent: 4897832 (1990-01-01), Suzuki et al.
patent: 4914692 (1990-04-01), Hartwell et al.
patent: 5125024 (1992-06-01), Gokcen et al.
patent: 5155760 (1992-10-01), Johnson et al.
patent: 5434916 (1995-07-01), Hasegawa
patent: 5475791 (1995-12-01), Schalk et al.
patent: 5577097 (1996-11-01), Meek
patent: 5619566 (1997-04-01), Fogel
patent: 5765130 (1998-06-01), Nguyen
Harry Newton, "Newton's Telecom Dictionary," Flatiron Publishing, Nov. 1994, pp. 462 and 519.
Thomas W. Parsons, "Voice and Speech Processing," McGraw-Hill, Inc., New York, 1987, pp. 125-127, 293-297.
Patent Abstracts of Japan, vol. 013, No. 468 (E-834), Oct. 23, 1989 & JP, A,01 183232 (OKI Electric Ind Co Ltd), Jul. 21, 1989, see abstract.
IEEE Transactions on Communications, vol. COM-20, No. 1, Feb. 1972, US, XP000565246 Fariello: "A novel digital speech detector for improving effective satellite capacity" see paragraph 1.
British Telecommunications public limited company
Hudspeth David R.
Storm Donald L.