Reexamination Certificate
1999-09-27
2002-06-11
Korzuch, William (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C084S616000, C381S002000
active
06405163
ABSTRACT:
BACKGROUND OF THE INVENTION
The invention relates to the now very popular field of karaoke entertainment. In karaoke, a (usually amateur) singer performs live in front of an audience with background music. One of the challenges of this activity is obtaining the background music, i.e. removing the original singer's voice while retaining the instruments, so that the amateur singer's voice can replace that of the original singer. A very inexpensive (but somewhat unsophisticated) way to achieve this is to take a stereo recording and assume (usually correctly) that the voice is panned to the center (i.e. that the voice was recorded in mono and added to the left and right channels at equal level). In that case the voice can be significantly reduced by subtracting the left channel from the right channel, which yields a mono recording from which the voice is nearly absent. (Because stereo reverberation is usually added after the mix, a faint reverberated version of the voice remains in the difference signal.) There are several drawbacks to this technique:
1) The output signal is always monophonic. In other words it is not possible using this standard technique to recover a stereo signal from which the voice has been removed.
2) More often than not, other instruments are also panned in the center (bass guitar, bass drum, horns and so on), and the standard technique will also remove them, which is undesirable.
3) The standard method does not allow the voice in the original recording to be extracted or amplified, even though it is sometimes very useful to remove the background instruments from the original recording and retain only the voice (for example, to change the mixing level of the voice or to aid a pitch-extraction system targeted at the voice).
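The standard center-cancellation technique described in this background can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `cancel_center` and the toy sample values are hypothetical and do not come from the patent. It shows both the effect (the center-panned voice cancels) and the first drawback (the result is a single mono signal).

```python
def cancel_center(left, right):
    """Return the mono difference signal right - left, sample by sample."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return [r - l for l, r in zip(left, right)]


# Hypothetical toy mix: a voice panned to the center (equal level in both
# channels) plus a guitar that appears only in the left channel.
voice = [0.5, -0.25, 0.75, 0.0]
guitar = [0.125, 0.25, -0.5, 0.375]
left = [v + g for v, g in zip(voice, guitar)]
right = list(voice)

# The voice cancels exactly; what remains is the (sign-inverted) guitar,
# and only as one mono channel.
mono = cancel_center(left, right)
```

Any other source mixed equally into both channels, such as a bass guitar, would cancel in the same way, which is the second drawback listed above.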
SUMMARY OF THE INVENTION
According to one aspect of the present invention, a phase-vocoder removes the voice or the background instruments from a stereo recording while retaining a stereo output signal. Furthermore, because of the frequency-domain nature of the phase-vocoder, it is possible to more effectively discriminate, based on their frequency contents, the voice from other instruments also panned in the center.
According to a further aspect of the invention, peak frequencies are determined where the magnitude of the frequency domain spectra is at a maximum.
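One simple way to locate such peak frequencies is to look for local maxima in the magnitude spectrum. The sketch below assumes that a bin counts as a peak when it is strictly larger than its two immediate neighbours; that criterion, the function name `find_peaks`, and the sample magnitudes are assumptions made for illustration, not details given in the text.

```python
def find_peaks(magnitudes):
    """Return indices of bins strictly larger than both immediate neighbours."""
    peaks = []
    for k in range(1, len(magnitudes) - 1):
        if magnitudes[k] > magnitudes[k - 1] and magnitudes[k] > magnitudes[k + 1]:
            peaks.append(k)
    return peaks


# Hypothetical magnitude spectrum of one analysis frame (7 bins).
magnitudes = [0.1, 0.9, 0.3, 0.2, 0.7, 0.6, 0.05]
peaks = find_peaks(magnitudes)  # bins 1 and 4 are local maxima
```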
According to another aspect of the invention, a difference spectrum is derived from the frequency domain spectra of the left and right stereo channels at the peak frequencies. An attenuating gain factor is then calculated for each peak frequency as a function of the magnitude of the difference spectrum at that peak frequency. For frequencies of voice signals, or other signals panned to center, the magnitude of the difference spectrum will be much less than that of the left or right channels.
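The text states only that the gain is a function of the difference magnitude; it does not give the function here. The sketch below assumes one plausible choice, the ratio of the difference magnitude to the larger channel magnitude, clipped to 1. The function name `attenuating_gain`, the `eps` guard, and the sample bins are hypothetical.

```python
def attenuating_gain(left_bin, right_bin, eps=1e-12):
    """Assumed gain rule: |L - R| / max(|L|, |R|), limited to the range [0, 1]."""
    diff = abs(left_bin - right_bin)
    ref = max(abs(left_bin), abs(right_bin))
    if ref < eps:
        # Both channels are silent at this bin; leave it untouched.
        return 1.0
    return min(1.0, diff / ref)


# Complex spectral values at a peak bin for three hypothetical sources.
g_center = attenuating_gain(1 + 1j, 1 + 1j)    # identical in both channels
g_left = attenuating_gain(2 + 0j, 0j)          # present in the left channel only
g_partial = attenuating_gain(1 + 0j, 0.5 + 0j) # louder on the left than the right
```

With this rule a center-panned component receives a gain near zero, while a component present in only one channel keeps a gain of one.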
According to another aspect of the invention, a modified spectrum is derived by multiplying the magnitude of the frequency domain spectra by the attenuating gain factor at each peak frequency. The magnitude of the modified spectrum at frequencies for voice, or other signals panned to center, will be small.
According to another aspect of the invention, the attenuation gain is set to unity for frequency components outside the voice range so that non-voice music panned to center is not attenuated.
According to another aspect of the invention, regions of influence are defined about each peak frequency. The magnitude of the frequency spectra within each region of influence is multiplied by the gain factor for the peak frequency.
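The following sketch combines the three preceding aspects: scaling the spectrum by the gain, forcing the gain to unity outside the voice range, and applying each peak's gain across its region of influence. Several details are assumptions, since the summary does not specify them: region boundaries are placed at the midpoint between adjacent peaks, the voice range is taken as 100 Hz to 4000 Hz, and the names `regions_of_influence`, `apply_gains`, `bin_hz`, and `voice_band` are hypothetical.

```python
def regions_of_influence(peaks, num_bins):
    """Split bins into one contiguous (start, end) region per peak, end exclusive.

    Assumed rule: the boundary lies at the midpoint between adjacent peaks.
    """
    regions = []
    start = 0
    for i, p in enumerate(peaks):
        if i + 1 < len(peaks):
            end = (p + peaks[i + 1]) // 2 + 1
        else:
            end = num_bins
        regions.append((start, end))
        start = end
    return regions


def apply_gains(spectrum, peaks, gains, bin_hz, voice_band=(100.0, 4000.0)):
    """Scale every bin in a peak's region by that peak's gain.

    Peaks whose frequency falls outside voice_band get unity gain, so
    center-panned non-voice content is left unattenuated.
    """
    out = list(spectrum)
    regions = regions_of_influence(peaks, len(spectrum))
    for (start, end), p, g in zip(regions, peaks, gains):
        freq = p * bin_hz
        if not (voice_band[0] <= freq <= voice_band[1]):
            g = 1.0
        for k in range(start, end):
            # A real gain scales the magnitude and leaves the phase unchanged.
            out[k] = spectrum[k] * g
    return out


# Hypothetical 7-bin frame with 50 Hz bin spacing: the peak at bin 1 (50 Hz)
# is below the assumed voice range, the peak at bin 4 (200 Hz) is inside it.
spectrum = [2 + 0j] * 7
modified = apply_gains(spectrum, peaks=[1, 4], gains=[0.0, 0.5], bin_hz=50.0)
```

Applying the same gains to the left and right spectra separately would keep two output channels, which is how a stereo result can be retained.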
According to another aspect of the invention, frequencies of voice, or of other signals panned to center, are amplified by utilizing an amplifying gain factor inversely proportional to the magnitude of the gain factor at each peak frequency. For example, the amplifying gain factor can be set equal to the difference of one and the attenuating gain factor.
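The example given in the text, an amplifying gain equal to one minus the attenuating gain, can be sketched as below. The function name `amplifying_gain` and the sample values are hypothetical, and the range check assumes the attenuating gain lies between 0 and 1.

```python
def amplifying_gain(attenuating_gain):
    """Return 1 - g, the complementary gain that keeps center-panned content."""
    if not 0.0 <= attenuating_gain <= 1.0:
        raise ValueError("attenuating gain is assumed to lie in [0, 1]")
    return 1.0 - attenuating_gain


# Hypothetical peak bins: a center-panned voice component (attenuating gain 0)
# and a side-panned instrument component (attenuating gain 1).
voice_bin, voice_g = 1 + 1j, 0.0
instrument_bin, instrument_g = 2 + 0j, 1.0

kept_voice = voice_bin * amplifying_gain(voice_g)
kept_instrument = instrument_bin * amplifying_gain(instrument_g)
```

Here the voice component passes through unchanged and the side-panned instrument is removed, the reverse of the attenuation case.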
Other features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.
REFERENCES:
patent: 5400410 (1995-03-01), Muraki et al.
patent: 5511128 (1996-04-01), Lindemann et al.
patent: 5541999 (1996-07-01), Hirai
patent: 5550920 (1996-08-01), Nomura
patent: 5666424 (1997-09-01), Fosgate et al.
patent: 5719344 (1998-02-01), Pawate
patent: 5727068 (1998-03-01), Karagosian et al.
patent: 5778082 (1998-07-01), Chu et al.
patent: 5890125 (1999-03-01), Davis et al.
patent: 5946352 (1999-08-01), Rowlands et al.
patent: 6021386 (2000-02-01), Davis et al.
patent: 6148086 (2000-11-01), Ciullo et al.
patent: 6311155 (2001-10-01), Vaudrey et al.
International Search Report, ISA/US, Feb. 6, 2001, 6 pages.
Lindemann, “Two Microphone Nonlinear Frequency Domain Beamformer for Hearing Aid Noise Reduction,” in Proc. IEEE ASASP Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, 1995.
Creative Technology Ltd.
Korzuch William
Lerner Martin
Townsend and Townsend and Crew LLP