Patent
1994-04-12
1997-08-05
MacDonald, Allen R.
395 245, G10L 506, G10L 900
active
056550588
ABSTRACT:
A method for segmenting audio data, comprising speech from a plurality of individual speakers, according to speaker is provided. The method comprises providing individual HMMs for each individual speaker, each individual HMM including at least one state, and constructing a speaker network HMM by connecting the individual HMMs in parallel. The audio data is then divided into segments by determining a most likely sequence of states through the speaker network HMM, each of the segments being associated with one of the individual HMMs. Afterward, the speaker of each of the segments is identified. The segmented data may be used to form an index into the audio data according to speaker.
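The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes, purely for illustration, that each speaker HMM has a single state with a one-dimensional Gaussian output, that each frame is reduced to one scalar feature, and that the parallel network uses a fixed probability of staying with the current speaker. The names viterbi_segment, gaussian_logpdf and stay_prob are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (assumed emission model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_segment(frames, speakers, stay_prob=0.9):
    """Segment frames by speaker using a parallel network of 1-state HMMs.

    frames   -- list of scalar features, one per audio frame (assumption)
    speakers -- dict mapping speaker name to (mean, variance)
    Returns a list of (start, end, speaker) with end exclusive.
    """
    if not frames:
        return []
    names = list(speakers)
    n = len(names)
    log_stay = math.log(stay_prob)
    # Probability of leaving a speaker is shared equally by the others.
    log_switch = math.log((1 - stay_prob) / (n - 1)) if n > 1 else float("-inf")

    def trans(i, j):
        return log_stay if i == j else log_switch

    # Uniform prior over speakers for the first frame.
    score = [-math.log(n) + gaussian_logpdf(frames[0], *speakers[s])
             for s in names]
    back = []  # back[t][j] = best predecessor of state j at frame t + 1
    for x in frames[1:]:
        new_score, ptr = [], []
        for j, s in enumerate(names):
            best_i = max(range(n), key=lambda i: score[i] + trans(i, j))
            new_score.append(score[best_i] + trans(best_i, j)
                             + gaussian_logpdf(x, *speakers[s]))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)

    # Trace back the most likely state sequence.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Collapse runs of the same state into speaker-labelled segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start, t, names[path[start]]))
            start = t
    return segments


if __name__ == "__main__":
    models = {"A": (0.0, 1.0), "B": (5.0, 1.0)}  # made-up parameters
    print(viterbi_segment([0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1, 0.0], models))
```

The returned (start, end, speaker) tuples are the kind of per-speaker index into the audio that the abstract mentions. A realistic system would use multi-dimensional acoustic features and trained, possibly multi-state, speaker models.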
REFERENCES:
Gish et al., "Segregation of Speakers for Speech Recognition and Speaker Identification," Proc. Int. Conf. Acoustics, Speech and Signal Processing, May 1991, vol. 2, pp. 873-876.
Siu et al., "An Unsupervised Sequential Learning Algorithm for the Segmentation of Speech Waveforms with Multiple Speakers," Proc. Int. Conf. Acoustics, Speech and Signal Processing, Mar. 1992, vol. 2, pp. 189-192.
Sugiyama et al., "Speech Segmentation and Clustering Based on Speaker Features," Proc. Int. Conf. Acoustics, Speech and Signal Processing, Apr. 1993, vol. 2, pp. 395-398.
Matsui et al., "Comparison of Text-Independent Speaker Recognition Methods Using VQ-Distortion and Discrete/Continuous HMMs," Proc. Int. Conf. Acoustics, Speech and Signal Processing, Mar. 1992, vol. 2, pp. 157-160.
Balasubramanian Vijay
Chen Francine R.
Chou Philip A.
Kimber Donald G.
Poon Alex D.
Dorvic Richemond
Hurt Tracy L.
Jacobs R. Christine
MacDonald Allen R.
Xerox Corporation
TITLE:
Segmentation of audio data for indexing of conversational speech
Profile ID: LFUS-PAI-O-1080865