Reexamination Certificate
1999-06-30
2010-06-15
Han, Qi (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
C704S270100, C704S272000
active
07739114
ABSTRACT:
Speakers are automatically identified in an audio (or video) source. The audio information is processed to identify potential segment boundaries. Homogeneous segments are clustered substantially concurrently with the segmentation routine, and a cluster identifier is assigned to each identified segment. A segmentation subroutine identifies potential segment boundaries using the BIC model selection criterion. A clustering subroutine uses a BIC model selection criterion to assign a cluster identifier to each of the identified segments. If the difference of BIC values for each model is positive, the two clusters are merged.
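The BIC merge test described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes one-dimensional features, a single Gaussian per segment, and the common penalty form (lambda / 2) * k * log(n). The names `delta_bic_merge`, `should_merge`, `is_boundary`, and the weight `lam` are hypothetical. Following the abstract's sign convention, a positive difference favors one shared model (merge), and a negative difference at a candidate split favors a boundary.

```python
import math


def _ml_variance(xs, floor=1e-8):
    """Maximum-likelihood (biased) variance, floored so log() stays finite."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return max(var, floor)


def delta_bic_merge(a, b, lam=1.0):
    """BIC(one shared Gaussian) minus BIC(two separate Gaussians), 1-D case.

    Assumption: each segment is modelled by a single 1-D Gaussian, so the
    two-model hypothesis has two extra parameters (one mean, one variance).
    """
    a, b = list(a), list(b)
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    # Gaussian log-likelihoods at the ML estimates; constants shared by
    # both hypotheses cancel in the difference and are omitted.
    ll_one = -0.5 * n * math.log(_ml_variance(a + b))
    ll_two = (-0.5 * n_a * math.log(_ml_variance(a))
              - 0.5 * n_b * math.log(_ml_variance(b)))
    # Penalty saved by using one model instead of two (2 extra parameters).
    penalty = 0.5 * lam * 2 * math.log(n)
    return (ll_one - ll_two) + penalty


def should_merge(a, b, lam=1.0):
    """Merge two clusters when the BIC difference is positive."""
    return delta_bic_merge(a, b, lam) > 0


def is_boundary(window, i, lam=1.0):
    """Treat index i as a segment boundary when two models win."""
    return delta_bic_merge(window[:i], window[i:], lam) < 0


if __name__ == "__main__":
    quiet = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0, 0.2, -0.3]
    similar = [0.15, -0.25, 0.1, 0.25, -0.05, 0.05, -0.15, 0.2]
    shifted = [x + 5.0 for x in quiet]
    print(should_merge(quiet, similar))
    print(should_merge(quiet, shifted))
    print(is_boundary(quiet + shifted, len(quiet)))
```

A real system would use multivariate feature vectors (for example cepstral coefficients) with full covariance matrices, in which case the log-variance terms become log-determinants and the parameter count grows with the feature dimension.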
REFERENCES:
patent: 5659662 (1997-08-01), Wilcox et al.
patent: 5897616 (1999-04-01), Kanevsky et al.
patent: 5930748 (1999-07-01), Kleider et al.
patent: 6345252 (2002-02-01), Beigi et al.
patent: 6345253 (2002-02-01), Viswanathan
patent: 6421645 (2002-07-01), Beigi et al.
patent: 6424946 (2002-07-01), Tritschler et al.
patent: 07-287969 (1995-10-01), None
patent: 2000-298498 (2000-10-01), None
Scott Shaobing Chen et al., “Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion,” Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, Feb. 8-11, 1998.
Scott Shaobing Chen et al., “Clustering via the Bayesian Information Criterion with Applications in Speech Recognition,” Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 645-648, May 12-15, 1998.
S. Dharanipragada et al., “Experimental Results in Audio Indexing,” Proc. ARPA SLT Workshop, (Feb. 1996).
L. Polymenakos et al., “Transcription of Broadcast News—Some Recent Improvements to IBM's LVCSR System,” Proc. ARPA SLT Workshop, (Feb. 1996).
R. Bakis, “Transcription of Broadcast News Shows with the IBM Large Vocabulary Speech Recognition System,” Proc. ICASSP98, Seattle, WA (1998).
H. Beigi et al., “A Distance Measure Between Collections of Distributions and its Application to Speaker Recognition,” Proc. ICASSP98, Seattle, WA (1998).
S. Chen, “Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion,” Proceedings of the Speech Recognition Workshop (1998).
S. Chen et al., “Clustering via the Bayesian Information Criterion with Applications in Speech Recognition,” Proc. ICASSP98, Seattle, WA (1998).
S. Chen et al., “IBM's LVCSR System for Transcription of Broadcast News Used in the 1997 Hub4 English Evaluation,” Proceedings of the Speech Recognition Workshop (1998).
S. Dharanipragada et al., “A Fast Vocabulary Independent Algorithm for Spotting Words in Speech,” Proc. ICASSP98, Seattle, WA (1998).
J. Navratil et al., “An Efficient Phonotactic-Acoustic System for Language Identification,” Proc. ICASSP98, Seattle, WA (1998).
G. N. Ramaswamy et al., “Compression of Acoustic Features for Speech Recognition in Network Environments,” Proc. ICASSP98, Seattle, WA (1998).
S. Chen et al., “Recent Improvements to IBM's Speech Recognition System for Automatic Transcription of Broadcast News,” Proceedings of the Speech Recognition Workshop (1999).
S. Dharanipragada et al., “Story Segmentation and Topic Detection in the Broadcast News Domain,” Proceedings of the Speech Recognition Workshop (1999).
C. Neti et al., “Audio-Visual Speaker Recognition for Video Broadcast News,” Proceedings of the Speech Recognition Workshop (1999).
Foote et al., “Finding Presentations in Recorded Meetings Using Audio and Video Features,” Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 6, pp. 3029-3032 (Mar. 15, 1999).
Nakagawa et al., “Segmentation of Continuous Speech by HMM and Bayesian Probability,” Transactions of the Institute of Electronics, Information and Communication Engineers, D-II, vol. J72 D-II, No. 1, pp. 1-10 (Jan. 1989).
Wilcox et al., “Segmentation of Speech Using Speaker Identification,” Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. I-161-I-164 (Apr. 19, 1994).
Ozawa Kazunori, “Voice Signal Encoding and Decoding Method, Voice Signal Encoder and Voice Signal Decoder,” Patent Abstracts of Japan, Publication No. 01-257999 (Oct. 16, 1989).
Hanada Eisuke, “Voice Encoding System,” Patent Abstracts of Japan, Publication No. 03-198099 (Aug. 29, 1991).
Eberman et al., “Signal Segmentalization Method Basing Cluster Constitution,” Patent Abstracts of Japan, Publication No. 10-105187 (Apr. 24, 1998).
Patent Abstracts of Japan, Publication No. 59-111699 (Jun. 27, 1984).
Chen Scott Shaobing
Tritschler Alain Charles Louis
Viswanathan Mahesh
Han Qi
International Business Machines Corporation
Ryan, Mason & Lewis, LLP
Methods and apparatus for tracking speakers in an audio stream