Reexamination Certificate
2005-02-24
2011-10-11
Chawan, Vijay B. (Department: 2626)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S206000, C704S208000, C704S211000, C704S214000, C704S223000, C704S222000, C704S221000, C704S219000, C704S216000, C704S227000, C704S245000, C704S256000, C704S258000, C704S266000, C704S270000, C704S500000, C700S094000, C084S609000, C084S622000
active
08036884
ABSTRACT:
The present invention provides a method, a computer software product, and an apparatus for enabling the determination of speech-related audio data within a record of digital audio data. The method comprises steps for extracting audio features from the record, classifying one or more subsections of the record, and marking at least a part of the record classified as speech. Classification is performed on the basis of the extracted audio features and with respect to at least one predetermined audio class. Extracting the at least one audio feature comprises: partitioning the record of digital audio data into adjoining frames; defining, for each frame, a window formed by a sequence of adjoining frames that contains the frame under consideration; determining, for the frame under consideration and at least one further frame of the window, a spectral-emphasis value related to the frequency distribution of the digital audio data in the respective frame; and assigning a presence-of-speech indicator value to the frame under consideration based on an evaluation of the differences between the spectral-emphasis values determined for the frame under consideration and the at least one further frame of the window.
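The feature-extraction steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not define how the spectral-emphasis value is computed, so the sketch assumes a crude time-domain proxy (the share of frame energy that survives a first-difference high-pass filter). The function names, the `half_width` window parameter, and the use of a mean absolute difference as the "evaluation of the differences" are likewise assumptions made only for illustration.

```python
import math


def split_frames(samples, frame_len):
    """Partition the record into adjoining, non-overlapping frames."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def spectral_emphasis(frame):
    """Assumed proxy: energy after a first-difference (high-pass) filter
    divided by total frame energy. Larger means more high-frequency content."""
    total = sum(x * x for x in frame)
    if total == 0.0:
        return 0.0
    high = sum((b - a) ** 2 for a, b in zip(frame, frame[1:]))
    return high / total


def speech_indicator(emphasis, half_width=2):
    """For each frame, take a window of neighbouring frames and return the
    mean absolute difference between its emphasis value and the others'."""
    result = []
    for i, value in enumerate(emphasis):
        lo = max(0, i - half_width)
        hi = min(len(emphasis), i + half_width + 1)
        others = [emphasis[j] for j in range(lo, hi) if j != i]
        if not others:
            result.append(0.0)
        else:
            result.append(sum(abs(value - o) for o in others) / len(others))
    return result


def indicator_for(samples, frame_len, half_width=2):
    """Run the three steps end to end on a list of samples."""
    frames = split_frames(samples, frame_len)
    emphasis = [spectral_emphasis(f) for f in frames]
    return speech_indicator(emphasis, half_width)


if __name__ == "__main__":
    rate, frame_len = 8000, 400  # 50 ms frames at 8 kHz (illustrative values)
    tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
    print(max(indicator_for(tone, frame_len)))
```

Under these assumptions a steady tone yields an indicator close to zero, because every frame has nearly the same emphasis value, while material whose spectral balance changes from frame to frame yields larger values. A separate classifier or threshold, not shown here, would then decide which frames to mark as speech.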
REFERENCES:
patent: 4797926 (1989-01-01), Bronson et al.
patent: 5008941 (1991-04-01), Sejnoha
patent: 5574823 (1996-11-01), Hassanein et al.
patent: 5664052 (1997-09-01), Nishiguchi et al.
patent: 5680508 (1997-10-01), Liu
patent: 5712953 (1998-01-01), Langs
patent: 5761642 (1998-06-01), Suzuki et al.
patent: 5808225 (1998-09-01), Corwin et al.
patent: 5825979 (1998-10-01), Tsutsui et al.
patent: 5828994 (1998-10-01), Covell et al.
patent: 5933803 (1999-08-01), Ojala
patent: 6041297 (2000-03-01), Goldberg
patent: 6377915 (2002-04-01), Sasaki
patent: 6424938 (2002-07-01), Johansson et al.
patent: 6570991 (2003-05-01), Scheirer et al.
patent: 6678654 (2004-01-01), Zinser et al.
patent: 6678655 (2004-01-01), Hoory et al.
patent: 6836761 (2004-12-01), Kawashima et al.
patent: 6859773 (2005-02-01), Breton
patent: 6873953 (2005-03-01), Lennig
patent: 7363218 (2008-04-01), Jabri et al.
patent: 2003/0101050 (2003-05-01), Khalil et al.
patent: 2003/0236663 (2003-12-01), Dimitrova et al.
patent: 2006/0080090 (2006-04-01), Ramo et al.
patent: 2007/0163425 (2007-07-01), Tsui et al.
patent: 2008/0201150 (2008-08-01), Tamura et al.
patent: 2009/0089063 (2009-04-01), Meng et al.
patent: 2009/0171485 (2009-07-01), Sim et al.
patent: 2010/0042408 (2010-02-01), Malah et al.
patent: 2010/0057476 (2010-03-01), Sudo et al.
patent: 2010/0198587 (2010-08-01), Ramabadran et al.
M. Heldner: “Spectral Emphasis as an Additional Source of Information in Accent Detection” Prosody in Speech Recognition and Understanding, ISCA Prosody 2001, [Online] Oct. 22, 2001-Oct. 24, 2001, XP002290439.
Han K-P et al: “Genre Classification System of TV Sound Signals Based on a Spectrogram Analysis” IEEE Transactions on Consumer Electronics, IEEE Inc. New York, US, vol. 44, No. 1, Feb. 1, 1998, pp. 33-42, XP000779248.
El-Maleh K et al: “Speech/Music Discrimination for Multimedia Applications” 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. (ICASSP). Istanbul, Turkey, Jun. 5-9, 2000, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New York, NY: IEEE, US, vol. 4 of 6, Jun. 5, 2000, pp. 2445-2448, XP000993729.
Joseph P. Campbell, Jr., “Speaker Recognition: A Tutorial”, Proceedings of the IEEE, vol. 85, No. 9, Sep. 1997, pp. 1437-1462.
Lie Lu, et al., “Content Analysis for Audio Classification and Segmentation”, IEEE Transactions on Speech and Audio Processing, vol. 10, No. 7, Oct. 2002, pp. 504-516.
Jitendra Ajmera, et al., “Robust HMM-Based Speech/Music Segmentation”, ICASSP, 2002, 4 pages.
Lie Lu, et al., “A Robust Audio Classification and Segmentation Method”, Proceedings of the Ninth ACM International Conference on Multimedia, 2001, 9 pages.
Lam Yin Hay
Sola I Caros Josep Maria
Chawan Vijay B.
Colucci Michael
Oblon, Spivak, McClelland, Maier & Neustadt, L.L.P.
Sony Deutschland GmbH
Identification of the presence of speech in digital audio data