Reexamination Certificate
1999-04-16
2002-08-13
Chawan, Vijay B (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S246000, C704S245000, C704S251000
active
06434520
ABSTRACT:
BACKGROUND
1. Technical Field
The present application relates generally to a system and method for managing an archive of audio data and, more particularly, to an audio processing system and method for segmenting and indexing audio or multimedia files based on audio information such as speaker identity, background and/or channel, for storage in a database, and an information retrieval system and method which utilizes the indexed audio information to search the database and retrieve desired segments of audio/multimedia files.
2. Description of the Related Art
In general, management of an archive is important for maximizing the potential value of the archive. Database management is especially challenging for owners of audio/multimedia archives due to the increasing use of digital media. Indeed, the continuing increase in consumer use of audio and multimedia recording devices for memorializing various events such as radio and television broadcasts, business meetings, lectures, and courtroom testimony, has resulted in a vast amount of digital information that the consumers desire to maintain in an audio/multimedia archive for subsequent recall.
This increasing volume of digital information compels database owners to continuously seek techniques for efficiently indexing and storing such audio data in their archives in some structured form so as to facilitate subsequent retrieval of desired information. Accordingly, it is desirable to have a system and method for indexing and storing audio data, together with an information retrieval system which provides immediate access to audio data stored in the archive through a description of the content of an audio recording, the identity of speakers in the audio recording, and/or a specification of circumstances surrounding the acquisition of the recording.
SUMMARY OF THE INVENTION
The present application is directed to a system and method for managing a database of audio/multimedia data. In one aspect of the present invention, a system for managing a database of audio data files comprises:
a segmenter for dividing an input audio data file into segments by detecting speaker changes in the input audio data file;
a speaker identifier for identifying a speaker of each segment and assigning at least one identity tag to each segment;
a speaker verifier for verifying the at least one identity tag of each segment; and
an indexer for indexing the segments of the audio data file for storage in a database in accordance with the identification tags of verified speakers.
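The four components listed above can be pictured as a small pipeline. The following is a minimal sketch only, not the patented implementation: it assumes each audio frame has already been reduced to a single hypothetical feature value, detects a speaker change as a jump between consecutive frames, and stands in for identification and verification with simple distance checks against hypothetical enrolled speaker values. All function names, thresholds, and data are illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: int                                 # first frame index (inclusive)
    end: int                                   # last frame index (exclusive)
    tags: list = field(default_factory=list)   # candidate identity tags


def segment_by_speaker_change(frames, threshold=1.0):
    """Segmenter stand-in: split wherever consecutive frames differ by more than threshold."""
    segments, start = [], 0
    for i in range(1, len(frames)):
        if abs(frames[i] - frames[i - 1]) > threshold:
            segments.append(Segment(start, i))
            start = i
    if frames:
        segments.append(Segment(start, len(frames)))
    return segments


def segment_mean(frames, seg):
    """Average feature value of the frames inside a segment."""
    return sum(frames[seg.start:seg.end]) / (seg.end - seg.start)


def identify(frames, seg, enrolled):
    """Identifier stand-in: tag the segment with the closest enrolled speaker."""
    m = segment_mean(frames, seg)
    seg.tags = [min(enrolled, key=lambda name: abs(enrolled[name] - m))]


def verify(frames, seg, enrolled, tolerance=0.5):
    """Verifier stand-in: keep only tags whose enrolled value is within tolerance."""
    m = segment_mean(frames, seg)
    return [tag for tag in seg.tags if abs(enrolled[tag] - m) <= tolerance]


def build_index(file_id, frames, enrolled):
    """Indexer stand-in: map each verified identity tag to its segments."""
    index = defaultdict(list)
    for seg in segment_by_speaker_change(frames):
        identify(frames, seg, enrolled)
        for tag in verify(frames, seg, enrolled):
            index[tag].append((file_id, seg.start, seg.end))
    return dict(index)


# Hypothetical data: seven frames, two enrolled speakers.
FRAMES = [1.0, 1.1, 0.9, 5.0, 5.2, 9.0, 9.1]
ENROLLED = {"alice": 1.0, "bob": 5.0}
INDEX = build_index("tape1", FRAMES, ENROLLED)
```

In this toy run the last segment is closest to "bob" but fails the verification tolerance, so it is left out of the index. That illustrates why verification is a separate step from identification: identification always proposes a tag, and verification decides whether to trust it.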
In another aspect of the present invention, the system further comprises a search engine for retrieving one or more segments from the database by processing a user query based on an identity of a desired speaker.
In another aspect of the present invention, the system for managing a database of audio/multimedia files further indexes audio/multimedia files and data streams according to audio information such as background environment (music, street noise, car noise, telephone, studio noise, speech plus music, speech plus noise, speech over speech), channel (microphone, telephone), and/or the transcription of the spoken utterances. The user may retrieve stored audio segments from the database by formulating queries based on one or more parameters corresponding to such indexed information.
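A query over such indexed information might look like the sketch below. It is an illustration under stated assumptions rather than the search engine described in the patent: the record layout (`IndexedSegment`), the `search` function, and the sample archive are all hypothetical, and matching is reduced to exact comparison on speaker, background, and channel plus a case-insensitive substring test on the transcription.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedSegment:
    """Hypothetical index record for one stored audio segment."""
    file_id: str
    start_s: float
    end_s: float
    speaker: str
    background: str
    channel: str
    transcript: str


def search(segments, speaker=None, background=None, channel=None, keyword=None):
    """Return the segments that match every parameter the user supplied."""
    exact = {"speaker": speaker, "background": background, "channel": channel}
    results = []
    for seg in segments:
        # Skip the segment if any supplied exact-match parameter disagrees.
        if any(v is not None and getattr(seg, k) != v for k, v in exact.items()):
            continue
        # Keyword match against the transcription, ignoring case.
        if keyword is not None and keyword.lower() not in seg.transcript.lower():
            continue
        results.append(seg)
    return results


# Hypothetical archive contents used only for illustration.
ARCHIVE = [
    IndexedSegment("news.wav", 0.0, 12.5, "alice", "studio noise",
                   "microphone", "Good evening and welcome"),
    IndexedSegment("news.wav", 12.5, 30.0, "bob", "street noise",
                   "telephone", "Reporting live from downtown"),
    IndexedSegment("meeting.wav", 0.0, 45.0, "alice", "speech plus music",
                   "telephone", "Welcome to the quarterly review"),
]

# Example query: everything "alice" said over a telephone channel.
hits = search(ARCHIVE, speaker="alice", channel="telephone")
```

Leaving a parameter as `None` means "do not filter on this field", so the same function covers a query on speaker identity alone as well as queries that combine several parameters.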
These and other aspects, features and advantages of the present invention will be discussed and become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.
REFERENCES:
patent: 3936805 (1976-02-01), Bringol et al.
patent: 5465290 (1995-11-01), Hampton et al.
patent: 5550966 (1996-08-01), Drake et al.
patent: 5598507 (1997-01-01), Kimber et al.
patent: 5606643 (1997-02-01), Balasubramanian et al.
patent: 5649060 (1997-07-01), Ellozy et al.
patent: 5655058 (1997-08-01), Balasubramanian et al.
patent: 5659662 (1997-08-01), Wilcox et al.
patent: 5737532 (1998-04-01), Delair et al.
patent: 5774841 (1998-06-01), Salazar et al.
patent: 5897616 (1999-04-01), Kanevsky et al.
patent: 5918223 (1999-06-01), Blum et al.
patent: 5937383 (1999-08-01), Ittycheriah et al.
patent: 5960399 (1999-09-01), Barclay et al.
patent: 6161090 (2000-12-01), Kanevsky et al.
patent: 6185527 (2001-02-01), Petkovic et al.
patent: 0507743 (1992-10-01), None
patent: WO9211634 (1992-07-01), None
Wilcox et al., (“HMM-Based Wordspotting for Voice Editing and Indexing”, 2nd European Conference on Speech Communication and Technology, Genova, Italy, Sep. 24-26, 1991, pp. 25-28).*
“Automatic Content-Based Retrieval of Broadcast News”, ACM Multimedia 95—Electronic Proceedings, Nov. 5-9, 1995, San Francisco, California.
Sugiyama, et al., “Speech Segmentation and Clustering Based on Speaker Features”, 1993 IEEE, pp. II-395-II-398.
Wilcox, et al., “Segmentation of Speech Using Speaker Identification”, 1994 IEEE, pp. I-161-I-164.
Cohen, et al., “Data Retrieval through a Compact Disk Drive having a Speech-Driven Interface”, IBM Technical Disclosure Bulletin, vol. 38, No. 01, Jan. 1995.
Kanevsky Dimitri
Maes Stephane H.
Chawan Vijay B
F. Chau & Associates LLP
International Business Machines Corporation