Method and apparatus for voice annotation and retrieval of...

Reexamination Certificate

Details

: 1999-06-04
: 2002-05-28
: Dorvil, Richemond (Department: 2641)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Recognition

: C704S257000
: Reexamination Certificate
: active
: 06397181
: ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to databases, and in particular to systems for conveniently creating, indexing and retrieving media content including audio, image and video data and other time-sequence data, from a repository of media content.
BACKGROUND
With the advent of the Internet and the proliferation of digital multimedia technology, vast amounts of digital media content are readily available. The digital media content can be time-sequence data including audio and video data. Databases of such digital media content have grown to voluminous proportions. However, tools for conveniently and effectively storing such data and later retrieving it have not kept pace with the growth in its volume.
Attempts have been made to manage databases of video data. However, such systems make it difficult to achieve automatic and convenient indexing and retrieval of media information. Further, such systems typically have a low level of retrieval accuracy. Therefore, a need clearly exists for an improved system of indexing and retrieving media content.
SUMMARY
In accordance with a first aspect of the invention, there is disclosed a method of voice annotating digital media data. The method includes the steps of: speech annotating one or more portions of the digital media data; and indexing the digital media data and speech annotation to provide indexed media content.
Preferably, the method also includes the step of creating a word lattice using the speech annotation. It may also include the step of recording the speech annotation separately from the digital media data. Optionally, the speech annotation is generated using a formal language. Further, the annotating step can be dependent upon at least one of a customised vocabulary and Backus-Naur Form grammar. Still further, the step of creating the word lattice may be dependent upon at least one of acoustic and linguistic knowledge.
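As an illustration only, a word lattice can be pictured as a directed graph whose edges carry competing word hypotheses and scores. The Python sketch below is an assumption about one possible representation, not the structure defined in the patent; the names `Edge` and `best_path`, the node numbering, and the scores are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: int    # source node; nodes are assumed to be numbered in time order
    end: int      # destination node, always greater than start
    word: str     # hypothesised word for this stretch of speech
    score: float  # log-probability-like score (hypothetical); higher is better


def best_path(edges, first_node, last_node):
    """Return (total_score, words) for the highest-scoring path, or None."""
    outgoing = defaultdict(list)
    for edge in edges:
        outgoing[edge.start].append(edge)

    # best[node] holds the best (score, words) found so far for reaching node.
    best = {first_node: (0.0, [])}
    for node in range(first_node, last_node):
        if node not in best:
            continue
        score, words = best[node]
        for edge in outgoing[node]:
            candidate = (score + edge.score, words + [edge.word])
            if edge.end not in best or candidate[0] > best[edge.end][0]:
                best[edge.end] = candidate
    return best.get(last_node)


# A two-word annotation with two competing hypotheses per word.
lattice = [
    Edge(0, 1, "red", -0.2), Edge(0, 1, "read", -1.6),
    Edge(1, 2, "car", -0.4), Edge(1, 2, "card", -1.1),
]
total, words = best_path(lattice, 0, 2)
print(words)  # ['red', 'car']
```

Keeping the lower-scoring alternatives ("read", "card") in the lattice, rather than only the best path, is what would let a later search still match an annotation whose top hypothesis was wrong.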
Preferably, the method includes the step of reverse indexing the word lattice to provide a reverse index table. It may also include the step of content addressing the reverse index table.
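One way to picture a reverse index table is as a mapping from each hypothesised word to the media segments whose annotation lattices contain it. The sketch below is a minimal illustration under that assumption; the function name `build_reverse_index`, the segment identifiers, and the confidence values are hypothetical and are not drawn from the patent.

```python
from collections import defaultdict


def build_reverse_index(lattices):
    """Map each word to a list of (segment_id, confidence) postings.

    `lattices` is assumed to be {segment_id: [(word, confidence), ...]},
    a flattened view of the word hypotheses in each annotation lattice.
    """
    index = defaultdict(list)
    for segment_id, hypotheses in lattices.items():
        for word, confidence in hypotheses:
            index[word].append((segment_id, confidence))
    # Put the most confident segments first for each word.
    for postings in index.values():
        postings.sort(key=lambda posting: -posting[1])
    return dict(index)


annotations = {
    "clip1#0": [("beach", 0.9), ("peach", 0.3)],
    "clip2#4": [("beach", 0.6)],
}
reverse_index = build_reverse_index(annotations)

# Content addressing: the word itself is the key used to reach the segments.
print(reverse_index["beach"])  # [('clip1#0', 0.9), ('clip2#4', 0.6)]
```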
In accordance with a second aspect of the invention, there is disclosed an apparatus for voice annotating digital media data. The apparatus includes: a device for speech annotating one or more portions of the digital media data; and a device for indexing the digital media data and speech annotation to provide indexed media content.
In accordance with a third aspect of the invention, there is disclosed a computer program product having a computer readable medium having a computer program recorded therein for voice annotating digital media data. The computer program product includes: a module for speech annotating one or more portions of the digital media data; and a module for indexing the digital media data and speech annotation to provide indexed media content.
In accordance with a fourth aspect of the invention, there is disclosed a method of voice retrieving digital media data annotated with speech. The method includes the steps of: providing indexed digital media data, the indexed digital media data derived from a word lattice created from speech annotation of the digital media data; generating a speech query; and retrieving one or more portions of the indexed digital media data dependent upon the speech query.
Preferably, the method further includes the step of creating a word lattice from the speech query. The word lattice may be created dependent upon at least one of acoustic and linguistic knowledge. The method may also include the step of searching the indexed media data dependent upon the speech query by matching the word lattice created from the speech query with word lattices of the indexed media data. It may also include the step of confidence filtering the lattice created from the speech query to produce a short-list for the searching step.
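The following sketch illustrates how confidence filtering could produce a short-list before searching. It assumes, purely for illustration, that the query lattice has been flattened to (word, confidence) pairs and that the indexed media data is reachable through a word-to-segment table; the threshold, the product-of-confidences scoring rule, and all names are hypothetical choices rather than the patent's method.

```python
def confidence_filter(query_hypotheses, threshold=0.5):
    """Keep only query words whose confidence reaches the threshold."""
    return [(w, c) for w, c in query_hypotheses if c >= threshold]


def search(query_hypotheses, index, threshold=0.5):
    """Rank segments by summed query-confidence times annotation-confidence."""
    scores = {}
    for word, q_conf in confidence_filter(query_hypotheses, threshold):
        for segment_id, a_conf in index.get(word, []):
            scores[segment_id] = scores.get(segment_id, 0.0) + q_conf * a_conf
    # Highest-scoring segments first.
    return sorted(scores.items(), key=lambda item: -item[1])


# Hypothetical index: word -> [(segment_id, annotation confidence), ...]
index = {
    "beach": [("clip1", 0.9), ("clip2", 0.6)],
    "peach": [("clip3", 0.7)],
}
# Hypothetical query lattice: "peach" falls below the threshold and is dropped.
query = [("beach", 0.8), ("peach", 0.2)]

ranked = search(query, index)
print([segment for segment, _ in ranked])  # ['clip1', 'clip2']
```

Filtering first keeps low-confidence query words from pulling in unrelated segments, at the cost of missing a match when the correct word was recognised with low confidence.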
Optionally, the method further includes the step of searching the indexed digital media data dependent upon a text query. Further, the speech query can be generated dependent upon at least one of a customised vocabulary and Backus-Naur Form grammar.
In accordance with a fifth aspect of the invention, there is disclosed an apparatus for voice retrieving digital media data annotated with speech. The apparatus includes: a device for providing indexed digital media data, the indexed digital media data derived from a word lattice created from speech annotation of the digital media data; a device for generating a speech query; and a device for retrieving one or more portions of the indexed digital media data dependent upon the speech query.
In accordance with a sixth aspect of the invention, there is disclosed a computer program product having a computer readable medium having a computer program recorded therein for voice retrieving digital media data annotated with speech. The computer program product includes: a module for providing indexed digital media data, the indexed digital media data derived from a word lattice created from speech annotation of the digital media data; a module for generating a speech query; and a module for retrieving one or more portions of the indexed digital media data dependent upon the speech query.
In accordance with a seventh aspect of the invention, there is disclosed a system for voice annotating and retrieving digital media data. The system includes: a device for speech annotating at least one segment of the digital media data; a device for indexing the speech-annotated digital media data to provide indexed digital media data; a device for generating a speech or voice query; and a device for retrieving one or more portions of the indexed digital media data dependent upon the speech query.
Preferably, the system also includes a device for creating a lattice structure from speech annotation. This device can be dependent upon acoustic and/or linguistic knowledge.
Preferably, the speech-annotating device post-annotates the digital media data. The speech annotation can be generated using a formal language.
The system can also include a device for reverse indexing the lattice structure to provide a reverse index table. Still further, it may include a device for content addressing the reverse index table.
Preferably, the system includes a device for creating a lattice structure from the speech query. It may also include a device for searching the indexed digital media data dependent upon the speech query by matching the lattice structure created from the speech query with lattice structures of the indexed digital media data. The system may also include a device for confidence filtering the lattice structure created from the speech query to produce a short-list for the searching device. The lattice structure can be created dependent upon at least one of acoustic and linguistic knowledge. Still further, the system may include a device for searching the indexed digital media data dependent upon a text query.
Preferably, at least one of the annotating device and the speech query is dependent upon at least one of a customised vocabulary and Backus-Naur Form grammar.
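To illustrate how a customised vocabulary and a Backus-Naur Form grammar can constrain an annotation or query, the sketch below encodes a toy grammar and checks whether an utterance conforms to it. The grammar rules, the vocabulary, and the function names `ends` and `accepts` are invented for this example and do not come from the patent.

```python
# Toy grammar in BNF-like form (hypothetical):
#   <annotation> ::= <subject> <action> | <subject> <action> <place>
#   <subject>    ::= "john" | "mary"
#   <action>     ::= "walking" | "speaking"
#   <place>      ::= "indoors" | "outdoors"
GRAMMAR = {
    "<annotation>": [["<subject>", "<action>"],
                     ["<subject>", "<action>", "<place>"]],
    "<subject>": [["john"], ["mary"]],
    "<action>": [["walking"], ["speaking"]],
    "<place>": [["indoors"], ["outdoors"]],
}


def ends(symbol, tokens, pos):
    """Return the positions where `symbol` can finish if it starts at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must equal the token at pos
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    results = set()
    for alternative in GRAMMAR[symbol]:
        positions = {pos}
        for part in alternative:
            positions = {e for p in positions for e in ends(part, tokens, p)}
        results |= positions
    return results


def accepts(utterance):
    """True if the whole utterance can be derived from <annotation>."""
    tokens = utterance.lower().split()
    return len(tokens) in ends("<annotation>", tokens, 0)


print(accepts("Mary speaking outdoors"))  # True
print(accepts("outdoors Mary"))           # False
```

Restricting speech to a small grammar of this kind narrows the set of word sequences a recogniser has to consider, which is one plausible reason a formal language would help annotation and query accuracy.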

REFERENCES:
patent: 5835667 (1998-11-01), Wactlar et al.
patent: 6185527 (2001-02-01), Petkovic
1999 IEEE 3rd Workshop on Multimedia Signal Processing. Maison et al., "Audio-visual speaker recognition for video broadcast news: some fusion techniques," pp. 161-167, Sep. 1999.*
ICIP 98 Proceedings. 1998 International Conference on Image Processing. Tsekeridou et al., "Speaker dependent video indexing based on audio-visual interaction," pp. 358-362, vol. 1, Oct. 1998.*
ICASSP-97. 1997 IEEE International Conference on Acoustics, Speech and Signal Processing. Roy et al., "Speaker identification based text to audio alignment for an audio retrieval system," pp. 1099-1102, vol. 2, Apr. 1997.

Affiliated with

Li Haizhou

Inventor

Narasimhalu Arcot Desai

Inventor

Wu Jiankang

Inventor

Also associated with

Dorvil Richemond

Examiner

Kent Ridge Digital Labs

Corporate Assignee

Nath & Associates PLLC

Law Firm

Novick Harold L.

Attorney
