Reexamination Certificate
2000-12-18
2004-09-28
Dorvil, Richemond (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S245000
active
06799158
ABSTRACT:
FIELD OF THE INVENTION
The invention relates generally to digital data. More particularly, the invention relates to a method and system for generating a characteristic identifier for digital data and for detection of identical digital data.
BACKGROUND OF THE INVENTION
In recent years, an increasing amount of audio data has been recorded, processed, distributed, and archived on digital media using numerous encoding and compression formats, such as WAVE, AIFF (Audio Interchange File Format), MPEG (Moving Picture Experts Group), and REALAUDIO. Transcoding or resampling techniques that are used to switch from one encoding format to another almost never produce a recording that is identical to a direct recording in the target format. A similar effect occurs with most compression schemes: changes in the compression factor or other parameters result in a new encoding and a bit stream that bears little similarity to the original bit stream. Both effects make it difficult to establish the identity of one audio recording stored in two different formats. Establishing the possible identity of different audio recordings is a pressing need in audio production, archiving, and copyright protection.
During the production of a digital audio recording, numerous versions in various encoding formats usually come into existence as intermediate steps. These versions are distributed over a variety of computer systems. In most cases, the recordings are not cross-referenced, and whether two versions are identical often has to be established by listening to them. An automatic procedure would greatly ease this task.
A similar problem exists in audio archives that have to deal with material that has been issued in a variety of compilations (such as jazz or popular songs) or on a variety of carriers (such as the famous recordings of Toscanini with the NBC Symphony Orchestra). Often the archive version of the original master of such a recording is not documented, and in most cases it can only be decided by listening to the audio recordings whether a track from a compilation is identical to a recording of the same piece on another sound carrier.
Copyright protection is a key issue for the audio industry, and it has become even more relevant with the invention of new technology that makes the creation and distribution of copies of audio recordings a simple task. While mechanisms to prevent unauthorized copying solve one side of the problem, processes to detect unauthorized copies are also required.
SUMMARY OF THE INVENTION
According to one aspect of the present invention, a characteristic identifier for digital data is generated. The information contained in the data is thereby reduced such that the resulting identifier can be compared with another identifier. Identifiers generated according to the present invention are resistant to artifacts that are introduced into digital data by all common compression techniques. Using such identifiers therefore allows the identification of identical digital data independent of the chosen representation and compression methods.
Furthermore, the generated identifiers are used for detecting identical digital data. It is decided whether sets of digital data are identical depending on the distance between the identifiers belonging to them. A faster, cheaper and more reliable process of detection of identical digital data is established.
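The distance-based decision described above can be illustrated with a minimal sketch. The text here does not specify the form of the identifier, the distance measure, or the decision threshold, so everything in the example is an assumption made for illustration: identifiers are treated as equal-length sequences of numbers, the distance is Euclidean, and the function names and the threshold value of 0.1 are hypothetical.

```python
import math
from typing import Sequence


def identifier_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two identifiers.

    Assumption: identifiers are equal-length numeric vectors. The source
    text does not state the identifier format or the distance measure.
    """
    if len(a) != len(b):
        raise ValueError("identifiers must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_same_recording(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.1
) -> bool:
    """Decide identity from the distance between two identifiers.

    The threshold is a hypothetical value chosen for illustration; a real
    system would have to calibrate it on known identical and known
    different recordings.
    """
    return identifier_distance(a, b) <= threshold


# Hypothetical identifiers for one recording stored in two formats:
# compression artifacts shift the values slightly.
original = [0.50, 0.20]
transcoded = [0.52, 0.21]
unrelated = [0.90, 0.70]

print(is_same_recording(original, transcoded))
print(is_same_recording(original, unrelated))
```

A threshold rather than an exact equality test reflects the point made in the summary: identifiers of the same recording in different encodings are expected to be close, not bit-for-bit equal.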
In a preferred embodiment of the present invention, the digital data is a digital audio signal and the characteristic identifier is called an audio signature. The comparison of identical audio data according to the invention can be carried out without a person actually listening to the audio data.
The present invention can be used to establish automated processes to find potential unauthorized copies of audio data, e.g., music recordings, and therefore enables a better enforcement of copyrights in the audio industry.
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
REFERENCES:
patent: 4783754 (1988-11-01), Bauck et al.
patent: 5918223 (1999-06-01), Blum et al.
Foote, Jonathan, “An Overview of Audio Information Retrieval,” Dec. 18, 1997, jtfoote@bigfoot.com, pp. 1-18.*
Foote, J. T., “Content-Based Retrieval of Music and Audio,” in Kuo et al., editor, Multimedia Storage and Archiving Systems II, Proc. of SPIE, vol. 3229, pp. 138-147, 1997.*
Tzanetakis, et al. “A framework for audio analysis based on classification and temporal segmentation,” Sep. 8-10, 1999, in Proc. Euromicro, Workshop on Music Technology and Audio processing, Milan, Italy.*
Welsh, et al. “Querying Large Collections of Music for Similarity,” Nov. 1999, UC Berkeley Technical Report UCB/CSD-00-1096.*
Wold et al., Content-Based Classification, Search, and Retrieval of Audio, IEEE Multimedia, vol. 3, Issue 3, Fall 1996, pp. 27-36.
Fischer Uwe
Hoffmann Stefan
Kriechbaum Werner
Stenzel Gerhard
Dorvil Richemond
Harper V. Paul
International Business Machines Corporation
Percello, Esq. Louis J.