Feature-based audio content identification
Reexamination Certificate
2001-03-09
2003-08-05
Knepper, David D. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S236000
active
06604072
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to signal recognition, and more specifically to a method for automatically identifying audio content such as a sound recording.
2. Description of Related Art
The development of efficient digital encoding methods for audio (e.g., the Moving Picture Experts Group (MPEG) Audio Layer 3 standard, also known as MP3), in combination with the advent of the Internet, has opened up the possibility of entirely electronic sale and distribution of recorded music. This is a potential boon to the recording industry. On the downside, the same technical advances also abet the illegal distribution of music. This poses a threat to the proprietary interests of recording artists and music distributors. The ease of distributing high-fidelity digital copies that do not degrade over successive generations is a far greater problem for the music industry than the limited copying of music onto audio cassettes that occurred prior to the advent of digital audio. Presently, there are a myriad of Internet sites from which a person can obtain bootleg copies of copyrighted music. Thus, for music copyright enforcement, there is a need for a system and method for the automated identification of audio content.
The identification of music from a digital audio file, such as an MP3 file, is not a trivial problem. Different encoding schemes will yield a different bit stream for the same song. Even if the same encoding scheme is used to encode the same song (i.e., sound recording) and create two digital audio files, the files will not necessarily match at the bit level. Various effects can cause the bit streams to differ even though the resulting differences in sound, as judged by human perception, are negligible. These effects include subtle differences in the overall frequency response of the recording system, digital-to-analog conversion effects, acoustic environmental effects such as reverb, and slight differences in the recording start time. Further, the bit stream that results from the application of a given encoding scheme will vary depending on the type of audio source. For example, an MP3 file of a song created by encoding the output of a Compact Disc (CD) will not match at the bit level with an MP3 file of the same song created by encoding the output of a stereo receiver.
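The bit-level mismatch described above can be illustrated with a small sketch. It is not the method of this patent. It is a toy demonstration under stated assumptions: a synthesized sine tone stands in for a song, and a 3% gain change plus a three-sample start offset stand in for recording-chain differences. The sample rate and all function names are hypothetical and chosen only for illustration. The sketch shows that a hash of the raw samples changes, while a coarse spectral property (the strongest frequency among a set of candidates) does not.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 8000  # Hz; hypothetical value chosen for this illustration


def make_tone(freq_hz, n_samples, gain=1.0, start_offset=0):
    """Synthesize a sine tone as a list of floats in [-1, 1]."""
    return [
        gain * math.sin(2.0 * math.pi * freq_hz * (i + start_offset) / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def to_pcm16(samples):
    """Quantize float samples to 16-bit little-endian PCM bytes."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def digest(samples):
    """Hash of the exact PCM bytes; any bit-level change alters it."""
    return hashlib.sha256(to_pcm16(samples)).hexdigest()


def dominant_frequency(samples, candidates_hz):
    """Return the candidate frequency with the largest single-bin DFT power."""
    def power(freq_hz):
        w = 2.0 * math.pi * freq_hz / SAMPLE_RATE
        re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
        return re * re + im * im
    return max(candidates_hz, key=power)


# The "same song" captured twice: the second copy is slightly quieter
# and starts three samples later (both changes are assumptions).
original = make_tone(440, 2048)
altered = make_tone(440, 2048, gain=0.97, start_offset=3)

candidates = list(range(100, 1000, 20))  # coarse 20 Hz grid, illustrative

print("hashes equal:", digest(original) == digest(altered))
print("dominant frequency (original):", dominant_frequency(original, candidates))
print("dominant frequency (altered): ", dominant_frequency(altered, candidates))
```

The hash comparison fails for two copies that sound essentially the same, while the coarse spectral measurement agrees. That gap is why exact bit-level matching is unsuitable for identifying audio content.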
One solution that has been proposed is to tag copyrighted music by using digital watermarking. Unfortunately, numerous methods have been discovered for rendering digital watermarks illegible. In addition, there are forms of noise and distortion that are quite audible to humans, but that do not impede our ability to recognize music. FM broadcasts and audio cassettes both have a lower bandwidth than CD recordings, but are still copied and enjoyed by some listeners. Likewise, many of the MP3 files on the Internet are of relatively low quality, but still proliferate and thus pose a threat to the profitability of the music industry. Furthermore, some attempts to evade copyright protection schemes involve the intentional alteration or distortion of the music. These distortions include time-stretching and time-compressing. In such cases, not only may the start and stop times be different, but the song durations may be different as well. All such differences may be barely noticeable to humans, but can foil many conventional copyright protection schemes.
Another problem for the music industry and songwriters is the unauthorized use of samples. Samples are short sections of a song that have been clipped and placed into another song. Unless such a sample can be found and identified, the owner of the copyright on the original recording will not be fairly compensated for its use in the derivative work.
There is a need for a method that can identify audio content such as sound recordings despite subtle differences and alterations that arise during processes such as recording, broadcasting, encoding, decoding, transmission, and intentional alteration.
REFERENCES:
patent: 4450531 (1984-05-01), Kenyon et al.
patent: 4843562 (1989-06-01), Kenyon et al.
patent: 4918730 (1990-04-01), Schulze
patent: 5437050 (1995-07-01), Lamb et al.
patent: 5504518 (1996-04-01), Ellis et al.
“13.4 Power Spectrum Estimation Using the FFT,” Numerical Recipes in C, Cambridge University Press, 1993, pp. 549-558.
Sundberg, J., “The Science of Musical Sounds,” Academic Press, 1991, p. 89.
Germain, R., Califano, Andrea, Colville, S., “Fingerprint Matching Using Transformation Parameter Clustering,” IEEE Computational Science and Engineering, Oct.-Dec. 1997, vol. 4, No. 4, pp. 42-49.
Crawford, T., Iliopoulos, C.S., Raman, Rajeev, “String-Matching Techniques for Musical Similarity and Melodic Recognition,” Computing in Musicology 11, 1997-98, pp. 73-100.
Abrams Steven
Fitch Blake G.
Germain Robert S.
Pitman Michael C.
August Casey P.
Bongini Stephen
Fleit Kain Gibbons Gutman & Bongini P.L.
Knepper David D.