Image analysis – Applications
Reexamination Certificate
1999-11-09
2004-02-03
Patel, Jayanti K. (Department: 2625)
C713S176000, C380S210000
active
06687383
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to systems and methods for embedding audio information in pictures and video images.
2. Discussion of the Prior Art
Books, magazines, and other media that include still images generally offer no audio or sound to accompany those images. For a picture of a seascape, for example, it would be desirable to provide the viewer with accompanying sounds such as wind and ocean waves. Likewise, a video image may have audio information in a separate audio track for simultaneous playback; however, the video content itself does not contain any embedded sound information that can be played back while the image is shown.
It would be highly desirable to provide a sound encoding system and method that enables the embedding of audio information directly within a picture or video image itself, and enables the playback or audio presentation of the embedded audio information associated with the viewed picture or video image.
SUMMARY OF THE INVENTION
The present invention relates to a system and method for encoding sound information in the pixel units of a picture or image, and particularly in the pixel intensity. Small differences in pixel intensities are typically not detectable by the eye; however, they can be detected by scanning devices that measure the intensity differences between closely located pixels in an image. These differences are used to generate encoded numbers, which are mapped into sound representations (e.g., cepstra) that are capable of forming audio or sound.
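The idea of carrying data in small intensity differences between neighbouring pixels can be illustrated with a minimal sketch. The specific scheme below is an assumption made for illustration and is not taken from the patent: each payload bit is stored as the parity of the difference between two horizontally adjacent 8-bit pixels, so no pixel changes by more than one intensity level. The function names `embed_bits` and `extract_bits` are hypothetical.

```python
def embed_bits(row, bits):
    """Return a copy of `row` with one bit stored per pair of neighbours.

    Pair i is (row[2*i], row[2*i + 1]); the bit is the parity of their
    difference. Only the second pixel of a pair is changed, by at most 1.
    This parity scheme is an illustrative assumption, not the patent's.
    """
    if 2 * len(bits) > len(row):
        raise ValueError("row is too short for the payload")
    out = list(row)
    for i, bit in enumerate(bits):
        a, b = out[2 * i], out[2 * i + 1]
        if (b - a) % 2 != bit:
            # Nudge by one intensity level, staying inside 0..255.
            out[2 * i + 1] = b + 1 if b < 255 else b - 1
    return out


def extract_bits(row, count):
    """Recover `count` bits from the parity of neighbouring differences."""
    return [(row[2 * i + 1] - row[2 * i]) % 2 for i in range(count)]
```

A scanner in this sketch would only need to compare neighbouring intensities; mapping the recovered numbers to a sound representation such as cepstra is a separate step not shown here.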
According to a first embodiment, sound is carried in the low-order digits of the digital pixel values, i.e., the digits of the intensity that follow some decimal point. For example, a pixel intensity may be represented digitally (in bytes/bits) as a number, e.g., 2.3567, with the first two digits representing an intensity capable of being detected by a human eye. The remaining decimal digits, however, are very small and may be used to represent encoded sound/audio information. As an example of such an audio encoding technique, for a 256 color (or gray scale) display, there are 8 bits per pixel. Current high-end graphic display systems utilize 24 bits per pixel: e.g., 8 bits for red, 8 bits for green, and 8 bits for blue, resulting in 256 shades each of red, green, and blue, which may be blended to form a continuum of colors. According to the invention, if 8 bits per pixel quality is acceptable, then in a 24 bits per pixel graphics system there remain 16 bits in which audio data may be represented. Thus, for a 1000×1000 image there may be 16 Kbits for sound effects, an amount sufficient to represent short phrases or sound effects (assuming a standard representation of a speech waveform requires 8 Kbits/sec).
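The bit budget of the first embodiment can be sketched as follows. The layout is an assumption chosen for illustration: the 8 visible bits occupy the high byte of a 24-bit pixel word and the 16 payload bits occupy the low two bytes. The names `pack_pixel`, `unpack_pixel`, and `capacity_bits` are hypothetical. Note that multiplying 16 spare bits by every pixel of a 1000×1000 image gives a raw upper bound of 16,000,000 bits, far more than the 16 Kbits quoted in the text.

```python
VISIBLE_BITS = 8    # intensity the viewer actually sees
PAYLOAD_BITS = 16   # spare bits assumed available for audio


def pack_pixel(intensity, payload):
    """Combine an 8-bit intensity and a 16-bit payload into a 24-bit word."""
    if not 0 <= intensity < (1 << VISIBLE_BITS):
        raise ValueError("intensity must fit in 8 bits")
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload must fit in 16 bits")
    return (intensity << PAYLOAD_BITS) | payload


def unpack_pixel(word):
    """Split a 24-bit word back into (intensity, payload)."""
    return word >> PAYLOAD_BITS, word & ((1 << PAYLOAD_BITS) - 1)


def capacity_bits(width, height, payload_bits=PAYLOAD_BITS):
    """Raw payload capacity if every pixel carries `payload_bits` bits."""
    return width * height * payload_bits
```

At the 8 Kbits/sec speech rate the text assumes, 16 Kbits corresponds to about two seconds of audio.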
According to a second embodiment, audio information may be encoded in special pixels located in the picture or image, for example, at predetermined coordinates. These special pixels carry encoded sound information that may be detected by a scanner, but they are located at special coordinates in the image such that the overall viewing of the image is not affected.
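A minimal sketch of the second embodiment, assuming an 8-bit grayscale image held as a list of rows and a coordinate list agreed in advance between encoder and scanner. Writing one payload byte directly into each special pixel is an illustrative choice, and `embed_at` and `read_at` are hypothetical names; the patent does not specify how the coordinates are chosen or how the values are written.

```python
def embed_at(image, coords, data):
    """Return a copy of `image` with one payload byte per special pixel.

    `coords` is a predetermined list of (row, column) positions shared by
    the encoder and the scanner. All other pixels are left unchanged.
    """
    if len(data) > len(coords):
        raise ValueError("not enough special pixels for the payload")
    out = [list(row) for row in image]
    for (r, c), byte in zip(coords, data):
        out[r][c] = byte
    return out


def read_at(image, coords, length):
    """Read `length` payload bytes back from the predetermined positions."""
    return bytes(image[r][c] for r, c in coords[:length])
```

Keeping the special pixels sparse relative to the image size is what would keep the visible picture largely unaffected in this sketch.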
In accordance with these embodiments, a scanning system is employed that enables a user to scan through the picture, for instance, with a scanning device that sends the pixel-encoded sound information to a server system (via a wireless connection, for example). The server system may include devices for reading the pixel-encoded data and converting it into audio (e.g., music, speech, etc.) for playback and presentation through a playback device.
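The server-side conversion step can be sketched under a simplifying assumption: the decoded payload is treated as raw unsigned 8-bit mono PCM samples rather than the cepstral representation mentioned earlier, and it is wrapped in a WAV container using Python's standard `wave` module. The function name `samples_to_wav` and the 8000 Hz default rate are assumptions for illustration.

```python
import io
import wave


def samples_to_wav(samples, sample_rate=8000):
    """Wrap unsigned 8-bit mono PCM samples in an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # 1 byte per sample (8-bit WAV is unsigned)
        wav.setframerate(sample_rate)  # samples per second
        wav.writeframes(bytes(samples))
    return buf.getvalue()
```

The returned bytes could then be handed to whatever playback device the server is connected to; that hand-off is outside the scope of this sketch.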
The pixel encoded sound information may additionally include “meta information” provided in a file format such as Speech Mark-up language (Speech ML) for use with a Conversational Browser.
Advantageously, the encoded information embedded in a picture may include device-control codes, which may be scanned and retrieved for controlling a device.
REFERENCES:
patent: 5530759 (1996-06-01), Braudaway et al.
patent: 6209094 (2001-03-01), Levine et al.
patent: 6353672 (2002-03-01), Rhoads
patent: 6363159 (2002-03-01), Rhoads
patent: 6442283 (2002-08-01), Tewfil et al.
patent: 6535617 (2003-03-01), Hannigan et al.
“Safeguarding Your Image”, by Eric J. Lerner, Brainstorm—Deep Computing can predict what people will buy, create the ideal schedule, design better drugs—and even tell you when to open your umbrella, pp. 27-28.
“An Overview of Speaker Recognition Technology”, by Sadaoki Furui, Automatic Speech and Speaker Recognition, Kluwer Academic Publishers, pp. 31-36.
Kanevsky Dimitri
Maes Stephane
Pickover Clifford A.
Zlatsin Alexander
International Business Machines Corporation
Morris, Esq. Daniel P.
Patel Jayanti K.
Scully Scott Murphy & Presser
Tabatabai Abolfazl
System and method for coding audio information in images