Method and system for embedding information in document

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate

Details

C382S180000, C382S177000, C382S200000, C713S176000, C715S252000

active

06782509

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a method for embedding additional information, including text data, i.e., so-called electronic watermark information, in an electronic document, a method for preventing the destruction of such embedded information, a method for preventing the re-use of such embedded information, and a system therefor.
BACKGROUND OF THE INVENTION
As a large amount of information can be distributed across the Internet, or by using CD-ROMs, businesses that provide services for the conduct of electronic searches and for the distribution of documents containing digital data have become important. To ensure the safe development of such businesses, techniques that can provide for the management of copyrighted material contained in digital documents that are distributed and that can protect the rights of owners are indispensable. Such techniques are also required by companies that wish to protect secret material contained in digital documents, and to find and trace routes along which secrets may have been leaked.
The techniques applied for managing copyrighted electronic data can be roughly broken down into the two techniques of access control, for which encryption and authentication are employed, and electronic watermarking. The aim of the first technique is to ensure that access to the contents of selected digital material is limited to those users who pay for the privilege, or to users whose employment of the material is controlled by a manager. The latter technique provides a function by which the secondary outflow of decoded data contained in digital documents can be prevented, or can be traced. These two techniques must be combined in order to provide for the rigorous management of copyrighted material.
Among the various types of media, there is a very large demand for electronic watermarking of text data, which are distributed in volume. However, since pure text data contain little redundancy in the expression of information, it is very difficult to embed information that supplements the original contents, i.e., electronic watermark information. “Proposal for the digital watermarking of PostScript and PDF documents,” Ryujiro Shibuya, Yuichi Kaji and Tadao Kasa, SCIS98-9.2.E (prior art 1), Japanese Unexamined Patent Publication No. Hei 7-222000 (prior art 2) and Japanese Unexamined Patent Publication No. Hei 6-324625 (prior art 3) propose a technique whereby watermark information is embedded in the document description, including appearance and layout, based on the observation that a page description format, such as PS (PostScript) or PDF (Portable Document Format), tends to be employed for the actual distribution of text data. In this prior art, slight changes in line spacing, word spacing and fonts are employed to embed information in documents.
However, it is difficult to use the above described conventional technique to manage copyrights, or to specify a route along which secrets may have been leaked, when the following two conditions are not satisfied.
1. Detection of a watermark in data contained in multiple documents can be performed only by a user who possesses a common detection key.
2. The technique is sufficiently robust that during ordinary distribution processing it can prevent format conversions and the destruction of material by an unauthorized user.
Prior art 1 does not teach a specific detection method that can satisfy condition 1. The methods described in prior arts 2 and 3, except for the method that manipulates the base line of character lines, require a comparison with the original document data. Since information for watermark detection must then be recorded and managed for each document in which a watermark has been embedded, these methods are difficult to use in a large system. None of the above methods supports a protection system in which a key is used (a system according to which only a key owner is permitted to detect a watermark).
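The dependence on the original document can be illustrated with a minimal sketch. It is not the method of prior art 2 or 3; the function names, the `delta` value and the gap widths are hypothetical, and the only assumption is that a bit is encoded by slightly widening or narrowing a word gap.

```python
def embed_word_spacing(gaps, bits, delta=0.3):
    """Return a copy of gaps in which gap i is widened (bit 1) or narrowed (bit 0)."""
    out = list(gaps)
    for i, bit in enumerate(bits):
        out[i] += delta if bit else -delta
    return out


def detect_with_original(marked_gaps, original_gaps, n_bits):
    """Recover the bits by comparing each marked gap with the stored original gap."""
    return [1 if marked_gaps[i] > original_gaps[i] else 0 for i in range(n_bits)]


# Word gaps in a justified line already vary, so the marked widths alone
# do not reveal which gaps were widened and which were narrowed.
original = [3.0, 3.4, 2.8, 3.1, 3.3]
marked = embed_word_spacing(original, [1, 0, 0, 1])
recovered = detect_with_original(marked, original, 4)
```

In this sketch the detector cannot run without `original`, which is why a record must be kept for every marked document.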
As for condition 2, prior art 2 describes only a study of the re-scanning of printed data, and no consideration has been given to making the page description data itself sufficiently robust to prevent destruction of the watermark. In practice, the specifications of many page description formats are open to the public, so a watermark embedded in such data may be destroyed. For example, a watermark embedded in the line spacing by manipulating the base line can easily be destroyed by slightly adjusting the positions of the individual lines so that the spacing is constant. In addition, the pure text data, in which no watermark information is embedded, may be extracted from the page description data and used.
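The base line example can be sketched as follows. This is an illustrative toy scheme, not the encoding used in the cited prior art: it assumes evenly spaced base lines, carries one bit on every second line, and uses hypothetical function names and a hypothetical `delta`.

```python
def embed_line_shift(baselines, bits, delta=0.5):
    """Shift odd-numbered interior lines by +delta (bit 1) or -delta (bit 0)."""
    out = list(baselines)
    carriers = range(1, len(out) - 1, 2)  # even-numbered lines stay fixed as references
    for i, bit in zip(carriers, bits):
        out[i] += delta if bit else -delta
    return out


def detect_line_shift(baselines):
    """Read a bit from each carrier line by comparing the gaps above and below it."""
    bits = []
    for i in range(1, len(baselines) - 1, 2):
        above = baselines[i] - baselines[i - 1]
        below = baselines[i + 1] - baselines[i]
        bits.append(1 if above > below else 0)
    return bits


def normalize_line_spacing(baselines):
    """Attack: re-space all lines at a constant pitch between the first and last line."""
    n = len(baselines)
    pitch = (baselines[-1] - baselines[0]) / (n - 1)
    return [baselines[0] + i * pitch for i in range(n)]


clean = [100 + 12 * i for i in range(9)]        # nine base lines, 12 units apart
marked = embed_line_shift(clean, [1, 0, 1, 1])
attacked = normalize_line_spacing(marked)
```

Here the detector needs no original document, but a shift of half a unit per line is undone by a single re-spacing pass that leaves the page looking the same.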
In Japanese Unexamined Patent Publication No. Hei 8-348426 (prior art 4), a method is proposed for embedding a watermark using the statistical property of two sequences of locations. Although the technique described in prior art 4 is not an invention related to the electronic watermarking of text, it satisfies condition 1, and as far as condition 2 is concerned, it enables sufficiently robust embedding when the locations are changed at random. However, it is difficult to adapt this technique to page description. Unlike the embedding of a watermark in an image, it is not obvious how the sequences of locations are to be designated in a page description. An object whose location is to be adjusted must be uniquely identified when a watermark is embedded. A page description constitutes a set of page description objects (characters or character strings), including positional information, and does not include information for identifying and ordering the individual elements. While, for an image, pixels and small domains can be specified by using X and Y coordinates, in a page description an object whose location can be adjusted is not always present in a specific domain designated by coordinates once the document or the page changes. Since in a page description the order in which objects are positioned on a page image does not affect the appearance of the image, the order in which the objects appear in the file is of no help when an object is being specified. In fact, the order in which objects appear in a file may be changed as the result of format conversion or of an attempt by a third party to destroy the watermark (an attack).
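A keyed statistic over two sequences of locations can be sketched in a generic, patchwork-style form. This is an assumption about the general idea only and does not reproduce the method of prior art 4; the key value, `delta`, and all function names are hypothetical.

```python
import random


def pick_sequences(key, n_items, n_pairs):
    """Derive two disjoint index sequences A and B from the secret key."""
    rng = random.Random(key)
    idx = rng.sample(range(n_items), 2 * n_pairs)
    return idx[:n_pairs], idx[n_pairs:]


def embed_statistic(values, key, n_pairs, delta):
    """Raise the values at A and lower the values at B by delta."""
    a, b = pick_sequences(key, len(values), n_pairs)
    out = list(values)
    for i in a:
        out[i] += delta
    for i in b:
        out[i] -= delta
    return out


def detect_statistic(values, key, n_pairs):
    """Return mean(values at A) - mean(values at B); about 2*delta when marked."""
    a, b = pick_sequences(key, len(values), n_pairs)
    return sum(values[i] for i in a) / n_pairs - sum(values[i] for i in b) / n_pairs


values = [10.0] * 200                      # e.g. pixel values or spacings
marked = embed_statistic(values, key=1234, n_pairs=50, delta=0.25)
stat_marked = detect_statistic(marked, key=1234, n_pairs=50)
stat_clean = detect_statistic(values, key=1234, n_pairs=50)
```

The same key detects the mark in any number of documents without the originals, but index `i` must denote the same element at embedding and at detection. That holds for pixels addressed by coordinates and is exactly what a page description, with no stable identity or order for its objects, does not provide.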
Furthermore, none of the above prior art examples resolves the problem that pure text data alone can be extracted from page description data. Since the specifications of the page description formats distributed across the Internet are open to the public, it is only necessary to write a suitable program in order to extract the text data mechanically. In addition, display software for page description frequently supports the delivery of data to another program using Cut&Paste, in which case even an ordinary user can extract text. The PDF display software employs a password to control access permission and to prohibit the use of Cut&Paste. In the current system, however, if printing is permitted, a PDF->PS->PDF conversion (information for managing a password is omitted through the conversion into PS) is all that is needed to remove the protection. For some applications, therefore, text may be extracted from page description data and illegally traded.
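How little is needed to extract text mechanically can be seen in a toy sketch. It does not parse real PS or PDF; it assumes a hypothetical, already decoded list of `(x, y, text)` objects with `y` increasing down the page, and the function name and tolerance are illustrative.

```python
def extract_text(objects, line_tolerance=2.0):
    """Group objects into lines by y, order each line by x, and keep only the text."""
    lines = []
    for x, y, text in sorted(objects, key=lambda o: (o[1], o[0])):
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))   # same line as the previous object
        else:
            lines.append((y, [(x, text)]))   # start a new line
    return "\n".join(" ".join(t for _, t in sorted(words)) for _, words in lines)


# File order is arbitrary and positions carry small watermark-like offsets.
objects = [
    (60.3, 112.0, "world"),
    (10.0, 100.2, "Hello"),
    (10.0, 112.0, "wide"),
    (52.7, 99.9, "watermarked"),
]
plain = extract_text(objects)
```

The result keeps the words and their reading order but none of the positional offsets, so any information carried only by layout is lost.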
It is, therefore, one object of the present invention to provide a method and a system for embedding information in document data that include text written in a page description language.
It is one more object of the present invention to provide a method and a system for detecting embedded information in document data that include text written in a page description language.
It is another object of the present invention to provide a method, for embedding an electronic watermark in document data that include text written in a page description language, whereby a common detection key is employed to detect electronic watermarks in multiple documents, and a system therefor.
It is an additional object of the present invention to provide a method for embedding information in docu
