Long term archiving of digital information

Data processing: software development – installation – and managem – Software installation – Including multiple files

Reexamination Certificate


Details

C717S147000, C715S252000


active

06691309

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of Invention
The present invention relates generally to the field of archiving digital information. More specifically, the present invention is related to creating and storing a model of a universal virtual computer enabling recovery of long time archived digital information.
2. Discussion of Prior Art
The report of the Task Force on Archiving of Digital Information, commissioned by the Commission on Preservation and Access and the Research Libraries Group, states: “The digital information is still relatively uncultivated at this stage; but the need is urgent, the time is opportune and the conditions are fertile for a strong, far-sighted set of actions to plant the appropriate seeds to help ensure that the digital record ultimately matures and flourishes.” The same opinion is also voiced by the industrial sector, which sees more and more of its vital data generated and stored in digital form.
There is currently a very limited amount of related activity in the computer science community. This is probably due to the inherent long-term aspect of the problem when so many short term issues may offer a more rapid pay-off.
The following describes some of the technical challenges and prior art solutions.
The problem that libraries are facing today is well known. For centuries, paper has been used as the medium of choice for storing text and images. As shown in FIG. 1, a “paper” document has the advantages of: being a physical object with permanency, remaining readable with a slow degradation rate, remaining understandable (i.e., its structure is known), and being readily available to the reader.
Today, some of the archived objects (books, newspapers, pictures, etc.) are in danger of destruction. What should be done to protect their contents? They could essentially be copied (on paper or microfilm) or digitized. Digitization through a digital camera or a scanner replaces the image by a bit stream. This offers many advantages. First, the object can be copied repeatedly without degradation; its contents can be sent remotely and can be accessed at will. Finally, the physical space needed to store the object becomes smaller and smaller as storage density increases.
Another argument for digitization is that a high percentage of the data to be preserved is, today, generated directly in digital form. Music CDs and DVD movies are obvious examples. But the same is true of many engineering designs, which were described as blueprints in the past and now exist as digital information in a Computer-Aided-Design system with multimedia, relational database, and virtual reality. And what about all the electronically sent messages that have replaced memos and letters?
FIG. 2 illustrates an electronic conversion 213 of existing paper text 202 and images 204 (e.g. books 200) and recorded media comprising sound 208 (e.g. records) and/or video 210 (e.g. films) to digital data 216. In addition to converted physical or analog sources, data is created directly by electronic processes 214, such as e-mail, word processors, digital cameras, etc.
In the future, the volume of the digital information will increase exponentially and dwarf the volume of the existing paper information. Thus, it makes sense to digitize what needs to be saved of the past, and concentrate on the single problem of preserving digital information for posterity.
FIG. 3 illustrates some of the problems with the storage of information as digital data. A particular storage medium 300, such as a disk, will have a limited physical lifetime. At a later time in the future it is unknown if a machine reader 302 will still be compatible or if the data bit string 304 will remain readable. As technology changes, no guarantees exist for a proper interpretation of bit strings to produce the information they originally represented 306.
FIG. 4 illustrates the steps needed to decode the data.
Suppose we use a computer (identified as M2000) to create and manipulate digital information today. For the purpose of archiving the data for preservation, the digital information is stored on a removable medium, say D2000 (most probably some kind of disk). Suppose that, in 2100, somebody (the client) wants to access the data saved today. What mechanism should exist to be able to satisfy the request?
Four conditions must be met:
1. The particular D2000 disk must be found.
2. D2000 must be physically intact.
3. A machine must be available to read the raw contents (bit stream) of D2000.
4. The bit stream must be correctly interpreted.
Condition 1: this is not a new problem; any digital object must be “published” under a certain name, catalogued, and stored in a safe place; some attributes may also be stored, such as date, author, title, etc. All this is not different from the data maintained by current libraries.
Condition 2: some researchers predict very long lifetimes for certain types of media, but others are much less optimistic. Anyway, if a medium is good for N years, what about preservation for N+1 years? Whatever N is, the problem does not go away. There really seems to be only one solution to this problem: to copy the information periodically to rejuvenate the medium.
Condition 3: machines that are technologically obsolete are hard to keep in working order for a long time. Actually, this condition is more stringent than the previous one. Here also, rejuvenation is needed, moving the information onto the new medium that can be read by the latest generation of devices. Thus, conditions 2 and 3 go hand-in-hand. It must be noted that rejuvenation is not simply an overhead for preservation; it also allows for using the latest storage technology.
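The rejuvenation step described under conditions 2 and 3 can be illustrated with a minimal sketch. It assumes the old and new media are both mounted as ordinary directories, and it uses a SHA-256 comparison to confirm that the copied bit stream is unchanged; the function names and the `D2010` directory are hypothetical and chosen only for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bit stream."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rejuvenate(old_copy: Path, new_medium_dir: Path) -> Path:
    """Copy an archived file to a newer medium and verify the bits match."""
    new_medium_dir.mkdir(parents=True, exist_ok=True)
    new_copy = new_medium_dir / old_copy.name
    shutil.copyfile(old_copy, new_copy)
    if sha256_of(old_copy) != sha256_of(new_copy):
        raise IOError(f"bit stream changed while copying {old_copy.name}")
    return new_copy


def demo() -> bool:
    """Simulate one rejuvenation cycle between two temporary 'media'."""
    with tempfile.TemporaryDirectory() as tmp:
        old_medium = Path(tmp) / "D2000"   # stands in for the aging medium
        old_medium.mkdir()
        source = old_medium / "record.bin"
        source.write_bytes(b"archived bit stream")
        copy = rejuvenate(source, Path(tmp) / "D2010")  # hypothetical newer medium
        return copy.read_bytes() == source.read_bytes()
```

Repeating such a cycle before the medium or its reader becomes obsolete preserves only the bit stream; it does nothing for interpretation, which is the subject of condition 4.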
The three conditions above ensure that a bit stream saved today will be readable, as a bit stream, in the future. But there still remains one additional condition.
Condition 4: one must be able to decode the bit stream to recover the information in all its meaning. This is quite a challenging problem.
Digital objects can vary greatly in complexity. A digital object generally corresponds to what we designate as a file today. It contains either data or an executable program. We identify the following three types:
Type 1. A data object may be readily understandable by a human reader, or it may have to be decoded in some way by the reader or by a machine (assuming one knows the decoding rules). In the latter case, a program must be written in 2100 to decode the data, based on the stored description. A text in ASCII, an image, a digital video clip, a table with ASCII fields, are all examples of simple data objects.
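As a rough illustration of a Type 1 object, the sketch below decodes a table with ASCII fields from a stored description. The description format (field name, start offset, width), the sample records, and all function names are assumptions made for this example, not something the text specifies.

```python
from typing import Dict, List, Tuple

# Hypothetical stored description: (field name, start offset, width).
# Fields are assumed to be contiguous, fixed-width, and ASCII encoded.
FieldSpec = Tuple[str, int, int]

DESCRIPTION: List[FieldSpec] = [
    ("title", 0, 12),
    ("author", 12, 10),
    ("year", 22, 4),
]

# Two 26-byte records stored back to back, as they might sit in the archive.
ARCHIVED = b"Moby Dick   Melville  1851" b"Walden      Thoreau   1854"


def decode_record(raw: bytes, description: List[FieldSpec]) -> Dict[str, str]:
    """Slice one fixed-width record into named fields, dropping padding."""
    text = raw.decode("ascii")
    return {
        name: text[start:start + width].rstrip()
        for name, start, width in description
    }


def decode_table(bit_stream: bytes, description: List[FieldSpec]) -> List[Dict[str, str]]:
    """Split the bit stream into records and decode each one."""
    record_length = sum(width for _, _, width in description)
    return [
        decode_record(bit_stream[i:i + record_length], description)
        for i in range(0, len(bit_stream), record_length)
    ]
```

The decoder here is written entirely from the description, which mirrors the situation of a reader in 2100 who has the bit stream and the decoding rules but no original software.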
Type 2. If the encoding of the data becomes more complex (example: an image compressed by a JPEG algorithm), the best way to describe the algorithm is to store with the data a program that can be used to decode the data.
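The idea of storing a decoding program with the data can be sketched as follows. Run-length encoding is used purely as a simple stand-in for a real algorithm such as JPEG, and the JSON bundle layout and function names are hypothetical. The sketch also assumes that something able to execute the stored decoder still exists at restore time, which is an assumption rather than a given.

```python
import json
from typing import Dict

# Decoder kept as source text so it can be archived next to the payload.
DECODER_SOURCE = '''
def decode(payload: bytes) -> bytes:
    out = bytearray()
    for i in range(0, len(payload), 2):
        count, value = payload[i], payload[i + 1]
        out.extend([value] * count)
    return bytes(out)
'''


def rle_encode(data: bytes) -> bytes:
    """Encode data as (run length, byte value) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out.extend([run, data[i]])
        i += run
    return bytes(out)


def make_archive(data: bytes) -> str:
    """Bundle the encoded payload together with its decoder."""
    return json.dumps({
        "decoder": DECODER_SOURCE,
        "payload": rle_encode(data).hex(),
    })


def restore(archive: str) -> bytes:
    """Recover the data using only what is stored in the bundle."""
    bundle = json.loads(archive)
    namespace: Dict[str, object] = {}
    exec(bundle["decoder"], namespace)  # run the archived decoder source
    return namespace["decode"](bytes.fromhex(bundle["payload"]))
```

Note that `restore` never refers to the encoder: everything needed to interpret the payload travels with it, which is the property the Type 2 case calls for.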
Type 3. Going a step further, we may also be interested in archiving a computer program or system for its own sake. In this case, it is the ability to run that program that must be preserved by the archiving mechanism. Not only must the bit stream that constitutes the program be archived, but we must also make sure that the code can be executed at restore time. If you want to preserve the look and feel of Windows 95 or the Mac, or the user interface of a Computer Aided Design system, the only solution is to archive the whole body of code used during the execution, and enough information on how to run the code at restore time.
Below, we lump together types 1 and 2 under the heading of data archiving: this is because the same technique applies to both types. Type 3 is referred to as program archiving.
Previous Proposals
In Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation, a report to the Council on Library and Information Resources (January 1999), J. Rothenberg sketched out an overall system organization based on encapsulating everything needed to decode the information when needed.
In summary, he proposes to store in an encapsulated object 500:
A. a description of the
