Mechanism for replicating and maintaining files in a...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000

active

06636878

ABSTRACT:

FIELD OF THE INVENTION
This invention relates generally to computer systems, and more particularly to a mechanism for replicating and maintaining files in a space-efficient manner.
BACKGROUND
In a computer system, files are used for many purposes, such as to organize information, to store data, or to contain applications or a list of commands. The term “file” as used herein refers broadly to any logical entity that can be accessed, used or manipulated as a container by entities such as system users, applications, and other resources. While a file can be associated with several properties, including but not limited to, a filename, a file descriptor, and a set of blocks that contain the contents or data of the file, it should be noted that these are just properties of the file and not the file itself. Put another way, the properties are just manifestations of the file, while the file itself is the logical entity that is being manipulated.
When a file is copied on a computer system, a duplicate of the file is created. The duplicate typically has a different file name, but initially it will have the same contents as the original. The contents of the duplicate file are stored on previously unused space in the computer system. For example, if a file on a computer hard drive with a size of 1 megabyte is copied to a new file, the latter will occupy an additional 1 megabyte of storage space on the hard drive.
Replicating large files can result in an inefficient use of system resources. For example, when a copy of a file is later modified, only a small portion of the contents of the copy may differ from the original. However, because both the original and the copy occupy their own space on the system, much of the space occupied by the copy is needlessly duplicated.
For example, consider a large word processing file. The author of the document may want to save different versions as it is being written or edited, but most of the contents of the file may remain exactly the same. As new versions are created and modified, only the data blocks for each version that are associated with the modified content will be changed, leaving unmodified the remainder of the data blocks for the file. As a result, most of the data storage blocks associated with the different versions of the file are exactly the same, yet for each separate version of the file, a separate copy of each of those unchanged data blocks will exist. As the size of the file increases and/or the number of copies increases, the number of duplicated data blocks increases, resulting in an inefficient use of the system's storage capacity.
Note that it is important to distinguish copying a file from another form of file manipulation called linking. A link can be created between two file names such that both names refer to the same file. For example, in the Unix operating system, the link command can be used to associate a new file name with an existing file name and the contents of that existing file. The result is that there is still only one set of data blocks (or content), but now the file can be referred to by both the original and new file name. If the content of the file is changed, then that change is reflected in the file regardless of which linked file name is used to refer to the file. Thus, linking is different from copying in that copying creates multiple, independent files, whereas with linking there is only one file that has multiple names instead of two distinct files.
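The difference between linking and copying can be observed directly. The following is a minimal sketch, assuming a file system that supports hard links; the file names and the helper `link_vs_copy` are hypothetical and chosen only for illustration. It edits a file through its original name and then reads it back through a linked name and through a copy.

```python
import os
import shutil
import tempfile


def link_vs_copy():
    """Edit a file, then read it back through a hard link and through a copy."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")  # hypothetical names
        linked = os.path.join(tmp, "linked.txt")
        copied = os.path.join(tmp, "copied.txt")

        with open(original, "w") as f:
            f.write("version 1")

        os.link(original, linked)          # a second name for the same file
        shutil.copyfile(original, copied)  # an independent file with its own data

        # Change the contents through the original name.
        with open(original, "w") as f:
            f.write("version 2")

        same_file = os.path.samefile(original, linked)
        with open(linked) as f:
            via_link = f.read()
        with open(copied) as f:
            via_copy = f.read()
        return same_file, via_link, via_copy


result = link_vs_copy()
```

The linked name reflects the change because both names refer to one file, while the copy keeps the earlier contents because it is a separate file.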
One approach for creating copies of data without duplicating the information that remains the same between the original data and a copy of that data is the “copy-on-write” (C-O-W) technique. The basic idea of copy-on-write is that an original and a copy share the portions of the data that remain the same between the original and the copy. As data is changed in either the original or the copy, new data portions are created to reflect the changes, and such data portions are now specific to the original or the copy. However, data portions that remain the same between the original and the copy continue to be shared.
For example, some versions of the Unix operating system, such as Solaris by Sun Microsystems and Mach by Carnegie Mellon University, utilize copy-on-write memory. With this approach, two processes can share memory blocks in the computer system's memory until one process writes to a particular memory block. At that point, the process that writes to the particular memory block gets its own private copy of that memory block, and the original memory block is no longer shared between the two processes.
FIGS. 1A, 1B, and 1C provide a simple illustration of the sharing of memory blocks between two processes. The system illustrated in FIGS. 1A and 1B has a memory 100 that is comprised of a plurality of memory blocks that store data or information. For purposes of explanation, only memory blocks 110, 120, 130, 140, 150, and 160 are shown. In FIG. 1A, memory blocks 110, 120, and 130 are associated with a process 102. Also in FIG. 1A, memory blocks 110, 120, and 130 are associated with a process 104, which initially is using the same information as process 102.
If process 104 then makes a change to some of the information that is stored in memory block 130, the information in memory block 130 is copied to an unused memory block, such as memory block 140. Then memory block 140 is modified to reflect the change in the information.
FIG. 1B shows the result of this change. Process 104 is now associated with memory blocks 110, 120, and 140, but process 104 is no longer associated with memory block 130. Meanwhile, process 102 remains associated with memory blocks 110, 120, and 130. Thus, in FIG. 1B, memory blocks 110 and 120 are shared by processes 102 and 104, since both those processes are using the same information stored in those memory blocks. However, because the information in memory block 130 that was originally shared by processes 102 and 104 is now different for the two processes, process 102 remains associated with memory block 130 while process 104 is now associated with memory block 140.
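The transition from FIG. 1A to FIG. 1B can be sketched in a few lines of code. This is a toy model for illustration only, not the mechanism used by any particular operating system: the class `Memory`, its methods, and the block contents `"a"`, `"b"`, `"c"` are assumptions, while the block and process numbers are taken from the figures.

```python
class Memory:
    """Toy model of memory 100: numbered blocks plus per-process block lists."""

    def __init__(self, block_ids):
        self.data = {}               # block id -> contents (blocks in use only)
        self.free = list(block_ids)  # unused block ids, in allocation order
        self.maps = {}               # process id -> ordered list of block ids

    def load(self, pid, contents):
        """Give a process fresh blocks holding the given contents."""
        blocks = []
        for item in contents:
            block = self.free.pop(0)
            self.data[block] = item
            blocks.append(block)
        self.maps[pid] = blocks

    def share(self, src_pid, dst_pid):
        """Associate dst_pid with the same blocks as src_pid; nothing is copied."""
        self.maps[dst_pid] = list(self.maps[src_pid])

    def write(self, pid, index, value):
        """Write to a process's index-th block, copying it first if it is shared."""
        block = self.maps[pid][index]
        shared = any(block in blocks
                     for other, blocks in self.maps.items() if other != pid)
        if shared:
            new_block = self.free.pop(0)
            self.data[new_block] = self.data[block]  # copy, then modify the copy
            self.maps[pid][index] = new_block
            block = new_block
        self.data[block] = value


memory = Memory([110, 120, 130, 140, 150, 160])
memory.load(102, ["a", "b", "c"])  # FIG. 1A: process 102 uses 110, 120, 130
memory.share(102, 104)             # process 104 starts with the same blocks
memory.write(104, 2, "c'")         # process 104 changes the data held in 130
```

After the write, `memory.maps` associates process 102 with blocks 110, 120, and 130 and process 104 with blocks 110, 120, and 140, matching FIG. 1B, and block 130 still holds its original contents.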
FIG. 1C shows what would happen if no sharing of the memory blocks by the processes were allowed. In this case, process 102 is associated with memory blocks 110, 120, and 130 while process 104 is associated with memory blocks 140, 150, and 160. After the change in the information in memory blocks 130 and 160 between the two processes, the contents of memory block 130 and memory block 160 will be different. The contents of memory blocks 110 and 140 remain the same, and similarly the contents of memory blocks 120 and 150 remain the same. Thus, if memory blocks are not shared, the system will be storing exact duplicates of the contents of memory blocks 110 and 120 in memory blocks 140 and 150, respectively, which is an inefficient use of the system's memory capacity.
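The cost of not sharing can be counted from the block associations in the figures. The short sketch below uses the layouts of FIG. 1B and FIG. 1C; the dictionary names and the helper `blocks_in_use` are hypothetical.

```python
# Block associations taken from FIG. 1B (sharing) and FIG. 1C (no sharing).
shared_layout = {102: [110, 120, 130], 104: [110, 120, 140]}
unshared_layout = {102: [110, 120, 130], 104: [140, 150, 160]}


def blocks_in_use(layout):
    """Count the distinct memory blocks referenced by any process."""
    return len({block for blocks in layout.values() for block in blocks})


saved = blocks_in_use(unshared_layout) - blocks_in_use(shared_layout)
```

With sharing, four blocks are in use; without it, six are. The two blocks saved correspond to the duplicated contents of memory blocks 110 and 120.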
Another implementation of copy-on-write can be found in some file systems that use “snapshots” to provide a backup feature that allows users to retrieve older versions of a file. For example, Network Appliance offers a file system called “write anywhere file layout” (WAFL), and the Veritas file system (VxFS) contains a similar feature. With this type of backup feature, a snapshot is taken of the entire file system at a given point in time, effectively freezing the state of the files at that moment. If any changes are made to the files on the file system after the snapshot is taken, new data blocks are created and modified to reflect the changes to the contents of each of the changed files. Unchanged data blocks, however, continue to be shared between the snapshot and the current working versions of the files.
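The snapshot behavior described above can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions and does not represent how WAFL or VxFS is actually implemented: the class `SnapshotFS`, its methods, and the file name `"report"` are hypothetical, and every write allocates a new block even when no snapshot exists.

```python
class SnapshotFS:
    """Toy file system: each file is a list of block ids into one block store."""

    def __init__(self):
        self.blocks = {}     # block id -> contents
        self.files = {}      # file name -> list of block ids (working version)
        self.snapshots = []  # each snapshot is a frozen copy of the file table
        self._next_id = 0

    def _new_block(self, contents):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = contents
        return block_id

    def create(self, name, contents):
        self.files[name] = [self._new_block(c) for c in contents]

    def snapshot(self):
        """Record the current block lists; no data blocks are copied."""
        self.snapshots.append(
            {name: tuple(ids) for name, ids in self.files.items()})
        return len(self.snapshots) - 1

    def write(self, name, index, contents):
        """Write to a new block so that snapshot blocks are never overwritten."""
        self.files[name][index] = self._new_block(contents)

    def read(self, name, snapshot=None):
        table = self.files if snapshot is None else self.snapshots[snapshot]
        return [self.blocks[i] for i in table[name]]


fs = SnapshotFS()
fs.create("report", ["intro", "body", "end"])
snap = fs.snapshot()
fs.write("report", 1, "new body")

current = fs.read("report")    # working version sees the change
frozen = fs.read("report", snap)  # snapshot still sees the old contents
```

Only one new block is allocated for the change, so the store holds four blocks in total; the first and last blocks are referenced by both the snapshot and the working version.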
With this backup approach, any current versions (or working versions) of the files being used following the snapshot
