Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1999-08-17
2002-09-10
Beausoleil, Robert (Department: 2184)
C714S006130, C714S770000, C707S793000
active
06449731
ABSTRACT:
BACKGROUND
The invention relates generally to computer system storage and more particularly to mechanisms (methods and devices) for providing distributed computer system storage having proxy backup/stand-in capability.
It is common for organizations to employ large numbers of computers for tasks such as data storage. Typically, some or all of an organization's computers may be interconnected to form a network whereby two or more computer systems are interconnected so that they are capable of exchanging information. With the adoption of computer network technology came the desire for increased storage capacity. Increased storage capacity, in turn, led to a need to distribute file systems across networked computers. In general, distribution of file systems is done by software applications that keep track of files stored across a network. One goal of distributing file systems is to allow a user/application of one computer (or node) in a computer network to access data or an application stored on another node in the computer network. Another goal of distributing file systems is to make this access transparent with respect to the stored object's physical location.
FIG. 1 shows a computer system employing distributed file system technology in accordance with the prior art. As shown, node-A 100 and node-B 102 are interconnected by communication link 104. Illustrative nodes include specialized or general purpose workstations and personal computers. An illustrative communication link employs coaxial or twisted pair cable and the Transmission Control Protocol (TCP). Each of nodes A and B executes a local version of a distributed file system, 106 and 108 respectively. Each distributed file system manages the storage of objects to/from a storage unit (e.g., 110 and 112), each of which may include one or more storage devices. Illustrative storage devices include magnetic disks (fixed, floppy, and removable) and optical media such as CD-ROM disks.
One well known distributed file system is the Network File System (NFS®) from Sun Microsystems, Incorporated of Palo Alto, Calif. In NFS, a server node may make its file system (in part or in whole) shareable through a process known as “exporting.” A client node may gain access to an exported file system through a process known as “mounting.” Exporting entails specifying those file systems, or parts thereof, that are to be made available to other nodes (typically through NFS map files). Mounting adds exported file systems to the file structure of a client node at a specified location. Together, the processes of exporting and importing define the file system namespace.
For example, consider FIG. 2, in which node 200 has local file system 202 including directories X, Y, and Z, and node 204 has local file system 206 including directories α, β, and γ. If node 204 exports, and node 200 imports, file system 206 (often referred to as cross-mounting), node 200 may have combined system namespace 208. From directory structure 208, a user/application on node 200 may access any data object in remote directories α, β, and γ as if α, β, and γ were local directories such as X or Y.
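The cross-mounting arrangement of FIG. 2 can be illustrated with a minimal sketch. It is not NFS code and not the patent's implementation: the `Node` class, its `export`, `mount`, `namespace`, and `resolve` methods, the node names, and the file names are all hypothetical, and the directory names `alpha`, `beta`, and `gamma` stand in for the three remote directories. The sketch only shows that a mounted directory looks local in the combined namespace while references to it are still resolved by the exporting node.

```python
class Node:
    """Hypothetical node with local directories, an export list, and a mount table."""

    def __init__(self, name, local_dirs):
        self.name = name
        self.local_dirs = set(local_dirs)
        self.exports = set()   # directories this node has made shareable
        self.mounts = {}       # mount point -> (remote node, remote directory)

    def export(self, directory):
        # Only a directory that exists locally can be exported.
        if directory not in self.local_dirs:
            raise ValueError(f"{directory} is not local to {self.name}")
        self.exports.add(directory)

    def mount(self, remote, directory, mount_point):
        # Mounting adds an exported remote directory to this node's namespace.
        if directory not in remote.exports:
            raise PermissionError(f"{remote.name} does not export {directory}")
        self.mounts[mount_point] = (remote, directory)

    def namespace(self):
        # Combined namespace: local directories plus mount points.
        return sorted(self.local_dirs | set(self.mounts))

    def resolve(self, path):
        # Return (name of the node that resolves the path, path on that node).
        stripped = path.strip("/")
        top, _, rest = stripped.partition("/")
        if top in self.mounts:
            remote, directory = self.mounts[top]
            return remote.name, "/" + directory + ("/" + rest if rest else "")
        if top in self.local_dirs:
            return self.name, "/" + stripped
        raise FileNotFoundError(path)


node200 = Node("node-200", ["X", "Y", "Z"])
node204 = Node("node-204", ["alpha", "beta", "gamma"])
for d in ("alpha", "beta", "gamma"):
    node204.export(d)              # one export step per shared directory
    node200.mount(node204, d, d)   # one mount step per shared directory

print(node200.namespace())
print(node200.resolve("/alpha/report.txt"))
print(node200.resolve("/X/notes.txt"))
```

Note that every shared directory needs its own export and mount step, and each mount point remains a separate entry in the importing node's table.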
One significant feature of distributed storage such as that illustrated in FIG. 2 is that all references to an object stored in directory α by a user at node 200 (i.e., through combined file system namespace 208) are resolved by the file system local to and executing on node 204. That is, the translation of an object's reference to the physical location of that object is performed by the file system executing on node 204. Another significant feature of current distributed file systems such as NFS® is that the processes of exporting and importing must be performed for each new directory to be shared. Yet another significant feature of current distributed file systems is that shared storage (e.g., mount points α, β, and γ) appears as discrete volumes or nodes in the file system namespace. In other words, an exported file system (or part thereof) appears as a discrete object of storage in the namespace of each importing node. Thus, the system namespace is fragmented across multiple storage nodes. To export a single directory from a node to all other nodes in a computer network, not only must the exporting node's map of objects (or its equivalent) be updated to specify the directory being exported, but every node wanting to import that directory must have its map of objects updated. This may happen frequently, for example when additional storage is added by attaching a new storage node to the network, and each such occurrence requires significant administrative overhead.
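As a rough illustration of that overhead, the following sketch counts map updates under a simplifying assumption that is not stated in the source: sharing one directory costs exactly one update on the exporting node plus one update on each importing node. The function names and the example figures (10 nodes, 3 directories each) are hypothetical.

```python
def map_updates_for_export(importing_nodes: int) -> int:
    """Updates to share one directory: one on the exporter, one per importer."""
    if importing_nodes < 0:
        raise ValueError("importing_nodes must be non-negative")
    return 1 + importing_nodes


def total_updates(node_count: int, dirs_per_node: int) -> int:
    """Updates when every node shares dirs_per_node directories with all others."""
    per_directory = map_updates_for_export(node_count - 1)
    return node_count * dirs_per_node * per_directory


# Assumed example: 10 nodes, each sharing 3 directories with the other 9.
print(map_updates_for_export(9))
print(total_updates(10, 3))
```

Under this assumption the per-directory cost grows linearly with the number of nodes, and the total for a fully cross-mounted network grows with the square of the node count.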
Thus, it would be beneficial to provide a distributed storage mechanism that reduces administrative overhead associated with sharing memory and unifies the shared system namespace.
SUMMARY
In one embodiment the invention provides a method to manage storage of an object in a computer system having a first and a second storage management process, wherein the stored object includes a data portion, a metadata portion and a fault tolerance data portion. The method includes receiving a memory access request from a client process, routing the memory access request to the first storage management process, determining that the first storage management process has failed, routing the memory access request to the second storage management process (the second storage management process having access to the fault tolerance data portion), receiving a result from the second storage management process, and returning at least a portion of the result to the client process. The method may also include reconstructing at least a portion of the metadata portion, identifying the fault tolerance data portion based on the reconstructed portion of the metadata portion, modifying the fault tolerance data portion in accordance with the memory access request, and storing the modified fault tolerance data. Additionally, a record (journal) of the changes made to the fault tolerance data portion may be maintained by the second storage management process and transmitted to the first storage management process when it becomes operational. Methods in accordance with the invention may be stored in any medium that is readable and executable by a computer system.
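The request flow described in this summary can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: the class names (`PrimaryManager`, `ProxyManager`, `Router`), the `alive` flag used to simulate failure, and the use of a plain mirror copy as the fault tolerance data are all hypothetical, and the metadata reconstruction step is omitted. The sketch covers routing to the first process, falling back to the second on failure, journaling changes, and replaying the journal on recovery.

```python
class StorageManagerDown(Exception):
    """Raised when a storage management process cannot service a request."""


class PrimaryManager:
    """Stands in for the first storage management process."""

    def __init__(self):
        self.data = {}
        self.alive = True  # toggled by hand to simulate failure and recovery

    def handle(self, op, key, value=None):
        if not self.alive:
            raise StorageManagerDown("primary unavailable")
        if op == "write":
            self.data[key] = value
            return value
        return self.data.get(key)

    def replay(self, journal):
        # Apply the changes the proxy recorded while this process was down.
        for key, value in journal:
            self.data[key] = value


class ProxyManager:
    """Stands in for the second process; holds the fault tolerance data."""

    def __init__(self):
        self.fault_tolerance_data = {}  # assumption: a simple mirror copy
        self.journal = []               # changes made during an outage

    def mirror(self, key, value):
        self.fault_tolerance_data[key] = value

    def handle(self, op, key, value=None):
        if op == "write":
            self.fault_tolerance_data[key] = value
            self.journal.append((key, value))
            return value
        return self.fault_tolerance_data.get(key)

    def hand_back(self, primary):
        # Transmit the journal once the primary is operational again.
        primary.replay(self.journal)
        self.journal.clear()


class Router:
    """Routes client requests, falling back to the proxy on failure."""

    def __init__(self, primary, proxy):
        self.primary = primary
        self.proxy = proxy

    def request(self, op, key, value=None):
        try:
            result = self.primary.handle(op, key, value)
        except StorageManagerDown:
            return self.proxy.handle(op, key, value)
        if op == "write":
            self.proxy.mirror(key, value)  # keep fault tolerance data current
        return result


primary, proxy = PrimaryManager(), ProxyManager()
router = Router(primary, proxy)
router.request("write", "obj", "v1")
primary.alive = False                     # simulate failure of the first process
router.request("write", "obj", "v2")      # serviced by the proxy and journaled
during_outage = router.request("read", "obj")
primary.alive = True                      # first process becomes operational
proxy.hand_back(primary)
after_recovery = router.request("read", "obj")
print(during_outage, after_recovery)
```

The client sees the same `request` interface throughout; only the router knows which process serviced a given call.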
Beausoleil Robert
Chu Gabriel
Tricord Systems, Inc.
Wong, Cabello, Lutsch, Rutherford & Brucculeri P.C.