Self-healing computer system storage

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate


Details

C707S793000, C714S052000, C714S764000

active

06530036

ABSTRACT:

BACKGROUND
The invention relates generally to computer system storage and more particularly to mechanisms (methods and devices) for providing access to objects stored on media managed by a failed storage access management process or device.
It is common for organizations to employ large numbers of computers for tasks such as data storage. Typically, some or all of an organization's computers are interconnected to form a network, in which two or more computer systems can exchange information. With the adoption of computer network technology came the desire for increased storage capacity, which in turn led to a need to distribute file systems across networked computers. In general, file systems are distributed by software applications that keep track of files stored across a network. One goal of distributing file systems is to allow a user or application on one computer (or node) in a network to access data or an application stored on another node. Another goal is to make this access transparent with respect to the stored object's physical location.
FIG. 1 shows a computer system employing distributed file system technology in accordance with the prior art. As shown, node-A 100 and node-B 102 are interconnected by communication link 104. Illustrative nodes include specialized or general-purpose workstations and personal computers. An illustrative communication link employs coaxial or twisted-pair cable and the Transmission Control Protocol (TCP). Each of nodes A and B executes a local version of a distributed file system, 106 and 108 respectively. Each distributed file system manages the storage of objects to/from a storage unit (e.g., 110 and 112), each of which may include one or more storage devices. Illustrative storage devices include magnetic disks (fixed, floppy, and removable), magnetic tape units, and optical media such as CD-ROM disks.
One well known distributed file system is the Network File System (NFS®) from Sun Microsystems, Incorporated of Palo Alto, Calif. In NFS®, a server node may make its file system (in part or in whole) shareable through a process known as “exporting.” A client node may gain access to an exported file system through a process known as “mounting.” Exporting entails specifying those file systems, or parts thereof, that are to be made available to other nodes (typically through NFS® map files). Mounting adds exported file systems to the file structure of a client node at a specified location. Together, the processes of exporting and importing define the file system namespace.
For example, consider FIG. 2, in which node 200 has local file system 202 including directories X, Y, and Z, and node 204 has local file system 206 including directories α, β, and γ. If node 204 exports, and node 200 imports, file system 206 (often referred to as cross-mounting), node 200 may have combined system namespace 208. From directory structure 208, a user/application on node 200 may access any data object in remote directories α, β, and γ as if α, β, and γ were local directories such as X, Y, or Z.
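The cross-mounting example above can be sketched as a simple namespace graft. This is only an illustration of the namespace effect (the node and directory names follow FIG. 2; the `mount` function and dictionary representation are hypothetical, and a real distributed file system resolves references over the network rather than copying trees):

```python
# Minimal sketch of cross-mounting per the FIG. 2 example. Each tree is
# modeled as a dictionary of directory names; all names are illustrative.

local_fs_202 = {"X": {}, "Y": {}, "Z": {}}                # node 200's local tree
exported_fs_206 = {"alpha": {}, "beta": {}, "gamma": {}}  # node 204's export

def mount(client_namespace, exported_tree, mount_point=None):
    """Graft an exported tree into a client's namespace.

    With no mount point given, the exported directories appear alongside
    the local ones, yielding the combined namespace 208."""
    combined = dict(client_namespace)
    if mount_point is None:
        combined.update(exported_tree)
    else:
        combined[mount_point] = exported_tree
    return combined

namespace_208 = mount(local_fs_202, exported_fs_206)
print(sorted(namespace_208))  # ['X', 'Y', 'Z', 'alpha', 'beta', 'gamma']
```

Note that the graft only merges names: a reference into `alpha` would still have to be resolved by the exporting node, which is the failure mode the BACKGROUND goes on to describe.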
One significant feature of distributed storage such as that illustrated in FIG. 2 is that all references to an object stored in directory α by a user at node 200 (i.e., through combined file system namespace 208) are resolved by the file system local to and executing on node 204. That is, the translation of an object's reference to the physical location of that object is performed by the file system executing on node 204. Another significant feature of current distributed file systems such as NFS® is that the processes of exporting and importing must be performed for each new directory to be shared. Yet another significant feature of current distributed file systems is that shared storage (e.g., mount points α, β, and γ) appears as discrete volumes or nodes in the file system namespace. In other words, an exported file system (or part thereof) appears as one or more discrete objects in the namespace of each importing node. Thus, system namespace is fragmented across multiple storage nodes. To export a single directory from a node to all other nodes in a computer network, not only must the exporting node's map of objects (or its equivalent) be updated to specify the directory being exported, but every node wanting to import that directory must have its map of objects updated. This may happen frequently, for example when additional storage is added via a new storage node attached to the network, and it requires significant administrative overhead on each such occasion. A corollary of the need to cross-mount shared directories is that if the node exporting a directory (directory α, for example) fails, all access to information stored at that node is lost until the node is restarted.
Thus, it would be beneficial to provide distributed storage mechanisms (methods and devices) that reduce administrative overhead associated with sharing memory, unify shared system namespace, and provide users access to data presumptively maintained by a failed storage access manager.
SUMMARY
In one embodiment the invention provides a method to process a memory access request by a distributed storage management process, where the memory access request is directed to a stored object having a data portion, a data fault tolerance portion, a metadata portion and a metadata fault tolerance portion. If the storage management process responsible for managing the stored object is not available, the method includes reconstructing at least a part of the metadata portion in accordance with metadata fault tolerance information, locating at least a part of the data fault tolerance portion based on the reconstructed metadata, reconstructing a part of the data portion corresponding to the located data fault tolerance part, modifying the reconstructed data part, modifying the data fault tolerance portion in accordance with the modified reconstructed data part, indicating (in the metadata fault tolerance portion) that the data fault tolerance portion has been modified, and modifying a contents object to indicate the metadata fault tolerance portion has been modified, wherein the contents object is associated with a second and distinct storage management process.
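The sequence of steps in this embodiment can be sketched as follows. This is a minimal illustration only: mirror copies stand in for the "fault tolerance portions" (the patent does not prescribe a particular redundancy scheme, and parity or erasure coding would serve equally well), and every name below is hypothetical:

```python
# Hypothetical sketch of the write path when the storage management
# process responsible for a stored object is unavailable. Each portion
# of the object is modeled as a plain Python structure; the "fault
# tolerance" portions are modeled as mirror copies.

def write_via_fault_tolerance(obj, new_data, contents_object):
    """Modify a stored object using only its fault tolerance portions."""
    # 1. Reconstruct (part of) the metadata portion from its fault
    #    tolerance information.
    metadata = dict(obj["metadata_ft"])
    # 2. Locate the data fault tolerance portion via the reconstructed
    #    metadata, and 3. reconstruct the corresponding data part.
    reconstructed_data = list(obj["data_ft"])
    # 4. Modify the reconstructed data part.
    reconstructed_data = list(new_data)
    # 5. Modify the data fault tolerance portion to match.
    obj["data_ft"] = list(reconstructed_data)
    # 6. Indicate, in the metadata fault tolerance portion, that the
    #    data fault tolerance portion has been modified.
    metadata["data_ft_modified"] = True
    obj["metadata_ft"] = metadata
    # 7. Modify the contents object, which is associated with a second
    #    and distinct storage management process, to record the change.
    contents_object[obj["id"]] = {"metadata_ft_modified": True}
    return obj
```

The essential design point the sketch preserves is that the primary data and metadata portions are never touched: only the fault tolerance portions and the separately managed contents object change, so the failed process's state can be reconciled later.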
In another embodiment, the invention provides a method to initialize a storage management process that makes use of the modified data and metadata fault tolerance information. This method includes reconstructing a contents object (associated with the storage management process) based on a contents fault tolerance object, determining a value of an object-modified indicator in the reconstructed contents object (the object-modified indicator associated with a stored object), and reconstructing at least a part of the stored object's data portion based on at least a part of the stored object's data fault tolerance portion if the object-modified indicator has a first value.
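This initialization pass can be sketched in the same spirit, again using mirror copies as the fault tolerance information and hypothetical names throughout:

```python
# Hypothetical sketch of initializing a storage management process using
# the object-modified indicators recorded in its contents object.

def initialize_storage_process(contents_ft, stored_objects):
    """Rebuild state from fault tolerance information on restart."""
    # Reconstruct the process's contents object from its contents
    # fault tolerance object (modeled here as a mirror copy).
    contents = dict(contents_ft)
    # For each stored object whose object-modified indicator has the
    # "modified" value, reconstruct (part of) its data portion from its
    # data fault tolerance portion.
    for obj_id, flags in contents.items():
        if flags.get("metadata_ft_modified"):
            obj = stored_objects[obj_id]
            obj["data"] = list(obj["data_ft"])
    return contents
```

Objects whose indicator was never set are left untouched, so the cost of recovery scales with the number of objects modified while the process was down rather than with the total number of stored objects.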
Methods in accordance with the invention may be stored in any media that is readable and executable by a programmable control device such as a computer processor or custom designed state machine.


