Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-11-03
2002-04-02
Ray, Gopal C. (Department: 2181)
C714S732000, C709S203000
active
06367029
BACKGROUND OF THE INVENTION
The invention relates to a file server system of the kind tolerant to software and hardware failures.
Reliability of a file server system is a measure of the continuity of failure-free service for a particular system and in a particular time interval. Related to this is the mean time between failures (MTBF), which defines how long the system is expected to perform correctly.
Availability measures the system's readiness to serve. One definition of availability is the percentage of time in which the system performs correctly in a given time interval. Unlike reliability, availability depends on system recovery time after a failure. If a system is required to provide high availability for a given failure model, i.e. for a defined set of possible failures, it has to provide fast recovery from the defined set of failures.
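The dependence of availability on recovery time can be illustrated with a small calculation. The following is a minimal sketch, assuming availability is taken as the fraction of an interval in which the system performs correctly; the function name and the figures are hypothetical and do not come from the patent.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of the interval during which the system performed correctly."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("interval must be positive")
    return uptime_hours / total


# Two hypothetical systems observed over the same 1000-hour interval.
# They differ only in how long recovery took after failures.
slow_recovery = availability(uptime_hours=990.0, downtime_hours=10.0)
fast_recovery = availability(uptime_hours=999.0, downtime_hours=1.0)
```

Under these assumed figures the system with the shorter recovery time has the higher availability, even though nothing is said about how often either system fails.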
A number of existing file servers provide an enhanced level of availability for some specific failure models. Such file servers are sometimes referred to in the art as highly-available file servers. The mechanisms which are often used for this are based on some or all of the following:
(1) primary/back-up style of replicated file service;
(2) the use of logging for faster recovery, the log being kept on disk or in non-volatile memory;
(3) checksumming to protect data integrity while the data is stored on disk or while it is being transferred between the server's nodes; and
(4) reliable group communication protocols for intra-server communication.
It is an aim of the present invention to provide a file server system which is tolerant to software and hardware failures.
SUMMARY OF THE INVENTION
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features of the dependent claims may be combined with those of the independent claims as appropriate and in combinations other than those explicitly set out in the claims.
According to a first aspect of the invention there is provided a file server system for storing data objects with respective object identifiers and for servicing requests from remote client systems specifying the object identifier of the requested object. The system comprises a file store for holding stored objects with associated object identifiers. The system further comprises a signature generator for computing an object-specific signature from an object, a signature checker comprising a signature store for holding a previously computed signature for each of the stored objects, and a comparator operable to compare, on the basis of a specified object identifier, a signature retrieved from the signature store with a corresponding signature computed by the signature generator from an object retrieved from the file store.
The location of the signature generator may be associated with the file store and the location of the comparator may be associated with the checker. If the file store is replicated, a signature generator may be provided at each file store replica location. Similarly, if the checker is replicated, a comparator may be provided at each checker replica location.
Signatures computed at the time of object storage are thus archived in the checker for later reference to provide an independent record of the integrity of the data stored in the file store. When an object is retrieved from the file store, a signature for it can be computed by the signature generator and compared with the archived signature for that object. Any difference between the two signatures is thus an indicator of data corruption, which can then be acted upon, for example according to defined single point failure procedures.
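The store-then-verify flow described above might be sketched as follows. This is an illustration only: the patent does not name a signature algorithm, so SHA-256 is an assumption, and the class and function names (FileStore, Checker, store_object, retrieve_object, IntegrityError) are hypothetical.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when a recomputed signature differs from the archived one."""


def compute_signature(data: bytes) -> str:
    # Signature generator. SHA-256 is an assumed choice of algorithm.
    return hashlib.sha256(data).hexdigest()


class FileStore:
    """Holds stored objects keyed by object identifier."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, object_id: str, data: bytes) -> None:
        self._objects[object_id] = data

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]


class Checker:
    """Holds a previously computed signature for each stored object."""

    def __init__(self) -> None:
        self._signatures = {}

    def record(self, object_id: str, signature: str) -> None:
        self._signatures[object_id] = signature

    def matches(self, object_id: str, signature: str) -> bool:
        # Comparator: a missing archived signature is treated as a mismatch.
        return self._signatures.get(object_id) == signature


def store_object(store: FileStore, checker: Checker, object_id: str, data: bytes) -> None:
    store.put(object_id, data)
    checker.record(object_id, compute_signature(data))


def retrieve_object(store: FileStore, checker: Checker, object_id: str) -> bytes:
    data = store.get(object_id)
    if not checker.matches(object_id, compute_signature(data)):
        raise IntegrityError(f"signature mismatch for object {object_id!r}")
    return data
```

Keeping the signature store in a separate component from the file store is what makes the archived signature an independent record: corruption in the file store alone does not alter the value it is compared against.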
In the first aspect of the invention, the system preferably has an operational mode in which a decision is made as to whether to perform a comparison check in respect of an object on the basis of profile information for that object. Profile information may be supplied with the request being serviced, and may also be stored in the file store for each object or for groups of objects, with profile information supplied with the request taking precedence.
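One way to picture the precedence rule is a lookup that consults the request profile first and the stored profile second. This is a sketch under assumptions: the patent does not define the profile format, so the dictionary representation, the "check_signature" key and the default value are all hypothetical.

```python
from typing import Optional


def should_check(
    request_profile: Optional[dict],
    stored_profile: Optional[dict],
    default: bool = True,
) -> bool:
    """Decide whether to run the signature comparison for one request."""
    # The request profile is consulted first, so it takes precedence
    # over the profile stored for the object or its group.
    for profile in (request_profile, stored_profile):
        if profile is not None and "check_signature" in profile:
            return bool(profile["check_signature"])
    # Assumed fallback when neither profile expresses a preference.
    return default
```

For example, `should_check({"check_signature": False}, {"check_signature": True})` returns `False`, because the request-supplied profile overrides the stored one.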
According to a second aspect of the invention there is provided a file server system for storing data objects with respective object identifiers and for servicing requests from remote client systems specifying the object identifier of the requested object. The system is constituted by a plurality of replicable components which may or may not be replicated in a given implementation or at a particular point in time. The replication is preferably manageable dynamically so that the degree of replication of each of the replicable components may vary during operation. Alternatively, the replication levels may be pre-set by the system administrator.
Replication is handled by a replication manager. The replication manager is configured to allow for nodes leaving and joining the system by respectively reducing and increasing the number of replicas of each of the replicable components affected by the node transit. A failure detector is also provided. The failure detector is not replicable, but is preferably distributed over the system nodes by having an instance running on each node. The failure detector has an object register for storing a list of ones of the system objects and is configured to monitor for failure of any of the system objects listed in the object register and, on failure, to report such failure to the replication manager. For each system object on the failure detector list, there may be stored a secondary list of other ones of the system objects that have an interest in the health of that system object. The failure detector is then configured to report failure of that object not only to the replication manager but also to each of the objects on the secondary list. The replication manager preferably records for each of the replicated components a primary of the component concerned and is configured to select a new primary when a node hosting a primary leaves the system.
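The interaction between the failure detector and the replication manager might be sketched as below. This is a simplified illustration, not the patented implementation: the monitored system objects are modelled as node names, the secondary list is modelled as a list of callbacks, and the new primary is simply the next surviving replica. All class and method names are hypothetical.

```python
from typing import Callable, Dict, List, Optional


class ReplicationManager:
    def __init__(self) -> None:
        # Component name -> nodes hosting a replica; the first entry is the primary.
        self.replicas: Dict[str, List[str]] = {}
        self.failures: List[str] = []

    def add_replica(self, component: str, node: str) -> None:
        self.replicas.setdefault(component, []).append(node)

    def primary(self, component: str) -> Optional[str]:
        nodes = self.replicas.get(component, [])
        return nodes[0] if nodes else None

    def on_failure(self, node: str) -> None:
        # Drop the failed node's replicas. If it hosted a primary, the next
        # surviving replica moves to the front and becomes the new primary.
        self.failures.append(node)
        for nodes in self.replicas.values():
            if node in nodes:
                nodes.remove(node)


class FailureDetector:
    def __init__(self, manager: ReplicationManager) -> None:
        self.manager = manager
        # Object register: monitored object -> secondary list of interested parties.
        self.register: Dict[str, List[Callable[[str], None]]] = {}

    def watch(self, obj: str, interested: Optional[List[Callable[[str], None]]] = None) -> None:
        self.register.setdefault(obj, []).extend(interested or [])

    def report_failure(self, obj: str) -> None:
        if obj not in self.register:
            return  # only objects listed in the register are monitored
        listeners = self.register.pop(obj)
        self.manager.on_failure(obj)
        for notify in listeners:
            notify(obj)


manager = ReplicationManager()
manager.add_replica("file_store", "node-a")  # primary
manager.add_replica("file_store", "node-b")  # back-up
detector = FailureDetector(manager)
notified: List[str] = []
detector.watch("node-a", [notified.append])
detector.report_failure("node-a")
```

After the reported failure, `manager.primary("file_store")` is `"node-b"` and the interested party has been told about `"node-a"`. How a real detector decides that an object has failed (heartbeats, timeouts and so on) is outside this sketch.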
For enhanced reliability and availability, the file store is preferably replicated with a replication level of at least two, i.e. with a primary copy and at least one back-up copy. Another system component which may be replicable is a checker. The checker has a signature store for holding object-specific signatures computed for each of the objects stored in the file store.
A logger may also be provided to allow faster recovery in respect of nodes rejoining the system, for example after failure. The logger may also be replicated. The logger serves to maintain a log of recent system activity in non-volatile storage which can be accessed when a node is rejoining the system.
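A rejoining node can use such a log to catch up on the activity it missed rather than copying the whole file store. The following is a minimal sketch, assuming an append-only file of JSON records with sequence numbers; the record format, the Logger interface and the file name are hypothetical, and the sequence counter is not restored from an existing log file.

```python
import json
import os
import tempfile


class Logger:
    """Append-only activity log kept in a file (standing in for non-volatile storage)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.next_seq = 1  # simplification: not recovered from an existing file

    def append(self, operation: dict) -> int:
        seq = self.next_seq
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"seq": seq, "op": operation}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the record to stable storage
        self.next_seq += 1
        return seq

    def entries_since(self, last_seen_seq: int) -> list:
        """Operations a rejoining node has not yet applied."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
        return [r["op"] for r in records if r["seq"] > last_seen_seq]


with tempfile.TemporaryDirectory() as tmp:
    logger = Logger(os.path.join(tmp, "activity.log"))
    logger.append({"write": "obj-1"})
    logger.append({"write": "obj-2"})
    logger.append({"delete": "obj-1"})
    # A node that failed after applying record 1 replays only the later records.
    missed = logger.entries_since(1)
```

A production log would also need truncation of old records and protection against partially written lines, neither of which is shown here.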
In the preferred embodiment, the file server system is located over a plurality of nodes, typically computers or other hardware elements. For operation, the file server system is connected to a network to which is also connected a plurality of client apparatuses that may wish to access the data stored in the file server system. The nodes of the file server system act as hosts for software components of the file server system. Several of the software components can be replicated. The replicable software components include: the system file store, a checker and a logger. The functions of these components are described further below. A replicated component has one primary copy and one or more back-up copies. Among the replicas of a given component, the primary may change through a process referred to as primary re-election, but there is only ever one primary at any one time for a given component. Generally it is desirable for reliability that replica copies of a given replicated component are each located at different nodes, or at least that the primary and one of the back-ups are located on different nodes. Thus, a given node may be host to the primaries of several different software components and to several back-ups. Location and handling of replica copies of a given replicable component is under the control of a replication manager which is a (non-replicable) software component
Mayhead Martin
Parrington Graham
Radley James
Starovic Gradimir
O'Melveny & Myers LLP
Ray Gopal C.
Sun Microsystems Inc.