Coordinating persistent status information with multiple...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

active

06496942

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to computer systems.
2. Related Art
Computer storage systems are used to record and retrieve data. It is desirable for the services and data provided by the storage system to be available to the greatest degree possible. Accordingly, some computer storage systems provide a plurality of file servers, with the property that when a first file server fails, a second file server is available to provide the services and data otherwise provided by the first. The second file server provides these services and data by taking over resources otherwise managed by the first file server.
One problem in the known art is that when two file servers each provide backup for the other, it is important that each of the two file servers is able to reliably detect failure of the other and to smoothly handle any required takeover operations. It would be advantageous for this to occur without either of the two file servers interfering with proper operation of the other. This problem is particularly acute in systems where one or both file servers recover from a service interruption.
Accordingly, it would be advantageous to provide a storage system, and a method for operating a storage system, that provide for relatively rapid and reliable takeover among a plurality of independent file servers. This advantage is achieved in an embodiment of the invention in which each file server (a) maintains redundant communication paths to the others, (b) maintains its own state in persistent memory, at least some of which is accessible to the others, and (c) regularly confirms the state of the other file servers.
SUMMARY OF THE INVENTION
The invention provides a storage system, and a method for operating a storage system, that provide for relatively rapid and reliable takeover among a plurality of independent file servers. Each file server maintains a reliable (such as redundant) communication path to the others, preventing any single point of failure in communication among file servers. Each file server maintains its own state in reliable (such as persistent) memory, at least some of which is accessible to the others, providing a method for confirming that its own state information is up to date, and for reconstructing proper state information if not. Each file server regularly confirms the state of the other file servers, and attempts takeover operations only when the other file servers are clearly unable to provide their share of services.
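The takeover condition described above can be illustrated with a minimal sketch. It is not taken from the patent: the class name PeerMonitor, the path names, and the timeout value are all hypothetical, and the sketch assumes that "clearly unable to provide services" can be approximated by silence on every redundant path for longer than a fixed timeout.

```python
from dataclasses import dataclass, field


@dataclass
class PeerMonitor:
    """Tracks when a peer was last heard from on each communication path.

    Hypothetical helper for illustration; names and timeout are assumptions.
    """
    timeout: float                                  # seconds of silence tolerated per path
    last_seen: dict = field(default_factory=dict)   # path name -> last heartbeat time

    def record_heartbeat(self, path: str, now: float) -> None:
        """Note that the peer was heard from on the given path at time `now`."""
        self.last_seen[path] = now

    def peer_clearly_failed(self, paths, now: float) -> bool:
        """Return True only if every known path has been silent too long.

        A single live path is enough to hold off takeover, so a broken
        interconnect alone never triggers one.
        """
        paths = list(paths)
        if not paths:
            return False  # no evidence at all is not evidence of failure
        return all(
            now - self.last_seen.get(p, float("-inf")) > self.timeout
            for p in paths
        )
```

With two paths, for example an interconnect and a shared disk, the peer is treated as failed only after both have gone quiet; a heartbeat arriving on either one keeps the monitor from recommending takeover.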
In a preferred embodiment, each file server sequences messages on the redundant communication paths, so as to allow other file servers to combine the redundant communication paths into a single ordered stream of messages. Each file server maintains its own state in its persistent memory and compares that state with the ordered stream of messages, so as to determine whether other file servers have progressed beyond the file server's own last known state. Each file server uses the shared resources (such as magnetic disks) themselves as part of the redundant communication paths, so as to prevent mutual attempts at takeover of resources when each file server believes the other to have failed.
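As an illustration only, the following sketch shows one way sequenced messages arriving on redundant paths could be combined into a single ordered stream and compared against a locally persisted state. The function names, the (sequence number, payload) message shape, and the use of a plain integer as the persisted state are assumptions made for this example; the text above does not specify a message format.

```python
import heapq


def merge_paths(*paths):
    """Combine redundant paths into one ordered, duplicate-free stream.

    Each path is an iterable of (seq, payload) tuples in ascending seq
    order. The same message may arrive on several paths; it is emitted once.
    """
    last = None
    for seq, payload in heapq.merge(*paths, key=lambda msg: msg[0]):
        if seq != last:          # skip copies already seen on another path
            last = seq
            yield seq, payload


def peer_has_progressed(persisted_seq, stream):
    """Return True if the stream holds a message newer than the persisted state."""
    newest = max((seq for seq, _ in stream), default=persisted_seq)
    return newest > persisted_seq
```

If one path drops a message, the copy carried on another path fills the gap, so the receiver still sees every sequence number once and in order. A server whose persisted sequence number is lower than the newest one in the merged stream can conclude that its peer has moved on since its own last known state.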
In a preferred embodiment, each file server provides a status report to the others when recovering from an error, so as to prevent the possibility of multiple file servers each repeatedly failing and attempting to seize the resources of the others.


