Fault tolerant data storage system

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

active

06732289

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to data storage systems, and in particular, to a fault tolerant data storage system.
2. Description of the Related Art
Multiple storage controllers may be used to address the problem of storing and retrieving data when one storage controller fails.
FIG. 1 depicts a simplified representation of a conventional data storage system 100 with redundant storage controllers. The redundant storage controllers 102-1 and 102-2 are coupled between a processor 110 (e.g., server) and one or more storage devices 104-1 through 104-N (e.g., disk drives). One storage controller serves as a primary controller and the other controller serves as a secondary controller. In a normal mode, the processor 110 accesses one or more of the storage devices via the primary controller 102-1. If the primary controller 102-1 is detected to have failed by the processor 110, the secondary controller 102-2 becomes active and assumes the interfacing operations between the storage devices and the processor 110. When the controller 102-1 recovers, it may take over the storage devices again from the controller 102-2.
FIG. 2 depicts a simplified representation of another conventional storage system 200 with redundant storage controllers. In this example, a heartbeat mechanism 206 is provided between the redundant storage controllers 202-1 and 202-2 so that each storage controller can send a heartbeat signal to the other storage controller to periodically indicate that it is functioning properly. At least in some implementations, each storage controller determines if the other storage controller is operating normally. If one of the storage controllers determines that the other storage controller has failed, it will initiate the process of taking over the disk drives serviced by the failing storage controller.
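The heartbeat check described for FIG. 2 can be illustrated with a minimal sketch. It is not taken from the patent: the class name `HeartbeatMonitor`, the `timeout_s` parameter, and the injectable `clock` are assumptions made for illustration, and a real controller would exchange heartbeats over dedicated hardware or a bus rather than through method calls.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer controller was last heard from (hypothetical sketch)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumed silence threshold, in seconds
        self.clock = clock              # injectable so the example is deterministic
        self.last_seen = clock()

    def record_heartbeat(self):
        # Called whenever a heartbeat signal arrives from the peer.
        self.last_seen = self.clock()

    def peer_presumed_failed(self):
        # The peer is only *presumed* failed: silence longer than the timeout.
        return self.clock() - self.last_seen > self.timeout_s


# Usage with a fake clock instead of real time.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: now[0])

now[0] = 2.0
monitor.record_heartbeat()

now[0] = 4.0
alive_check = monitor.peer_presumed_failed()    # 2.0 s of silence

now[0] = 6.0
failed_check = monitor.peer_presumed_failed()   # 4.0 s of silence
```

A timeout of this kind reports only that the peer has been silent; it cannot tell a controller that is completely down from one that has merely hung.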
These conventional redundant storage controller systems suffer from various disadvantages. For example, the state of the failing controller may be unpredictable, i.e., the failing controller may not be completely down or completely up. Consequently, it is possible that sometime after a surviving controller takes over disk drives that were being serviced by a failing controller, the failing controller, not realizing that it has failed, may become active (if it had hung) and start executing requests in its queue. If one controller repeats operations that have already been executed by the other controller, data may become corrupted and may not be trusted. Additionally, when the surviving controller takes over the disk drives, there may be some operations that have already been executed by the failing controller on the disk drives but have not yet been committed to the processor. As a result, the surviving controller may attempt to perform operations that have already been executed by the failing controller. As previously mentioned, data may be corrupted if the surviving controller repeats the operations that have already been executed by the failing controller.
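The first hazard, a hung controller resuming and draining its queue after takeover, can be shown with a toy model. This is a sketch under stated assumptions, not the patent's design: the disk is a plain dictionary, `write` is a hypothetical helper, and block 7 with values "v1" and "v2" is invented for the example.

```python
# Toy disk: block number -> data. Nothing here checks who is writing.
disk = {}


def write(block, data):
    """Unconditionally write data to a block (no ownership check)."""
    disk[block] = data


# Controller A queued this write and then hung before executing it.
queue_a = [(7, "v1")]

# Controller B takes over. The processor reissues the write through B
# and later updates the same block with newer data.
write(7, "v1")
write(7, "v2")

# Controller A wakes up, unaware that it was failed over, and drains its queue.
for block, data in queue_a:
    write(block, data)

final_value = disk[7]   # the newer "v2" has been overwritten by the stale "v1"
```

Because nothing in this model stops the former owner from writing, the stale request silently replaces newer data.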
Thus, there is a need to provide a system which addresses problems associated with failing over a storage device from one storage controller to another storage controller.
SUMMARY OF THE INVENTION
According to one aspect of the invention, a fault tolerant data storage system for effectively failing over a storage device from one storage controller to another storage controller is provided. The storage system generally includes at least two storage controllers for coupling to a processor and at least one storage device. A failover manager is in communication with the storage controllers and the storage device. The failover manager assists failing over of the storage device by allowing only one of the storage controllers having ownership to access the storage device at any one time. The failover manager maintains a list of recent requests that have been committed to the storage device so that it can be used during failover to assist the surviving controller to complete the uncommitted requests properly.
In one embodiment, the failover manager is embodied in the form of a software task executed by a processor included in a disk controller of a disk drive. In an alternative embodiment, the software task is executed by a processor included in a separate electronic unit coupled between storage controllers and one or more disk drives.
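The two roles of the failover manager summarized above, enforcing single ownership and keeping a list of recently committed requests, might be sketched as follows. This is an illustrative model only: the class and method names, the use of request identifiers, the `history_size` bound, and the in-memory `storage` dictionary are all assumptions, since this portion of the patent does not specify an interface.

```python
from collections import deque


class FailoverManager:
    """Hypothetical sketch: gates access by ownership and logs committed requests."""

    def __init__(self, owner, history_size=64):
        self.owner = owner                            # the one controller allowed access
        self.committed = deque(maxlen=history_size)   # recent committed request ids
        self.storage = {}                             # stand-in for the storage device

    def write(self, controller, request_id, block, data):
        if controller != self.owner:
            return "rejected"                # non-owners cannot touch the device
        if request_id in self.committed:
            return "already-committed"       # do not repeat a committed request
        self.storage[block] = data
        self.committed.append(request_id)
        return "committed"

    def fail_over(self, new_owner):
        # Transfer ownership and hand the survivor the committed-request list.
        self.owner = new_owner
        return list(self.committed)


fm = FailoverManager(owner="A")
r1 = fm.write("A", "req-1", 7, "v1")          # A commits req-1, then fails

committed = fm.fail_over("B")
pending = ["req-1", "req-2"]                  # requests B believes are outstanding
to_run = [r for r in pending if r not in committed]

r2 = fm.write("A", "req-3", 7, "stale")       # former owner wakes up and tries to write
r3 = fm.write("B", "req-2", 8, "v2")          # survivor completes the uncommitted request
```

In this model the surviving controller skips `req-1` because the list shows it was already committed, and the former owner's late write is refused because it no longer holds ownership.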


