Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2001-01-23
2004-07-27
Beausoliel, Robert (Department: 2184)
C714S026000
active
06769071
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to computer storage, and more particularly to failover techniques for multi-path storage systems.
2. Description of the Related Art
Computer storage systems, such as disk drive systems, have grown enormously in both size and sophistication in recent years. These systems typically include many large storage units controlled by a complex multi-tasking controller. Large scale computer storage systems generally can receive commands from a large number of host computers and can control a large number of mass storage elements, each capable of storing in excess of several gigabytes of data.
FIG. 1 is an illustration showing a prior art computer storage system 100. The prior art computer storage system 100 includes computer systems 102, 104, and 106, and workstations 108 and 110 all coupled to a local area network 112. The computer systems 102, 104, and 106 are also in communication with storage devices 114 via a storage area network 116. Generally, the computer systems 102, 104, and 106 can be any computer operated by users, such as PCs, Macintosh, or Sun Workstations. The storage devices can be any device capable of providing mass electronic storage, such as disk drives, tape libraries, CDs, or RAID systems.
Often, the storage area network 116 is an Arbitrated Loop; however, the storage area network 116 can be any storage area network capable of providing communication between the computer systems 102, 104, and 106 and the computer storage devices 114. Another typical storage area network is a Fabric/switched storage area network, wherein the storage area network 116 comprises several nodes, each capable of forwarding data packets to a requested destination.
In use, the computer systems 102, 104, and 106 transmit data to the storage devices 114 via the storage area network 116. The storage devices 114 then record the transmitted data on a recording medium using whatever apparatus is appropriate for the particular medium being used. Generally, the conventional computer storage system 100 operates satisfactorily until a failure occurs, which often results in data loss that can have catastrophic side effects.
It is more than an inconvenience to the user when the computer storage system 100 goes “down” or off-line, even when the problem can be corrected relatively quickly, such as within hours. The resulting lost time adversely affects not only system throughput performance, but also user application performance. Further, the user is often not concerned whether it is a physical disk drive or its controller that fails; it is the inconvenience and failure of the system as a whole that causes user difficulties.
As the systems grow in complexity, it is increasingly less desirable to have interrupting failures at either the device or at the controller level. As a result, efforts have been made to make systems more reliable and increase the mean time between failures. For example, redundancy in various levels has been used as a popular method to increase reliability. Redundancy has been applied in storage devices, power supplies, servers, and in host controllers to increase reliability.
A problem with incorporating redundancy into a computer system is that redundancy often causes additional problems with system performance and usability. For example, if redundancy in the form of multiple drive paths to a single device is used in an attempt to increase the reliability of a conventional system, the operating system is often confused into believing two separate physical drives are available to receive storage data, when only one physical drive is actually available.
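The aliasing problem described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each discovered path reports an identifier that is unique to the physical device behind it; the `DataPath` class, its `device_id` field, and the names `adapter0`, `adapter1`, and `disk-A` are hypothetical and do not come from the patent text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPath:
    name: str        # hypothetical path label, e.g. one per host adapter
    device_id: str   # assumed unique identifier reported by the device


def group_paths_by_device(paths):
    """Collapse paths that reach the same physical device into one entry."""
    devices = defaultdict(list)
    for path in paths:
        devices[path.device_id].append(path.name)
    return dict(devices)


# Two paths reach the same drive; a naive scan counts one drive per path.
discovered = [
    DataPath("adapter0", "disk-A"),
    DataPath("adapter1", "disk-A"),
]
naive_drive_count = len(discovered)
logical_devices = group_paths_by_device(discovered)
```

Counting one drive per path gives two drives here, while grouping by the assumed identifier yields a single device reachable over two paths.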
In view of the foregoing, there is a need for a method that can continue to provide access to I/O devices when a data path to the I/O device experiences a failure. The method should have the capability to automatically detect the failure and act to address the failure in a manner that is transparent to the user. The method should be capable of increasing system reliability while not interfering with the productivity of the user.
SUMMARY OF THE INVENTION
Broadly speaking, the present invention fills these needs by providing an intelligent failover method, which automatically detects failure and recovers by rerouting I/O requests via an alternate data path. In one embodiment, a method for intelligent failover in a multi-path computer system is disclosed. Initially, a plurality of data paths to a computer input/output (I/O) device is provided. However, instead of the user viewing multiple logical devices for the single I/O device, embodiments of the present invention represent the plurality of data paths to the computer I/O device as a single logical computer I/O device. Then, during operation, an I/O request to access the computer I/O device is intercepted. A data path from the plurality of data paths to the computer I/O device is selected, and the computer I/O device is accessed using the selected data path.
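The steps of this embodiment (present one logical device, intercept a request, select a path, access the device, and reroute on failure) can be sketched in Python as follows. This is an illustrative user-space model under stated assumptions, not the disclosed implementation: the `LogicalDevice` class, the `PathFailure` exception, the first-working-path policy, and the modeling of each path as a callable are all hypothetical.

```python
class PathFailure(Exception):
    """Raised by a path callable when the underlying data path fails."""


class LogicalDevice:
    """Presents several data paths to one I/O device as a single device."""

    def __init__(self, paths):
        # paths: mapping of path name -> callable(request) performing the I/O
        self.paths = dict(paths)
        self.failed = set()

    def select_path(self):
        # Assumed policy: the first path not yet marked as failed.
        for name in self.paths:
            if name not in self.failed:
                return name
        raise PathFailure("no working data path to the device")

    def submit(self, request):
        """Intercept a request, pick a path, and reroute on failure."""
        while True:
            name = self.select_path()
            try:
                return self.paths[name](request)
            except PathFailure:
                # Failure detected: mark the path and try an alternate one.
                self.failed.add(name)


def broken_path(request):
    raise PathFailure("link down")


def working_path(request):
    return f"done: {request}"


device = LogicalDevice({"primary": broken_path, "alternate": working_path})
result = device.submit("read block 7")
```

The caller only ever talks to `device`; the failed `primary` path is recorded and the request completes over `alternate` without the caller being involved. If every path has failed, `submit` raises `PathFailure` rather than retrying forever.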
In another embodiment, a system for intelligent failover in a multi-path computer system is disclosed. The system includes a processor and a computer I/O device placed in communication with the processor via a plurality of data paths. In addition, a user interface module is included that is in communication with the plurality of data paths. The user interface module is used to represent the plurality of data paths to a user as a single logical computer I/O device. In addition, the user interface can be used to configure the failover system to fit a particular use or hardware configuration. The system also includes a failover filter driver that is in communication with the plurality of data paths. In operation, the failover filter driver selects a particular data path from the plurality of data paths to access the computer I/O device for intercepted I/O requests.
A failover filter driver for providing intelligent failover in a multi-path computer system is disclosed in another embodiment of the present invention. Included in the failover filter driver is an intercept code module that intercepts I/O requests to a computer I/O device from an operating system. In addition, a manual-select code module is included that selects a data path from a plurality of data paths to the computer I/O device based on data path information provided from a requesting computer application. The failover filter driver further includes an auto-select code module that selects a data path based on characteristics of each data path in the plurality of data paths to the computer I/O device.
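The division of labor between the manual-select and auto-select modules might look like the following Python sketch. The text above does not say which path characteristics the auto-select module considers, so the `healthy` flag and `pending` request count used here are assumptions chosen for illustration, as are all function and class names.

```python
from dataclasses import dataclass


@dataclass
class PathInfo:
    name: str
    healthy: bool = True   # assumed characteristic: path is usable
    pending: int = 0       # assumed characteristic: queued requests


def manual_select(paths, requested_name):
    """Use the path named by the requesting application."""
    for path in paths:
        if path.name == requested_name:
            return path
    raise LookupError(f"unknown data path: {requested_name}")


def auto_select(paths):
    """Pick the healthy path with the fewest pending requests."""
    candidates = [p for p in paths if p.healthy]
    if not candidates:
        raise LookupError("no healthy data path")
    return min(candidates, key=lambda p: p.pending)


def select_path(paths, requested_name=None):
    """Dispatch to manual selection when the application names a path."""
    if requested_name is not None:
        return manual_select(paths, requested_name)
    return auto_select(paths)


paths = [
    PathInfo("path0", healthy=False),
    PathInfo("path1", pending=4),
    PathInfo("path2", pending=1),
]
chosen_auto = select_path(paths).name
chosen_manual = select_path(paths, "path1").name
```

With these sample values, automatic selection skips the unhealthy `path0` and prefers the less loaded `path2`, while an application that names `path1` gets exactly that path.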
Advantageously, the embodiments of the present invention provide intelligent failover in multi-path computer systems, which greatly increases system reliability. Since data paths can fail, either because of a failed connection, failed controller, or any other reason, the ability to automatically detect failures and reroute data to alternate paths greatly increases system reliability. Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
Cheng Eric
Ding Yafu
Wu Chang-Tying
Adaptec, Inc.
Beausoliel Robert
Duncan Marc
Martine & Penilla LLP