Reexamination Certificate
1998-11-13
2001-12-11
Le, Dieu-Minh (Department: 2181)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S002000, C714S004110, C714S005110, C714S006130, C714S011000, C714S012000
active
06330687
ABSTRACT:
CROSS-REFERENCE TO RELATED APPLICATIONS
Not Applicable.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not Applicable.
MICROFICHE APPENDIX
Not Applicable.
BACKGROUND OF THE INVENTION
(1) Field of the Invention
This invention relates to systems in which multiple controllers are used to control an array of storage devices.
(2) Description of Related Art Including Information Disclosed Under 37 CFR 1.97 and 37 CFR 1.98.
The acronym RAID refers to systems which combine disk drives for the storage of large amounts of data. In RAID systems each disk is divided into stripes and the data are interleaved, so that the combined storage space consists of stripes from each disk. RAID systems fall under 5 different architectures, plus one additional type, RAID-0, which is simply an array of disks and does not offer any fault tolerance. RAID 1-5 systems use various combinations of redundancy, spare disks, and parity analysis to preserve the reading and writing of data in the face of one and, in some cases, multiple intermittent or permanent disk failures. Ridge, P. M. The Book Of SCSI: A Guide For Adventurers. Daly City, Calif.: No Starch Press, 1995, p. 323-329. In this application, a RAID system consisting of one host computer, one controller, and an array of multiple channels, each channel consisting of several direct access storage devices in serial electrical connection, will be termed a “single RAID subsystem”.
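The interleaving of stripes across disks described above can be illustrated with a minimal sketch. It assumes a simple round-robin layout with no parity or redundancy; the function name locate_stripe and the three-disk example are hypothetical and are not taken from this application.

```python
def locate_stripe(logical_stripe: int, num_disks: int) -> tuple:
    """Map a logical stripe number to (disk index, stripe index on that disk).

    Assumption: stripes are assigned to the disks in round-robin order,
    with no parity or redundancy.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    if logical_stripe < 0:
        raise ValueError("logical_stripe must be non-negative")
    return logical_stripe % num_disks, logical_stripe // num_disks


if __name__ == "__main__":
    # Consecutive logical stripes land on successive disks of a 3-disk array.
    for stripe in range(6):
        print(stripe, locate_stripe(stripe, num_disks=3))
```

With three disks, logical stripes 0, 1 and 2 fall on disks 0, 1 and 2, and stripe 3 wraps around to the second stripe of disk 0.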
Conventional RAID systems guard against failure of a controller by the active-active system. This system consists of two single RAID subsystems, each with a host computer, a controller, and an array of direct access storage units. The direct access storage units, in the most common case disks, are arranged in channels in which the disks are connected in series. A common arrangement is for one controller to control six channels with five disks in each channel. In the active-active system, each channel of one system is connected electrically to a channel in the other system. This means that, in the event of the failure of one controller, the other controller can serve all 10 disks in each “double” channel. Unfortunately, during normal operation, when both controllers are operating, there is interference because two controllers are simultaneously accessing a double channel of ten disks. This interference reduces the speed of a normally operating active-active system to about 130% of the speed of a single RAID subsystem, rather than the 200% expected from the operation of two single RAID subsystems.
U.S. Pat. No. 5,768,623 discloses a system for storing data for several host computers and several storage arrays which are linked so that each storage array can be accessed by any host computer. The system uses dual ported disks and involves serial communication channels. No switches or repeaters are used to isolate the disk arrays during normal functioning of host computer and storage array controllers.
U.S. Pat. No. 5,729,763 discloses a system for storing data in which each of a number of disk interfaces is coupled to a corresponding disk drive by unidirectional channels. Each disk interface includes a unidirectional switch. Use of the switches allows a defective disk drive or switch to be removed without requiring shut-down of the entire system.
The RAID systems of the prior art do not provide the advantages of the present invention, that of increasing the overall speed of N same-speed single RAID subsystems to N times the speed of a single RAID system under normal conditions while providing for the sharing of multiple storage devices during conditions in which a host computer or storage array controller fails.
The system of the present invention is like the conventional active-active system except that it incorporates a switch or repeater which isolates the channels of the two or more single RAID subsystems when all the host computers and controllers are functioning properly. If three same-speed single RAID subsystems are included, for example, the system functions at 300% of the speed of a single RAID subsystem during the vast preponderance of the time, when all of the host computers and storage array controllers are functioning properly. In the case of a host computer or storage array controller failure, however, the bidirectional switch or bidirectional repeater closes and establishes electrical connection between the single RAID subsystem with the failure and the single RAID subsystem adjacent to it in the system. In this configuration the two affected single RAID subsystems have the speed expected of a conventional active-active system after a host computer or storage array controller failure, about 100% of the speed of an individual RAID subsystem. The remaining unaffected single RAID subsystems continue to operate at the unhindered maximum speed.
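The speed figures quoted above can be worked through in a short sketch. It is an illustration only: it assumes N same-speed subsystems, reads the text as saying that a failed subsystem and its adjacent partner together deliver about 100% of one subsystem, and uses hypothetical names (redundant_speed and the two constants) that do not appear in this application.

```python
SINGLE_SUBSYSTEM = 100      # percent; the baseline used in the text
ACTIVE_ACTIVE_NORMAL = 130  # percent; figure quoted for the conventional system


def redundant_speed(n: int, failures: int = 0) -> int:
    """Aggregate speed in percent of one single RAID subsystem.

    Assumption: each failure pairs the failed subsystem with one healthy
    adjacent subsystem, and each such pair together delivers about 100%.
    """
    if n < 2:
        raise ValueError("the system needs at least two subsystems")
    if failures < 0 or 2 * failures > n:
        raise ValueError("each failure needs its own healthy partner")
    unaffected = n - 2 * failures
    return unaffected * SINGLE_SUBSYSTEM + failures * SINGLE_SUBSYSTEM


if __name__ == "__main__":
    for n in (2, 3):
        print(n, redundant_speed(n), redundant_speed(n, failures=1))
```

Under these assumptions, three subsystems give 300% in normal operation and 200% after one failure, while two subsystems give 200% in normal operation against the 130% quoted for the conventional active-active system.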
BRIEF SUMMARY OF THE INVENTION
The redundant RAID system of this invention extends the protection of the operation of a RAID system from providing for disk failure to providing for host computer or storage array controller failure. This invention consists of two or more (N) single RAID subsystems which are linked through the disk channels by a bidirectional switch or bidirectional repeater which is normally in the open position. Thus the system normally functions as N independent single RAID subsystems and, if the single RAID subsystems all have the same speed, functions at the speed of one single RAID subsystem multiplied by N. If the speeds of the single RAID subsystems vary, the system normally functions at a speed which is the sum of the speeds of the single RAID subsystems. In the event of a host computer or storage array controller failure, the bidirectional switch or repeater between two adjacent single RAID subsystems is changed to the closed position and the channels of disks of the functioning controller are electrically linked to the channels of disks of the disabled system. The functioning controller thus takes over the function of the disabled controller and provides continuing service, albeit at a reduced speed. The unaffected single RAID subsystems of the redundant RAID system of this invention continue to function unhindered.
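The switching behavior summarized above can be sketched as a small state model. This is a simplified illustration and not the disclosed hardware: it assumes the subsystems are arranged in a row with one switch between each adjacent pair, and the class name RedundantRaid, its methods, and the rule for choosing which adjacent subsystem takes over are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RedundantRaid:
    """N single RAID subsystems in a row, with a switch between neighbours."""

    num_subsystems: int
    failed: set = field(default_factory=set)
    closed_switches: set = field(default_factory=set)  # pairs (low, high)

    def fail_controller(self, index: int) -> int:
        """Record a failure, close one adjacent switch, return the rescuer."""
        if not 0 <= index < self.num_subsystems:
            raise IndexError("no such subsystem")
        if index in self.failed:
            raise ValueError("subsystem already marked as failed")
        # Assumption: prefer the next subsystem, else the previous one.
        for neighbour in (index + 1, index - 1):
            in_range = 0 <= neighbour < self.num_subsystems
            if in_range and neighbour not in self.failed:
                self.failed.add(index)
                pair = (min(index, neighbour), max(index, neighbour))
                self.closed_switches.add(pair)
                return neighbour
        raise RuntimeError("no functioning adjacent subsystem to take over")

    def isolated(self) -> list:
        """Subsystems whose switches are all open (running unhindered)."""
        linked = {i for pair in self.closed_switches for i in pair}
        return [i for i in range(self.num_subsystems) if i not in linked]


if __name__ == "__main__":
    system = RedundantRaid(num_subsystems=3)
    rescuer = system.fail_controller(1)
    print(rescuer, sorted(system.closed_switches), system.isolated())
```

In this model all switches start open, so every subsystem is isolated; a failure closes exactly one switch and leaves the remaining subsystems isolated.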
In the normal operating mode the present invention enables each storage array controller to communicate with a set of disks independently of any other controller, thus operating the redundant RAID system at the speed of N single RAID subsystems. In the event of failure of one of the host computers or storage array controllers of a component single RAID subsystem, the system automatically assumes the configuration of a conventional active-active system with respect to the affected single RAID subsystem and the adjacent unaffected single RAID subsystem. The redundant RAID system continues to operate with access by the functioning adjacent RAID subsystem host computer and storage array controller to all of the disks of the failed and the functioning single RAID subsystems, although at a reduced speed.
Two advantages are associated with the present invention.
Firstly, a host computer and storage array controller redundant RAID system is provided with a normal speed much higher than that of the conventional active-active host computer and storage array controller redundant systems. In the event of failure of a host computer or storage array controller, the speed of the system is no lower than that of a conventional host computer and storage array controller redundant system. If more than two single RAID subsystems are included in the redundant RAID system, the speed of the system under nearly all conditions is greater than that of the conventional redundant system.
Secondly, the use of bidirectional repeater switching means allows the use of relatively long cables linking the disk channels, and provides additional flexibility in the physical location of the single RAID subsystem components of the invention.
The objective of this invention is to provide a host computer and storage array controller redundant RAID system which conti
Digi-Data Corporation
Le Dieu-Minh
Ramsey William S.
Vo Tim
Profile ID: LFUS-PAI-O-2584295