Storage subsystem and information processing system

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S005110, C714S006130, C714S042000, C714S043000

active

06795934

ABSTRACT:

CROSS-REFERENCES TO RELATED APPLICATIONS
This application claims priority from Japanese Patent Application Reference No. P00-032873, filed Feb. 10, 2000.
BACKGROUND OF THE INVENTION
The present invention relates to techniques for use in a storage subsystem and an information processing system, and in particular to techniques for detecting and recovering from errors occurring in storage subsystems having two or more components linked together by a communication link with a loop topology such as the fibre channel loop.
Conventional high-capacity storage subsystems can comprise two or more hard disk drives connected by a Fibre Channel (FC). In the FC Loop connecting topology (Fibre Channel Arbitrated Loop (FC-AL)), each drive and the controller that controls it in the storage subsystem are connected to one another in a loop. A port bypass circuit (PBC) is installed at the connection between each drive and the FC Loop so that the drive can be disconnected from the FC Loop when it fails or is to be replaced by another drive.
The Fibre Channel, one of the super-gigabit technologies, has been standardized under the name “ANSI NCITS T11” (formerly ANSI X3 T11).
While certain advantages are perceived, opportunities for further improvement exist. For example, according to conventional FC Loop technology, once the fibre channel loop is broken at any point, it becomes substantially impossible to communicate between a controller and each drive connected to the fibre channel loop.
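The loop-break problem described above can be illustrated with a minimal sketch. It is only a toy model under stated assumptions, not the patented implementation: the `Port` class, its `healthy` and `bypassed` fields, and the `loop_is_intact` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """One loop attachment point: a drive behind a port bypass circuit (PBC).

    Hypothetical model for illustration only.
    """
    name: str
    healthy: bool = True    # False models a drive that no longer forwards frames
    bypassed: bool = False  # True models the PBC routing the loop around the drive


def loop_is_intact(ports):
    """A frame can circulate only if every non-bypassed port forwards it."""
    return all(p.healthy for p in ports if not p.bypassed)


ports = [Port("drive0"), Port("drive1"), Port("drive2")]
intact_before = loop_is_intact(ports)        # all drives forward frames

ports[1].healthy = False                     # one failed drive breaks the whole loop
intact_after_failure = loop_is_intact(ports)

ports[1].bypassed = True                     # the PBC bridges around the failed drive
intact_after_bypass = loop_is_intact(ports)
```

In this model a single failed drive makes every drive unreachable until its PBC is switched to bypass, which is the weakness the passage attributes to conventional FC Loop technology.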
What is needed are improved techniques for detecting and recovering from errors occurring in disk drive subsystems having a controller and drive units connected by a fibre channel loop.
SUMMARY OF THE INVENTION
According to the invention, techniques for detecting and recovering from errors occurring in disk drive subsystems having a controller and drive units connected by a fibre channel loop are provided. Specific embodiments can provide storage subsystems, methods and apparatus for use in information processing environments, for example. Embodiments can determine when each drive is disconnected from the loop in the external storage subsystem structured by using the FC Loop, and thereupon, the FC Loop can be controlled by bridging the communication path using the PBC so that the loop is not broken.
An object of the present invention is to provide a storage subsystem equipped with communicating means of loop topology that minimizes any decrease in performance and/or reliability, even if a failure occurs in the storage subsystem.
Another object of the present invention is to provide the storage subsystem equipped with the communicating means of loop topology, for determining the failing part and for recovering from the failure quickly, simply and precisely.
Another object of the present invention is to provide a storage subsystem equipped with multiple communicating means of loop topology that recovers reliably from multiple failures affecting more than one loop of the communicating means.
An object of the present invention is to provide the information processing system equipped with the communicating means of loop topology, for minimizing the decrease in the performance and/or reliability, even if any failure occurs in the information processing system.
Another object of the present invention is to provide the information processing system equipped with the communicating means of loop topology, for determining the failing part and for recovering from the failure in the processing system quickly, simply and precisely.
Another object of the present invention is to provide an information processing system equipped with multiple communicating means of loop topology that recovers from multiple failures affecting more than one loop of the communicating means.
In a representative embodiment according to the present invention, a storage subsystem is provided. The storage subsystem can include a plurality of storage drives, a plurality of controllers to control said storage drives, a plurality of data communication loops to connect the storage drives and the controllers and to exchange information between the controllers and the storage drives, a first bypass mechanism that connects and disconnects at least one of each of the storage drives and each of the controllers individually to each of the communication loops, and a second bypass mechanism that bridges each of the communication loops at a specified location to selectively isolate a portion of the communication loop. Responsive to detecting a failure, at least one of the controllers commands at least one of the first and second bypass mechanisms to successively disconnect and re-connect each of the storage drives to each of the communication loops under control of the controller through the other of the communication loops, to locate a cause of the failure.
In another representative embodiment according to the present invention, an information processing system is provided. The information processing system can comprise a plurality of component units, each of which performs at least one of storing information and processing information, a data communication loop to connect the component units and to exchange information with each other within the component units, a first bypass mechanism to control the connection and disconnection of each of the component units individually to and from the communication loop, and a second bypass mechanism to bridge the communication loop at a specified location and to selectively isolate a part of the communication loop. Responsive to detecting a failure, at least one of the component units commands at least one of the first and second bypass mechanisms to successively disconnect and re-connect each of the component units to the data communication loop to locate a cause of the failure.
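The successive disconnect and re-connect procedure can be sketched as follows. This is a minimal illustration assuming a single faulty unit; the function `locate_faulty_unit` and the callbacks `loop_ok` and `set_bypass` are hypothetical names, and the toy simulation stands in for real bypass hardware.

```python
def locate_faulty_unit(units, loop_ok, set_bypass):
    """Bypass one unit at a time; the unit whose removal restores the loop is the suspect.

    units      -- iterable of unit identifiers (hypothetical)
    loop_ok    -- callable returning True when the loop carries traffic again
    set_bypass -- callable(unit, on) that drives the unit's bypass mechanism
    """
    for unit in units:
        set_bypass(unit, True)        # disconnect this unit from the loop
        if loop_ok():
            return unit               # loop recovered: leave the suspect bypassed
        set_bypass(unit, False)       # not the cause: re-connect and move on
    return None                       # no single unit explains the failure


# Toy simulation: "unit2" is assumed to be the only faulty unit.
faulty = {"unit2"}
bypassed = set()
units = ["unit0", "unit1", "unit2", "unit3"]


def set_bypass(unit, on):
    if on:
        bypassed.add(unit)
    else:
        bypassed.discard(unit)


def loop_ok():
    # The loop works only when every faulty unit has been bypassed.
    return not (faulty - bypassed)


suspect = locate_faulty_unit(units, loop_ok, set_bypass)
```

The sketch leaves the suspect unit bypassed so the remaining units stay reachable. Locating several simultaneous faults would need a different search order, which this example does not attempt.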
In a further representative embodiment according to the present invention, a storage subsystem is provided. The storage subsystem can comprise a plurality of storage devices, linked to a plurality of controllers to control the storage devices by a plurality of data communication loops. The communication loops connect the storage devices and the controllers to exchange information between the controllers and the storage devices. The storage subsystem can also comprise a first plurality of bypass switches. Each bypass switch is operable to connect an associated one of the storage devices, and each of the controllers, individually to each of the communication loops and to disconnect the associated one of the storage devices and each of the controllers individually from each of the communication loops. A second plurality of bypass switches can also be part of the subsystem. Each switch can be operable to connect, in a first operating state, to a group of the plurality of storage devices and their respective associated bypass switches, for electrical signal communications with one or more of the plurality of controllers. In a second operating state, the second plurality of bypass switches provides for electrically isolating the group of storage devices and their respective associated bypass switches from communicating with the at least one of the plurality of controllers, while maintaining other storage devices in the communication loop. Responsive to detecting a failure, at least one of the controllers commands at least one of the first and second plurality of bypass switches to disconnect and re-connect at least one of the storage devices to at least one of the communication loops under control of the controller through the other of the communication loops.
In a yet further representative embodiment according to the present invention, a method for detecting and recovering from errors occurring in a disk drive subsystem is provided. The disk subsystem can have a plurality of controllers that control a plurality of storage devices, the controllers and storage devices interconnected by a plurality of communication loops, including
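The two operating states of the second plurality of bypass switches can be pictured with a small model. It is a sketch under assumptions, not the claimed circuit: the `Device` and `Group` classes, the `isolated` flag, and `reachable_devices` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    healthy: bool = True
    bypassed: bool = False   # state of the device's own (first-level) bypass switch


@dataclass
class Group:
    devices: list
    isolated: bool = False   # second-level switch: False = first state, True = second state


def reachable_devices(groups):
    """Return the names of devices on the loop, or None if the loop is broken."""
    on_loop = []
    for group in groups:
        if group.isolated:
            continue                 # the whole group is bridged out of the loop
        for device in group.devices:
            if device.bypassed:
                continue             # this device alone is bridged out
            if not device.healthy:
                return None          # an unbypassed faulty device breaks the loop
            on_loop.append(device.name)
    return on_loop


groups = [
    Group([Device("d0"), Device("d1")]),
    Group([Device("d2"), Device("d3", healthy=False)]),
]
before = reachable_devices(groups)   # loop is broken by d3

groups[1].isolated = True            # second operating state for the faulty group
after = reachable_devices(groups)    # devices in the other group stay on the loop
```

Isolating a whole group keeps the remaining devices in the communication loop even before the individual faulty device has been identified.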
