Fibre channel data storage system fail-over mechanism

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S004110, C714S005110, C714S006130

active

06571355

ABSTRACT:

BACKGROUND OF THE INVENTION
This invention relates generally to data storage systems and more particularly to data storage systems having a plurality of magnetic storage disk drives in a redundancy arrangement whereby the disk drives are controllable by first disk controllers and second disk controllers. Still more particularly, the invention also relates to systems of such type wherein the disk drives are coupled to the disk controllers through a series, unidirectional, “ring”, or fibre channel protocol, communication system.
As is known in the art, in one type of data storage system, data is stored in a bank of magnetic storage disk drives. The disk drives, and their coupled interfaces, are arranged in sets, each set being controlled by a first disk controller and a second disk controller. More particularly, in order to enable the set of disk drives to operate in the event that there is a failure of the first disk controller, each set is also coupled to a second, or redundant disk controller. Therefore, if either the first or second disk controller fails, the set of disk drives is accessible by the other one of the disk controllers.
While today most disk storage systems of this type use a Small Computer System Interface (SCSI) protocol, in order to operate with higher data rates, other protocols are being introduced. One higher data rate protocol is sometimes referred to as a fibre channel (FC) protocol. Such FC protocol uses a series, unidirectional, “ring” communication system. In order to provide for redundancy, that is, to enable use of the set of disk drives in the event that the first disk controller fails, as discussed above, the set is coupled to the second, or redundant disk controller, using a separate, independent, “ring”, or fibre channel communication protocol. Thus, two fibre channels are provided for each set of disk drives and their disk interfaces: a first fibre channel and a second fibre channel.
As is also known, when using the fibre channel communication protocol, if any element in the channel becomes inoperative, the entire channel becomes inoperative. That is, if the first disk controller becomes inoperative, or if any one of the disk drives in the set coupled to the first channel becomes inoperative (e.g., where the disk interface fails, is inoperative, or is removed with its coupled disk drive, or where the disk drive coupled thereto fails or is removed), the first fibre channel is “broken”, or open, and becomes inoperative. The data stored in the entire portion of the set of disk drives coupled to the first fibre channel is therefore unavailable until the inoperative first disk controller or inoperative disk drive is replaced. This is true with either the first channel or the second channel. One technique suggested to solve this problem is through the use of a switch, sometimes referred to as an LRC (i.e., a loop resiliency circuit) switch. Such LRC switch is used to remove an inoperative disk drive from its channel.
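The effect of an LRC on a serial loop can be illustrated with a minimal sketch. It is not taken from the patent: the names `LoopNode`, `loop_is_closed`, and `reachable_drives` are hypothetical, and the model assumes only what the paragraph above states, namely that one inoperative element opens the whole loop and that by-passing that element closes it again.

```python
from dataclasses import dataclass


@dataclass
class LoopNode:
    """One disk drive position on a serial, unidirectional loop (hypothetical model)."""
    name: str
    operative: bool = True
    bypassed: bool = False  # True when the LRC for this position is in by-pass


def loop_is_closed(nodes):
    # The loop carries traffic only if every node still in the path is operative.
    return all(n.operative for n in nodes if not n.bypassed)


def reachable_drives(nodes):
    # With an open loop, no drive on that channel can be reached.
    if not loop_is_closed(nodes):
        return []
    return [n.name for n in nodes if not n.bypassed]


loop = [LoopNode(f"disk{i}") for i in range(4)]

loop[2].operative = False          # one drive fails or is removed
broken = reachable_drives(loop)    # the whole channel is now unavailable

loop[2].bypassed = True            # place that drive's LRC in by-pass
restored = reachable_drives(loop)  # the remaining drives are reachable again
```

In this model the failed drive itself stays unreachable after the by-pass; only the rest of the channel is recovered.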
In one suggested arrangement, a printed circuit board is provided for each disk drive. The printed circuit board has a pair of LRCs, one for the first channel and one for the second channel. Thus, the open channel may be “closed” in the event of an inoperative disk drive by placing the LRC thereof in a by-pass condition. While such suggested technique solves the inoperative disk drive, or open channel problem, if one of the pair of LRCs fails, the entire printed circuit board having the pair of LRCs must be replaced thereby disrupting both the first and second channels; and, hence, disrupting the operation of the entire data storage system.
One technique suggested to solve this disruption problem requires n LRC switches (where n is the number of disk drives in the set) in the first channel, i.e., one LRC for each one of the n disk drives in the set, and another n LRC switches in the second channel, one for each of the n disk drives in the second channel. The first channel set of n LRCs is mounted on one printed circuit board and the second channel set of n LRCs is mounted on a different printed circuit board. A backplane is used to interconnect the two LRC printed circuit boards, the associated selectors, or multiplexers, and the disk drives. In order to provide the requisite serial, or sequential, fibre channel connections, an elaborate, complex, fan-out wiring arrangement has been suggested for the backplane. Further, the slots provided for the two LRC boards eliminate two disk drives, and the disk interfaces which would otherwise be plugged into these two slots of the backplane.
Another fibre channel arrangement is described in U.S. Pat. No. 5,729,763 entitled “Data Storage System”, inventor Eli Leshem, issued Mar. 17, 1998, assigned to the same assignee as the present invention.
SUMMARY OF THE INVENTION
In accordance with the invention, a fibre channel system is provided having a plurality of disk drives. Each one of the disk drives has a pair of redundant ports. A pair of sources of data is provided. The system includes a pair of fibre channel port by-pass cards. Each one of the cards has an input/output port connected to a corresponding one of the sources of data. Each one of the port by-pass cards provides a fibre channel loop between the input/output port thereof and a corresponding one of the pair of ports of a one, or ones, of the disk drives selectively in accordance with a control signal fed to such port by-pass card by the one of the pair of sources coupled to the input/output port thereof. Each one of the port by-pass cards has a fail-over controller and a switch, such switch being coupled to the input/output port of such one of the port by-pass cards. Each one of the fail-over controllers produces a control signal from the source coupled thereto indicating a fault in the other one of the sources. The control signal activates the switch in the port by-pass card coupled to said other one of the sources to de-couple such other one of the sources from the disk drives.
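The cross-coupled fail-over described above might be sketched as follows. This is an illustrative model under stated assumptions, not the patented circuit: the class `PortBypassCard`, its `peer` link, and the method names are hypothetical, and the switch is reduced to a single boolean.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PortBypassCard:
    """Hypothetical model of one port by-pass card and its fail-over controller."""
    source: str                # the source of data coupled to this card
    switch_closed: bool = True # True: the source is coupled to the disk drives
    peer: Optional["PortBypassCard"] = field(default=None, repr=False, compare=False)

    def report_peer_fault(self):
        # The fail-over controller on this card, driven by its own source,
        # signals a fault in the other source; the signal opens the switch
        # in the other card, de-coupling that source from the drives.
        if self.peer is None:
            raise ValueError("no peer card connected")
        self.peer.switch_closed = False

    def can_reach_drives(self):
        return self.switch_closed


def make_pair(source_a, source_b):
    # Build two cards that reference each other, one per source.
    a, b = PortBypassCard(source_a), PortBypassCard(source_b)
    a.peer, b.peer = b, a
    return a, b


card_a, card_b = make_pair("source A", "source B")
card_a.report_peer_fault()  # source A indicates that source B is faulty
```

After the call, `card_b.can_reach_drives()` is false while `card_a` remains coupled, so the surviving source keeps access to the drives through its own loop.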
In accordance with another feature of the invention, the fail-over controller is configured to receive a control sequence from the source coupled thereto to detect hardware failures in a control bus between the source coupled thereto and the fail-over controller by forcing the bus state from an idle state to a Command Verify state to start the control sequence.
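One way to picture this check is a small sketch that inspects the first bus transition. Only the two state names, idle and Command Verify, come from the text above; the enum `BusState`, the function `sequence_starts_correctly`, and the representation of the bus as a list of sampled states are assumptions made for illustration, and the remainder of the control sequence is not modeled because it is not described here.

```python
from enum import Enum, auto


class BusState(Enum):
    IDLE = auto()
    COMMAND_VERIFY = auto()


def sequence_starts_correctly(observed):
    """Return True if the sampled bus states begin with idle -> Command Verify.

    `observed` is a hypothetical list of BusState samples taken by the
    fail-over controller. A bus that cannot be forced out of idle never
    shows the transition, which this sketch treats as a hardware failure.
    """
    return (
        len(observed) >= 2
        and observed[0] is BusState.IDLE
        and observed[1] is BusState.COMMAND_VERIFY
    )


healthy_bus = [BusState.IDLE, BusState.COMMAND_VERIFY]
stuck_bus = [BusState.IDLE, BusState.IDLE]  # e.g. a line stuck in the idle state

healthy_ok = sequence_starts_correctly(healthy_bus)
stuck_ok = sequence_starts_correctly(stuck_bus)
```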


REFERENCES:
patent: 5206939 (1993-04-01), Yanai et al.
patent: 5212785 (1993-05-01), Powers et al.
patent: 5729763 (1998-03-01), Leshem
patent: 5890214 (1999-03-01), Espy et al.
patent: 5898828 (1999-04-01), Pignolet et al.
patent: 5922077 (1999-07-01), Espy et al.
patent: 5991891 (1999-11-01), Hahn et al.
patent: 6038618 (2000-03-01), Beer et al.
patent: 6061753 (2000-05-01), Ericson
patent: 6118776 (2000-09-01), Berman
patent: 6138199 (2000-10-01), Fleischer
patent: 6154791 (2000-11-01), Kimble et al.
patent: 6185203 (2001-02-01), Berman
patent: 6192027 (2001-02-01), El-Batal
patent: 6195703 (2001-02-01), Blumenau et al.
patent: 6219753 (2001-04-01), Richardson
patent: 6260079 (2001-07-01), White
patent: 6282169 (2001-08-01), Kiremidjian
patent: 6338110 (2002-01-01), van Cruyningen
patent: 6389494 (2002-05-01), Walton et al.
patent: 2002/0012342 (2002-01-01), Oldfield et al.
patent: 0 550 853 (1993-07-01), None
patent: 0 751 464 (1997-01-01), None
patent: 0 889 410 (1999-01-01), None
patent: WO 97/07458 (1997-02-01), None
patent: WO 98/28882 (1998-07-01), None
patent: WO 99/26146 (1999-05-01), None
“Bypass Bus Mechanism for Direct Memory Access Controllers”, IBM Technical Disclosure Bulletin, vol. 33, No. 11, Apr. 1991.
Kumar Malavalli, “High Speed Fibre Channel Switching Fabric Services”; Proceedings of the SPIE, vol. 1577, pp. 216-225, Sep. 4, 1991.
