Software recognition of drive removal or insertion in a...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C711S114000

Reexamination Certificate

active

06178520

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to data storage subsystems, and more particularly, to a software method operable within controllers of the data storage subsystem for “hot-swap” detection and processing of disk drive units without special purpose circuits to indicate a disk drive insertion or removal event.
2. Discussion of Related Art
Modern mass storage systems are growing to provide increasing storage capacities to fulfill increasing user demands from host computer system applications. Due to this critical reliance on large capacity mass storage, demands for enhanced reliability are also high. A popular solution to the need for increased reliability is redundancy of component level subsystems. Redundancy is typically applied at many or all levels of the components involved in the total subsystem operation. For example, in storage subsystems, redundant host systems may each be connected via redundant I/O paths to each of redundant storage controllers, which in turn may each be connected through redundant I/O paths to redundant storage devices (e.g., disk drives).
In managing redundant storage devices such as disk drives it is common to utilize Redundant Array of Independent Disks (commonly referred to as RAID) storage management techniques. RAID techniques generally distribute data over a plurality of smaller disk drives. RAID controllers within a RAID storage subsystem hide this data distribution from the attached host systems such that the collection of storage (often referred to as a logical unit or LUN) appears to the host as a single large storage device.
To enhance (restore) the reliability of the subsystem having data distributed over a plurality of disk drives, RAID techniques generate and store in the disk drives redundancy information (e.g., XOR parity corresponding to a portion of the stored data). A failure of a single disk drive in such a RAID array of disk drives will not halt operation of the RAID subsystem. The remaining available data and/or redundancy data is used to recreate the data missing due to the failure of a single disk drive.
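The recovery described above can be illustrated with a minimal sketch, assuming a single XOR parity block computed over equal-length data blocks. The function names and the sample block contents are hypothetical and are not taken from the patent; actual RAID levels differ in how they lay out data and parity across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recreate the one missing data block from the survivors and the parity.

    XOR is its own inverse, so XOR-ing the parity with every surviving
    block cancels them out and leaves the missing block.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical contents of three data drives (one block each).
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Simulate the failure of the second drive and rebuild its block.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because only one parity block is kept, this scheme tolerates the loss of exactly one drive, which matches the single-drive failure case discussed above.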
The 1987 publication by David A. Patterson, et al., from the University of California at Berkeley entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)" reviews the fundamental concepts of RAID technology.
RAID techniques are therefore useful to sustain storage subsystem operation despite loss of a disk drive. However, the failed disk drive must eventually be replaced to restore the highest level of reliability in the subsystem. When a disk drive in the disk array fails, replacing the failed disk drive can affect data availability. Such a physical replacement is often referred to as a swap. It is common to refer to a swap where power to the subsystem must be shut off to effect the swap as a cold swap. A warm swap is one which may not require disconnection of power, but nonetheless requires the subsystem to completely reinitialize to continue operation with the replacement drive. Data availability and user service within the storage subsystem are disrupted by cold or warm swaps of the failed disk drive, because the system is not operational. During cold swaps, the system electrical power is turned off before replacing the failed disk drive. During warm swaps, the electrical power is not turned off; however, data is unavailable because insertion of a disk drive requires the shutdown and re-initialization of the disk array data storage system.
To avoid such disruption of service, hot swaps of failed disk drives are preferred. During a hot swap of a failed disk drive the subsystem remains operational and does not require shutdown of the system. The RAID management software within the controller(s) of the subsystem compensates for the failed disk drive using the other data and/or redundancy information to provide continued data availability to the user. Some RAID management software may even substitute a pre-installed spare disk drive for the failed disk drive to restore normal operation of the LUN (e.g., a logical operation to achieve an automatic swap). The failed disk drive, however, must eventually be physically replaced, preferably via hot swap, to maintain the highest levels of security and performance within the storage subsystem.
A problem encountered by RAID systems having the hot swap feature is correct recognition of a disk drive insertion or disk drive removal event. As is common in many electronic subsystems, insertion or removal of disk drives while powered on can generate undesirable transient signals (e.g., noise or “glitches”). Such signal glitches may confuse the storage subsystem's controllers and the control software operable therein, thereby causing erroneous or unexpected conditions within the storage subsystem. Presently known RAID system designs utilize specialized hardware and electronic circuitry to detect the disk drive insertion or disk drive removal while simultaneously filtering or otherwise processing such transient signals.
For example, present RAID systems often utilize a SCSI bus internally to interconnect the plurality of disk drives (the disk arrays) to the RAID controller(s). Such systems typically utilize additional hardware and electronic circuitry to eliminate the erroneous transient bus signals. For example, in some commercially available storage subsystems, the disk drives are mounted in a “canister” which buffers the SCSI bus interface signals from the disk drive's SCSI bus interface connections. In other words, the canister attaches to the SCSI bus and the disk drive physically mounts within the canister and connects electronically through the buffers of the canister to the SCSI bus signals. The canister includes circuits (passive and/or active circuits) to buffer the signals between the disk drive's SCSI bus interface connections and the SCSI bus signal paths. These circuits help reduce or prevent occurrences of such transient (noise or glitch) signals. In addition, the canister includes circuits which automatically perform a reset of the SCSI bus in response to the detection of such transient (noise or glitch) signals generated by insertion of the canister into a hot (powered on) SCSI bus. By asynchronously applying a reset to the SCSI bus, the canister in effect notifies the higher level control functions of the RAID controller of the possibility of a drive insertion or removal. The higher level control functions of the RAID controller respond to the reset by polling the devices on the SCSI bus and determining which previously present devices have been removed (if any) and which new devices have been added (inserted) into the SCSI bus.
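The comparison performed after such a bus reset can be sketched as a set difference between the devices known before the reset and the devices that answer the subsequent poll. This is an illustrative sketch only: the function name is hypothetical, devices are represented here simply as integer bus IDs, and the polling itself (which would involve issuing commands to each ID on the bus) is assumed to have already produced the list of responding IDs.

```python
def diff_bus_devices(previously_present, currently_responding):
    """Compare device IDs seen before a bus reset with those responding after.

    Returns a pair (removed, inserted) of sorted ID lists.
    Both arguments are assumed to be iterables of integer bus IDs.
    """
    before = set(previously_present)
    after = set(currently_responding)
    removed = sorted(before - after)   # present earlier, silent now
    inserted = sorted(after - before)  # not seen earlier, answering now
    return removed, inserted


# Hypothetical example: the drive at ID 1 was pulled and a drive appeared at ID 3.
removed, inserted = diff_bus_devices([0, 1, 2, 5], [0, 2, 3, 5])
```

A drive replaced in the same slot between two polls would respond at the same ID and so would not show up in either list; distinguishing that case would need some additional identity check, which this sketch does not attempt.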
Canisters and other similar circuits for enabling hot swaps add complexity (and therefore cost) to the storage subsystem. It is therefore a problem to properly recognize hot swap disk drives without the need for complex buffer circuits between the disk drives and their interconnection communication medium (e.g., without canisters for attaching disk drives to a SCSI bus).
SUMMARY OF THE INVENTION
The present invention solves the above and other problems, thereby advancing the useful arts, by providing a software method for hot swap detection and processing without requiring special purpose circuits to aid in proper detection of disk drive insertion or removal events. In particular, the methods and structure of the present invention perform specific test sequences dependent upon the present state of activity of a disk drive to determine when and whether a faulty drive has been replaced. The test sequences are invoked at a low level of the RAID control module software (e.g., the driver level) in response to various errors possibly generated by transient signals on the bus or by removal of a disk drive. Similar test sequences are invoked at a higher level of the RAID control software to periodically poll the devices on the bus to detect recently added or removed devices.
Specifically, the methods of the present invention are based upon low level soft
