Reparity bitmap RAID failure recovery

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

C711S114000

active

06799284

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to recovery of parity data on electronic data mass storage systems known as RAID (Redundant Array of Independent Disks).
2. Related Art
RAID is a popular and well-known method used for storage and retrieval of data. It offers a data source that can be made readily available to multiple users with a high degree of data security and reliability.
In general, RAID is available in several configurations known as levels. Each of these levels offers at least one performance enhancement over a single drive (e.g., data mirroring, faster reads, data recovery). A popular feature of RAID, and probably the justification for its use in so many systems, is the ability to reconstruct lost data from parity information that is recorded along with the other data. Committing large amounts of data to a RAID places considerable trust in the assumption that the data will be recoverable from the parity data if a failure occurs.
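To illustrate how lost data can be reconstructed from parity, the following minimal Python sketch assumes a single XOR parity block per stripe, as used in parity-based RAID levels such as 4 and 5. The function names `xor_blocks` and `reconstruct_missing` and the sample byte values are hypothetical and chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity block."""
    return xor_blocks(surviving_blocks + [parity])

# Three data blocks of one stripe (illustrative values) and their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Pretend the second block was lost and rebuild it from the other two plus parity.
rebuilt = reconstruct_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, combining the parity block with the surviving blocks yields the missing block; this only works while the parity block itself is consistent with the data.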
Problems can arise when a failure does occur and both the parity data and the other stored data are damaged. Without the parity information, it is impossible to recompute missing data.
A first known method used to combat this weakness is to log RAID stripes as they are written. In the event a crash occurs, the log can be used to determine which blocks should have their associated redundancy information recomputed. Variants of this technique include: logging the actual data, logging time-stamps and block numbers of blocks written, and logging stripe numbers and parity information to non-volatile memory.
Logs reduce the amount of parity information that has to be reconstructed on the RAID, which in turn reduces the amount of time that the array contains unprotected data. Although logs address some of the weaknesses of RAID implementations, maintaining them can require excessive overhead, which in turn reduces data transfer rates. Additionally, data can be lost when logs are compromised.
A second known method is to “stage” the data and parity information to a pre-write area. Following a crash, the system can copy the data/parity information from the pre-write area to the RAID array. Use of a pre-write area requires data to be written twice: once to the pre-write area and then again to the actual stripe(s) in the array. This provides a more secure write transaction at the cost of reducing data transfer speed.
Accordingly, it would be desirable to provide a technique for enabling RAID failure recovery without the severe drawbacks of the known art.
SUMMARY OF THE INVENTION
The invention provides a method and system for recovering a RAID after a system crash that can function independently or as a supplemental and redundant recovery method to other RAID recovery strategies. A reparity bitmap is created with each bit representing N stripes within the RAID. When a write occurs to a stripe, the associated reparity bit is set to 1; otherwise the bit remains at its default value of zero.
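The mapping from stripes to reparity bits can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the class name `ReparityBitmap`, its method names, and the example values (1000 stripes, N = 64) are assumptions made here for clarity.

```python
class ReparityBitmap:
    """One bit per range of N consecutive stripes (N = stripes_per_bit)."""

    def __init__(self, total_stripes, stripes_per_bit):
        self.stripes_per_bit = stripes_per_bit
        # Round up so a partial final range still gets its own bit.
        self.num_bits = (total_stripes + stripes_per_bit - 1) // stripes_per_bit
        self.bits = bytearray((self.num_bits + 7) // 8)  # all bits default to zero

    def bit_index(self, stripe):
        """Index of the reparity bit that covers the given stripe."""
        return stripe // self.stripes_per_bit

    def set(self, stripe):
        i = self.bit_index(stripe)
        self.bits[i // 8] |= 1 << (i % 8)

    def clear(self, stripe):
        i = self.bit_index(stripe)
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def is_set(self, stripe):
        i = self.bit_index(stripe)
        return bool(self.bits[i // 8] & (1 << (i % 8)))

# Hypothetical array of 1000 stripes with each bit covering 64 stripes.
bitmap = ReparityBitmap(total_stripes=1000, stripes_per_bit=64)
bitmap.set(130)  # marks the whole range 128..191 as possibly inconsistent
```

A larger N makes the bitmap smaller and cheaper to write, at the cost of more stripes to examine per set bit after a crash.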
Each bit in the reparity bitmap has an associated in-memory write counter. The write counter is used to track the number of writes in progress to a stripe range. Upon initiation of the first write to a stripe range, the reparity bit for the stripe range is set, and the write counter is incremented from its default value to indicate that one write is in progress. Subsequent concurrent writes cause the write counter to be incremented further.
Upon completion of a write to the stripe range, the write counter is decremented. When all writes to the stripe range have been completed, the write counter will have returned to its default value, the reparity bit is cleared, and the reparity bitmap is written to disk. Using the write counter allows multiple writes to a stripe range without incurring two extra write I/Os (for the bitmap) per stripe write, which greatly reduces overhead.
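The interaction between the reparity bit and its write counter can be sketched in Python as below. This is a single-threaded illustration under stated assumptions: the class `StripeRangeTracker`, its methods, and the `bitmap_flushes` counter (a stand-in for writing the bitmap to disk) are hypothetical names, and real concurrency control is omitted.

```python
class StripeRangeTracker:
    """One reparity bit and one in-memory write counter per stripe range."""

    def __init__(self, num_ranges):
        self.bits = [0] * num_ranges      # reparity bitmap (persisted in a real system)
        self.counters = [0] * num_ranges  # in-memory only; not persisted
        self.bitmap_flushes = 0           # counts simulated bitmap writes to disk

    def _flush_bitmap(self):
        # Stand-in for writing the reparity bitmap to disk.
        self.bitmap_flushes += 1

    def begin_write(self, rng):
        # First write in flight to this range: set the bit and persist the bitmap.
        if self.counters[rng] == 0:
            self.bits[rng] = 1
            self._flush_bitmap()
        self.counters[rng] += 1

    def end_write(self, rng):
        self.counters[rng] -= 1
        # Last write finished: clear the bit and persist the bitmap.
        if self.counters[rng] == 0:
            self.bits[rng] = 0
            self._flush_bitmap()

tracker = StripeRangeTracker(num_ranges=4)
for _ in range(3):            # three overlapping writes to stripe range 2
    tracker.begin_write(2)
bit_while_writing = tracker.bits[2]
for _ in range(3):
    tracker.end_write(2)
```

In this sketch the three overlapping writes cost two bitmap flushes in total, rather than two per stripe write.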
The writer first checks the reparity bitmap prior to executing a write. If the bit associated with that stripe is zero, the write counter is incremented for that reparity bitmap bit and the reparity bit is set to 1. The writer can proceed with the stripe write once the reparity bitmap is written to disk.
In the event the reparity bit is already set to 1, the writer increments the write counter and checks to see if the reparity bitmap is in the process of being written to disk. If the reparity bitmap is in the process of being written to disk, the writer waits for the reparity bitmap to be written and then writes the stripe; otherwise, the writer writes the stripe immediately.
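The writer's decision described in the two preceding paragraphs can be summarized in a small Python sketch. It models only the decision logic, not the actual I/O or thread synchronization; `RangeState`, `prepare_write`, and the returned action strings are hypothetical names introduced here.

```python
from dataclasses import dataclass

@dataclass
class RangeState:
    bit_set: bool = False  # reparity bit for this stripe range
    writers: int = 0       # in-memory write counter

def prepare_write(state, flush_in_progress):
    """Return the action a writer takes before writing a stripe in this range."""
    state.writers += 1
    if not state.bit_set:
        # Bit was zero: set it; the stripe write must follow the bitmap write.
        state.bit_set = True
        return "flush_bitmap_then_write"
    if flush_in_progress:
        # Bit already set but not yet confirmed on disk: wait for the flush.
        return "wait_for_flush_then_write"
    # Bit already set and already on disk: no extra bitmap I/O is needed.
    return "write_immediately"

state = RangeState()
first = prepare_write(state, flush_in_progress=False)
second = prepare_write(state, flush_in_progress=True)
third = prepare_write(state, flush_in_progress=False)
```

The ordering matters: a stripe write must not reach the disk before the set bit does, otherwise a crash could leave an inconsistent stripe that the bitmap does not flag.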
If a system crash occurs, the reparity bitmap identifies those stripes that were in the process of being written—all other stripes are assured to be consistent. On reboot, the reparity bitmap is read by the RAID system and, if needed, recomputation of the data using parity information occurs on only those stripes whose associated reparity bit is set.
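Selecting the stripes to examine after a reboot can be sketched as follows. This Python fragment assumes the bitmap has already been read from disk into a list of 0/1 values; the function name `stripes_to_recompute` and the example sizes are hypothetical.

```python
def stripes_to_recompute(reparity_bits, stripes_per_bit, total_stripes):
    """List the stripes covered by set reparity bits; all others are skipped."""
    stripes = []
    for bit_index, bit in enumerate(reparity_bits):
        if bit:
            first = bit_index * stripes_per_bit
            # The final range may be shorter than stripes_per_bit.
            last = min(first + stripes_per_bit, total_stripes)
            stripes.extend(range(first, last))
    return stripes

# Hypothetical bitmap read after a crash: bits 1 and 3 were set.
dirty = stripes_to_recompute([0, 1, 0, 1], stripes_per_bit=4, total_stripes=14)
# dirty == [4, 5, 6, 7, 12, 13]
```

Only the listed stripes need recomputation; the rest of the array is skipped, which is what shortens recovery time compared with scanning every stripe.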
This summary has been provided so that the nature of the invention may be understood quickly. A more complete understanding may be obtained by reference to the following description of the preferred embodiments in combination with the attached drawings.


