Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-06-19
2001-04-17
Beausoleil, Robert (Department: 2785)
C714S006130
active
06219800
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to fault-tolerant disk storage methods and systems.
BACKGROUND OF THE INVENTION
The RAID-5 standard describes a fault-tolerant architecture for storing data on disk storage devices. A plurality of disk drives are arranged into a storage array. Data is stored in the array in units termed stripes. Each stripe is partitioned into sub-units termed blocks, with one block of each stripe stored on one disk drive in the array. The storage array is protected against single-disk drive failures by assigning one block in each stripe to be the parity block for the stripe. RAID-5 provides excellent performance for large consecutive reads and batch loads, because each block in a stripe may be accessed in parallel with each other block. However, RAID-5 storage arrays have poor performance for the small updates typically found in transaction processing, because the parity block must be updated after even a small update.
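The parity protection and the small-update cost described above can be illustrated with a short sketch. It assumes that the parity block is the bitwise exclusive-OR of the data blocks in the stripe, which is the usual RAID-5 construction; the function names and the tiny 2-byte blocks are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bitwise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after one data block changes, without reading the other blocks."""
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical stripe: three data blocks plus one parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(*data)

# Single drive failure: the lost block is the XOR of the survivors and the parity.
recovered = xor_blocks(data[0], data[2], parity)

# A small update must read the old data and old parity, then write both blocks.
new_block = b"\xff\x00"
updated_parity = small_update_parity(parity, data[1], new_block)
```

Even for a one-block change, the update touches two drives (data and parity), which is the overhead that the schemes discussed next try to reduce.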
Several schemes have been proposed to overcome this performance problem. For example, the scheme proposed by Savage and Wilkes (“AFRAID—A Frequently Redundant Array of Independent Disks”, by Stefan Savage and John Wilkes, 1996 USENIX Technical Conference, Jan. 22-26, 1996) provides a greatly improved level of performance for RAID-5 arrays. This scheme defers the update to the parity block to periods in which the disk drive is idle, a situation which occurs frequently. However, this scheme also increases the vulnerability of the array to single disk drive failures, because of the likelihood that recently updated disk blocks will be lost when a disk drive fails.
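The trade-off of deferring parity can be sketched as follows. This is a toy in-memory model under stated assumptions, not the Savage and Wilkes implementation: the class name, the `on_idle` hook, and the XOR parity are all hypothetical choices made for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bitwise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class DeferredParityStripe:
    """Stripe whose parity is refreshed only during idle periods (hypothetical)."""

    def __init__(self, data):
        self.data = list(data)
        self.parity = xor_blocks(*self.data)
        self.dirty = False

    def write(self, index: int, new_block: bytes) -> None:
        # Only the data block is written; the parity is left stale for now.
        self.data[index] = new_block
        self.dirty = True

    def on_idle(self) -> None:
        # Deferred parity update, performed when the drives are idle.
        if self.dirty:
            self.parity = xor_blocks(*self.data)
            self.dirty = False

    def reconstruct(self, failed_index: int) -> bytes:
        # Rebuild the block of a failed drive from the survivors and the parity.
        survivors = [b for i, b in enumerate(self.data) if i != failed_index]
        return xor_blocks(*survivors, self.parity)


old_block, new_block = b"\x10\x20", b"\x55\x66"
stripe = DeferredParityStripe([b"\x01\x02", old_block, b"\xaa\xbb"])
stripe.write(1, new_block)

# Failure before the idle-time parity update: the recent update is lost.
rebuilt_while_dirty = stripe.reconstruct(1)

# Failure after the idle-time parity update: the update survives.
stripe.on_idle()
rebuilt_after_idle = stripe.reconstruct(1)
```

In this model, a drive failure while the stripe is dirty yields the pre-update contents of the block, which is the increased vulnerability noted above.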
The scheme proposed by Stodolsky et al. (“Parity Logging: Overcoming the Small Write Problem in Redundant Disk Arrays”, by Daniel Stodolsky, Garth Gibson and Mark Holland, IEEE 1993, pp. 64-75) generates parity updates and logs them, rather than updating the parity immediately. When the log buffer is full, the parity updates are all written in one large update. This scheme preserves the reliability of the storage array, but only increases performance to the extent that the logging overhead plus the batched update overhead is less than the overhead of updating the parity immediately on every write.
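The idea of logging parity updates and applying them in one batch can be sketched as follows. This is a toy in-memory model, not the implementation of Stodolsky et al.: the class name, the `capacity` parameter, and the dictionary standing in for on-disk parity are hypothetical.

```python
class ParityLog:
    """Toy model of a parity update log (all names are hypothetical)."""

    def __init__(self, parity: dict, capacity: int) -> None:
        self.parity = parity      # stripe id -> parity block "on disk"
        self.capacity = capacity  # number of logged updates that fills the buffer
        self.entries = []         # pending (stripe id, parity update) pairs
        self.flushes = 0

    def record(self, stripe: int, old_data: bytes, new_data: bytes) -> None:
        """Log the parity update (old XOR new) instead of applying it now."""
        delta = bytes(a ^ b for a, b in zip(old_data, new_data))
        self.entries.append((stripe, delta))
        if len(self.entries) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        """Apply every logged parity update in one batch."""
        for stripe, delta in self.entries:
            current = self.parity[stripe]
            self.parity[stripe] = bytes(a ^ b for a, b in zip(current, delta))
        self.entries.clear()
        self.flushes += 1


parity_store = {0: b"\x0f\x0f", 1: b"\xf0\xf0"}
log = ParityLog(parity_store, capacity=2)

log.record(0, b"\x01\x01", b"\x03\x03")  # logged only, parity untouched
pending_after_first = len(log.entries)

log.record(1, b"\x00\xff", b"\xff\xff")  # buffer full, both updates applied
```

Because each logged entry already encodes the change to the parity, no update is lost if a drive fails before the batch is applied, provided the log itself survives.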
While the increased vulnerability of the Savage-Wilkes scheme may be tolerated in some applications, it is not acceptable in other applications, such as databases. A need arises for a technique which provides improved performance over standard RAID-5 without increasing vulnerability to single-disk drive failures.
SUMMARY OF THE INVENTION
The present invention is a storage system, and method of operation thereof, which provides improved performance over standard RAID-5 without increasing vulnerability to single-disk drive failures. The storage system comprises a processor and a plurality of data storage devices, coupled to the processor, operable to store a plurality of data stripes, each data stripe comprising a plurality of data blocks and a parity block, each data storage device operable to store one data block or the parity block of each data stripe. The storage system ensures that a parity-consistent image of a data stripe can be constructed in spite of single disk failures.
When an update to a data block in the data stripe is received, an image is stored of the data block as it was when the current parity-consistent image of the stripe was generated. The data block is updated and an image of the updated data block is stored. When a failure of one of the plurality of data storage devices is detected, the contents of the block on the failed device are generated. The parity block of a non-parity-consistent or dirty stripe is generated by computing a bitwise exclusive-OR of the image of each updated data block as it was when the parity-consistent image was generated, and the current image of each updated data block, to form an intermediate result. The parity block of the data stripe is read and a bitwise exclusive-OR of the intermediate result and the parity block is generated.
The generated parity block is written and a parity rebuild is performed on the data stripe using the new parity block.
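The parity regeneration described above can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class and method names are hypothetical, and the sketch assumes that the saved block images are kept somewhere that survives the drive failure, which this summary does not specify.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bitwise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Stripe:
    """Stripe that saves block images instead of updating parity on each write."""

    def __init__(self, data):
        self.data = list(data)
        self.parity = xor_blocks(*self.data)  # parity-consistent at creation
        self.consistent_images = {}  # index -> block when parity was last consistent
        self.current_images = {}     # index -> block after its latest update

    def update(self, index: int, new_block: bytes) -> None:
        # Keep only the first pre-update image since the last consistent parity.
        self.consistent_images.setdefault(index, self.data[index])
        self.data[index] = new_block
        self.current_images[index] = new_block  # stripe is now dirty

    def regenerate_parity(self) -> bytes:
        if self.consistent_images:
            # Intermediate result: XOR of the old and current images of each
            # updated block, then XOR with the stale parity block.
            intermediate = xor_blocks(*self.consistent_images.values(),
                                      *self.current_images.values())
            self.parity = xor_blocks(intermediate, self.parity)
            self.consistent_images.clear()
            self.current_images.clear()
        return self.parity

    def recover(self, failed_index: int) -> bytes:
        # Bring the parity up to date, then rebuild the block of the failed device.
        self.regenerate_parity()
        survivors = [b for i, b in enumerate(self.data) if i != failed_index]
        return xor_blocks(*survivors, self.parity)


stripe = Stripe([b"\x01\x02", b"\x10\x20", b"\xaa\xbb"])
stripe.update(1, b"\x11\x22")
stripe.update(1, b"\x55\x66")  # second update to the same block
stripe.update(2, b"\x00\x0f")
stale_parity = stripe.parity   # still the parity of the original contents

recovered = stripe.recover(0)  # simulate the failure of the device holding block 0
```

Only the first pre-update image of each block is retained, because the stale parity corresponds to the stripe as it was when it was last parity-consistent, not to any intermediate update.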
REFERENCES:
patent: 5634109 (1997-05-01), Chen
patent: 5864655 (1999-01-01), Dewey
patent: 6148368 (2000-11-01), DeKoning
Chen et al., “RAID: High-Performance, Reliable Secondary Storage,” ACM Computing Surveys, vol. 26, No. 2, Jun. 1994.
E. Gabber et al., “Data Logging: A Method for Efficient Data Updates in Constantly Active RAIDS,” Fourteenth International Conference on Data Engineering, IEEE Computer Society, 1998, pp. 144-153.
S. Savage et al., “AFRAID—A Frequently Redundant Array of Independent Disks,” 1996 USENIX Technical Conference, Jan. 22-26, 1996, San Diego, CA., pp. 27-39.
D. Stodolsky et al., “Parity Logging: Overcoming the Small Write Problem in Redundant Disk Arrays,” Proceedings of the Twentieth International Symposium on Computer Architecture, May 1993, pp. 64-75.
Johnson Theodore
Shasha Dennis
AT&T Corp.
Beausoleil Robert
Bonzo Bryce