Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1997-12-31
2001-05-15
Wiley, David A. (Department: 2155)
C711S114000
active
06233696
ABSTRACT:
FIELD OF THE INVENTION
This invention relates to data storage for computers, and more particularly to method and apparatus for diagnosing and repairing data stored in a system including redundant information.
SUMMARY OF THE RELATED ART
Relatively early in the development of computer systems, disk drives became a fundamental device for storage. Accordingly, computer operating systems and application programs have been developed assuming that memory will rely on input/output (“I/O”) to a disk drive. The demand for storage has also skyrocketed. As a result, a number of separate physical devices may be required to accommodate the total amount of storage required for a system.
The result, described briefly below, is that a number of strategies have developed for placing data onto physical disk drives. Indeed, there are a variety of ways of mapping data onto physical disks, as is generally known in the art.
It would be highly inefficient, however, to have to change the operating system and/or application programs every time a change is made to the physical storage system. As a result, there has been a conceptual separation of the application's view of data storage and the actual physical storage strategy.
FIG. 1 illustrates this concept. The application/operating system's view of the storage system contemplates three separate storage devices: logical volume A 10, logical volume B 11, and logical volume C 12. Thus, as far as the operating system can discern, the system consists of three separate storage devices 10-12. Each separate storage device may be referred to as a “logical volume,” “logical disk,” or “virtual disk.” These names reflect the fact that the application's (or operating system's) logical view of the storage device structure may not correspond to the actual physical storage system implementing the structure.
In FIG. 1, the data is physically stored on the physical storage devices 14-16. In this particular example, although there are three physical devices 14-16 and three logical volumes 10-12, there is not a one-to-one mapping of the logical volumes to physical devices. In this particular example, the data in logical volume A 10 is actually stored on physical devices 14-16, as indicated at 10a, 10b and 10c. In this example, logical volume B is stored entirely on physical device 14, as indicated at 12a, 12b. Finally, logical volume C is stored on physical device 14 and physical device 16 as indicated at 11a, 11b.
In this particular example, the boxes 10a-10c, 11a-11b and 12a-12b represent contiguous segments of storage within the respective physical devices 14-16. These contiguous segments of storage may, but need not, be of the same size.
Array management software running on a general purpose processor (or some other mechanism such as a custom hardware circuit) 13 translates requests from a host computer (not shown) (made assuming the logical volume structure 10-12) into requests that correspond to the way in which the data is actually stored on the physical devices 14-16. In practice, the array management software 13 may be implemented as a part of a unitary storage system that includes the physical devices 14-16, may be implemented on a host computer, or may be done in some other manner.
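The translation performed by the array management software 13 can be pictured as a lookup over a table of contiguous segments. The following is a minimal sketch under stated assumptions, not the mechanism of any particular storage system: the `Extent` type, the `translate` function, the segment sizes, and the placement of one segment of logical volume A on each of physical devices 14, 15 and 16 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """One contiguous segment of a logical volume (hypothetical type)."""
    logical_start: int   # first logical block covered by this segment
    length: int          # number of blocks in the segment
    device: int          # identifier of the physical device
    physical_start: int  # first block of the segment on that device


# Illustrative table for logical volume A: three segments of unequal size.
# All sizes and offsets here are made up for the example.
VOLUME_A = [
    Extent(logical_start=0, length=100, device=14, physical_start=0),
    Extent(logical_start=100, length=50, device=15, physical_start=200),
    Extent(logical_start=150, length=75, device=16, physical_start=40),
]


def translate(extents, logical_block):
    """Map a logical block number to a (device, physical block) pair."""
    for ext in extents:
        offset = logical_block - ext.logical_start
        if 0 <= offset < ext.length:
            return ext.device, ext.physical_start + offset
    raise ValueError(f"logical block {logical_block} is outside the volume")
```

A host request for logical block 120 of volume A would, under this made-up table, be redirected to block 220 of physical device 15, while the host continues to see a single volume.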
The physical storage devices shown in FIG. 1 are disk drives. Disk drives include one or more disks of a recording medium (such as a magnetic recording medium or an optical recording medium). Information can be written to and read from the storage medium for storage purposes. The recording medium is typically in the form of a disk that rotates. The disk generally includes a number of tracks on which the information is recorded and from which the information is read. In a disk drive that includes multiple disks, the disks are conventionally stacked so that corresponding tracks of each disk overlie each other. In this case, specifying a single track on which information is stored within the disk drive requires specifying not only an individual track on a disk, but also which of the multiple disks holds the information.
Data on each physical device 14-16 may be stored according to one or more formats. Similarly, the request for data from the operating system or application program may correspond to one or more such formats. For example, large disk storage systems employed with many IBM mainframe computer systems implement a count, key, data (“CKD”) record format on the disk drives. Similarly, programs on such computers may request and expect to receive data according to the CKD record format. In the CKD format, the record includes at least three parts. The first part is a “count,” which serves to identify the record and indicates the lengths of the (optional) key field and the data portion of the record. The key field is an optional field that may include information about the record. The “data” portion of the record includes the actual user data stored by the record. The term “data” refers to any information, including formatting information of a record. “Actual user data” refers to the data actually desired for use by the host computer, such as the information in the data field of a CKD record.
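The three parts of a CKD record can be modeled in a few lines. This is only a sketch of the logical structure described above; the class name `CKDRecord`, its field names, and the tuple returned by `count()` are hypothetical, and the actual on-disk encoding of the count field is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CKDRecord:
    """Logical view of a count, key, data record (hypothetical model)."""
    record_id: int      # identifies the record
    data: bytes         # actual user data
    key: bytes = b""    # optional key; empty when the record has no key

    def count(self):
        """Return the count part: record identity, key length, data length."""
        return (self.record_id, len(self.key), len(self.data))
```

Because the count carries both lengths, a reader that has the count can tell where the key ends and the user data begins, even though records on the same track may differ in size.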
Disk drives that do not employ a CKD record format typically use a fixed block architecture (“FBA”) format. In an FBA storage system, each track of a disk is divided into a number of blocks, each having the same size.
Of course, it is possible to use an FBA disk drive system to store data formatted according to the CKD record format. In this case, the array management software 13 must perform the necessary translations between the CKD and FBA formats. One mechanism for performing this function is described in U.S. Pat. No. 5,664,144, entitled “System and method for FBA formatted disk mapping and variable length CKD formatted data record retrieval,” issued on Sep. 2, 1997.
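The need for translation follows from simple arithmetic: a variable-length record rarely fills a whole number of fixed-size blocks. The sketch below illustrates only that arithmetic and is not the mechanism of U.S. Pat. No. 5,664,144; the 512-byte block size and the function names `blocks_needed` and `locate` are assumptions made for the example.

```python
BLOCK_SIZE = 512  # bytes per fixed block; an assumed value for illustration


def blocks_needed(record_length, block_size=BLOCK_SIZE):
    """Number of fixed-size blocks required to hold record_length bytes."""
    return -(-record_length // block_size)  # ceiling division


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Split a byte offset into (block number, offset within that block)."""
    return divmod(byte_offset, block_size)
```

Under these assumptions a 513-byte record occupies two blocks, and the byte at offset 1030 of a track image falls in block 2 at offset 6.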
In a system including an array of physical disk devices, such as disk devices 14-16 of FIG. 1, each device typically performs error detection and/or correction for the data stored on the particular physical device. Accordingly, each individual physical disk device detects when it does not have valid data to provide and, where possible, corrects the errors. Even where error correction is permitted for data stored on the physical device, however, a catastrophic failure of the device would result in the irrecoverable loss of data.
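Per-device error detection can be illustrated with a checksum stored alongside the data. The sketch below uses CRC-32 from Python's standard `zlib` module as a stand-in; it is an assumption made for illustration, not the error detection or correction code used by any actual disk drive, and it detects errors without correcting them. The function names `store` and `load` are hypothetical.

```python
import zlib


def store(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload before it is written."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def load(stored: bytes) -> bytes:
    """Return the payload, or raise if the stored CRC no longer matches."""
    payload, crc = stored[:-4], int.from_bytes(stored[-4:], "big")
    if zlib.crc32(payload) != crc:
        raise IOError("checksum mismatch: stored data is not valid")
    return payload
```

A check of this kind lets a device report that it has no valid data to provide, but it offers no help if the whole device fails, which is the gap that redundant storage addresses.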
Accordingly, storage systems have been designed which include redundant storage capacity. A variety of ways of storing data onto the disks in a manner that would permit recovery have developed. A number of such methods are generally described in the RAIDbook, A Source Book For Disk Array Technology, published by the RAID Advisory Board, St. Peter, Minn. (5th Ed., February, 1996). These systems include “RAID” storage systems. RAID stands for Redundant Array of Independent Disks.
FIG. 2A illustrates one technique for storing redundant information in a RAID system. Under this technique, a plurality of physical devices 21-23 include identical copies of the data. Thus, the data M1 can be “mirrored” onto a portion 21a of physical device 21, a portion 22a of physical device 22 and a portion 23a of physical device 23. In this case, the aggregate portions of the physical disks that store the duplicated data 21a, 22a and 23a may be referred to as a “mirror group.” The number of places in which the data M1 is mirrored is generally selected depending on the desired level of security against irrecoverable loss of data.
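The behavior of a mirror group can be sketched as follows. This is a toy model under stated assumptions, not an implementation of any RAID product: the `MirrorGroup` class and its methods are hypothetical, each in-memory dictionary stands in for one physical device, and a device failure is simulated by discarding that dictionary's contents.

```python
class MirrorGroup:
    """Keeps an identical copy of each block on several devices (toy model)."""

    def __init__(self, copies=3):
        # Each dict stands in for one physical device: block number -> bytes.
        self.devices = [dict() for _ in range(copies)]

    def write(self, block, data):
        """Write the same data to every device in the group."""
        for dev in self.devices:
            dev[block] = data

    def read(self, block):
        """Return the first copy that is still available."""
        for dev in self.devices:
            if block in dev:
                return dev[block]
        raise IOError(f"block {block} is lost on every mirror")

    def fail(self, index):
        """Simulate catastrophic failure of one device."""
        self.devices[index].clear()
```

With three copies, the data remains readable after any two simulated failures and is lost only when the third device also fails, which is the sense in which the number of mirrors sets the level of protection.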
FIG. 2A shows three physical devices 21-23 which appear to be located in close proximity, for example within a single storage system unit. For very sensitive data, however, one or more of the physical devices that hold the mirrored data may be located at a remote facility. “RAID 1” is an example of data redundancy through mirroring of data. In a RAID 1 architecture, a number of different mechanisms may be used for determin
EMC Corporation
Wiley David A.
Wolf Greenfield & Sacks P.C.