Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1998-10-06
2002-02-26
Beausoleil, Robert (Department: 2184)
C360S031000
active
06351825
BACKGROUND OF THE INVENTION
The present invention relates to a method for handling failed media in devices for managing storage media (generally known as changer devices). More specifically, the present invention relates to a system for managing failed media where a plurality of changer devices is linked to form a disk array system.
As a substitute for hard disk devices (HDD: Hard Disk Drives) used in a standalone manner, the use of disk array systems (RAID: Redundant Arrays of Inexpensive Disks) is being studied. Such a system is an external storage device that speeds up reads and writes by operating multiple hard disk devices concurrently while improving reliability through redundancy, as described, for example, in "A Case for Redundant Arrays of Inexpensive Disks (RAID)" (David A. Patterson, Garth Gibson, and Randy H. Katz, Computer Science Division, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley). In this paper, different disk array structures are assigned level numbers from 1 to 5. RAID Level 3 can provide improved performance for sequential accesses involving transfer of large blocks of data. RAID Level 5 can provide improved performance for random accesses involving large numbers of small reads and writes.
In the description below, magnetic disk devices that use fixed media are referred to as “hard disk drives” and are distinguished from storage devices that use removable media such as magneto-optical disk drives, optical disk drives, and magnetic disk drives.
The following is a description, referencing FIG. 13 and FIG. 14, of RAID control operations. As shown in FIG. 13, a disk array device 101 includes a plurality of hard disk drives 111-115. A RAID controller 105 controls the manner in which data is striped to the plurality of hard disk drives. RAID controller 105 also determines how redundant data (parity) is generated and where this redundant data is stored.
The disk array device shown in FIG. 13 includes five hard disk drives and contains data storage areas D0 through D15, as shown in FIG. 14. These data storage areas are referred to as "stripes" and can store, for example, 32 KB of data.
Redundant data (parity) storage areas P0-3, P4-7, P8-11, and P12-15 are also provided. Redundant data storage area P0-3 contains the exclusive OR (XOR) values of the data stored in stripe 0 (D0) through stripe 3 (D3), calculated as follows:

(P0-3) = (data from stripe 0) XOR (data from stripe 1) XOR (data from stripe 2) XOR (data from stripe 3)

In general, a redundant data storage area Pm-n will hold the exclusive OR (XOR) values of the data stored in stripe m (Dm) through stripe n (Dn). This redundant data will be used if one of the hard disk drives in the disk array device fails. A description of the read and write operations performed when one of the hard disk drives has failed, a state referred to as "degraded mode", will be presented later.
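The general parity rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation; the helper names (`xor_blocks`, `parity`) are hypothetical, and the tiny byte strings stand in for the 32 KB stripes described above.

```python
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity(stripes: list) -> bytes:
    """Pm-n = Dm XOR D(m+1) XOR ... XOR Dn, per the formula above."""
    return reduce(xor_blocks, stripes)

# Tiny stand-in stripes (2 bytes instead of 32 KB) for illustration:
d0, d1, d2, d3 = b"\x01\x00", b"\x02\x00", b"\x04\x00", b"\x08\x00"
p0_3 = parity([d0, d1, d2, d3])
assert p0_3 == b"\x0f\x00"
```

Because XOR is associative and commutative, the order in which the stripes are combined does not matter, which is what lets the array recompute or update parity incrementally in the write methods described later.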
Stripe m through stripe n in the data storage area, together with the redundant storage area Pm-n, are referred to as a stripe group. The disk array device is seen by the host computer as a single logical storage device wherein stripe 0 through stripe 15 are continuous. Physically, however, the data is stored as stripes on a plurality of hard disk drives.
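The logical-to-physical mapping implied by FIG. 14 can be sketched as follows. This is a hedged reconstruction: the drive numbers are borrowed from FIG. 13, and the fixed parity drive and round-robin data layout are assumptions made to match the examples in the text (stripe 4 on drive 111, stripe 5 on drive 112), not a layout the patent spells out; real RAID Level 5 arrays typically rotate parity across all drives.

```python
DRIVES = [111, 112, 113, 114]   # data drives, numbering borrowed from FIG. 13
PARITY_DRIVE = 115              # assumed fixed parity drive for this sketch
STRIPE_SIZE = 32 * 1024         # 32 KB per stripe, as in the example above

def locate(stripe: int) -> tuple:
    """Map a logical stripe number to (drive id, row within that drive).

    Assumes the layout implied by FIG. 14: stripe k sits on data drive
    k % 4 at row k // 4; parity P(4r)-(4r+3) sits on the parity drive
    at row r.
    """
    return DRIVES[stripe % len(DRIVES)], stripe // len(DRIVES)

assert locate(4) == (111, 1)   # stripe 4 is on drive 111, as in the text
assert locate(5) == (112, 1)   # stripe 5 is on drive 112
```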
First, the operations involved when a host computer 100 reads data from a disk array device 101 will be described. For example, to read data stored in stripe 4 and stripe 5, host computer 100 issues a read request for this data to disk array device 101. Disk array device 101 reads stripe 4 on hard disk drive 111 and stripe 5 on hard disk drive 112 and transfers the data to host computer 100.
Next, the operations involved when host computer 100 writes data to disk array device 101 will be described. Since write operations performed by disk array device 101 involve redundant data, the methods used are different from those that would be used with a standalone hard disk drive. When writing data, the redundant data needs to be updated as well. There are two methods for updating the redundant data.
For example, in writing data to stripe 6, one method involves reading data from stripe 4, stripe 5, and stripe 7, which all belong to the same stripe group as stripe 6. Then the following operations are performed to determine the redundant data:

(P4-7) = (data from stripe 6) XOR (data from stripe 4) XOR (data from stripe 5) XOR (data from stripe 7)

Then, the data is written to stripe 6 and the redundant data is written to parity storage area P4-7.
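The first update method can be sketched as follows: the parity is recomputed from scratch out of the new data and the sibling stripes read back from the same group. The function name is hypothetical and the one-byte buffers stand in for full 32 KB stripes.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def recompute_parity(new_d6: bytes, d4: bytes, d5: bytes, d7: bytes) -> bytes:
    """(P4-7) = new D6 XOR D4 XOR D5 XOR D7, per the formula above."""
    p = new_d6
    for sibling in (d4, d5, d7):
        p = xor_blocks(p, sibling)
    return p

d4, d5, d7 = b"\x01", b"\x02", b"\x04"
new_d6 = b"\x08"
assert recompute_parity(new_d6, d4, d5, d7) == b"\x0f"
```

This method costs one read per sibling stripe in the group, which grows with the number of drives in the array.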
In a second method for writing data to stripe 6, the data contained in stripe 6 before the write operation (referred to as the "old stripe 6 data") and the data contained in parity storage area P4-7 before the write operation (referred to as the "old parity storage area P4-7 data") are read. The stripe 6 data to be written (the new stripe 6 data) is used to calculate the redundant data using the following operations:

(P4-7) = (new stripe 6 data) XOR (old stripe 6 data) XOR (old P4-7 data)

Next, the data is written to stripe 6 and the redundant data is written to P4-7.
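The second method, a read-modify-write update, can be sketched as follows. The function name is hypothetical; the example values are chosen so the result matches what the first method would produce on the same stripe group.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def rmw_parity(new_d6: bytes, old_d6: bytes, old_p4_7: bytes) -> bytes:
    """(new P4-7) = new D6 XOR old D6 XOR old P4-7.

    XOR-ing the old data out of the old parity and XOR-ing the new
    data in needs only two reads, regardless of group size."""
    return xor_blocks(xor_blocks(new_d6, old_d6), old_p4_7)

# Old group: D4=01, D5=02, D6=04, D7=08, so old P4-7 = 0f.
# Writing new D6=10 must give parity 10^01^02^08 = 1b.
assert rmw_parity(b"\x10", b"\x04", b"\x0f") == b"\x1b"
```

The trade-off between the two methods is the number of reads: the first scales with the group size, while the read-modify-write method always needs exactly two reads (old data and old parity).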
Disk array devices require a greater number of accesses to hard disk drives since it is necessary to write and update redundant data in addition to the main data.
The following is a description of read operations performed in degraded mode by a disk array device that uses parity for redundant data. In this example, hard disk drive 111 in FIG. 14 becomes inaccessible due to failure and the data stored in stripe 4 and stripe 5 must be read. The disk array device first tries to read the first half of the data from stripe 4 but cannot perform the read operation due to the failure. In this case, disk array device 101 reads redundant data P4-7 and the data on stripe 5, stripe 6, and stripe 7. The following operations are then performed to recover the data on stripe 4:

(Data from stripe 4) = (P4-7) XOR (data from stripe 5) XOR (data from stripe 6) XOR (data from stripe 7)

The recovered data from stripe 4 and the data read from stripe 5 are transferred to the host computer. Thus, data can be read even if one of the hard disk drives has failed.
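The degraded-mode recovery above can be sketched as follows. It works because XOR-ing the parity with every surviving stripe in the group cancels those stripes out, leaving exactly the lost one. The function name is hypothetical and the one-byte buffers stand in for full stripes.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def recover_stripe(p4_7: bytes, d5: bytes, d6: bytes, d7: bytes) -> bytes:
    """D4 = P4-7 XOR D5 XOR D6 XOR D7, per the degraded-mode formula."""
    lost = p4_7
    for surviving in (d5, d6, d7):
        lost = xor_blocks(lost, surviving)
    return lost

d4, d5, d6, d7 = b"\x01", b"\x02", b"\x04", b"\x08"
p4_7 = b"\x0f"  # XOR of the four stripes in the group
assert recover_stripe(p4_7, d5, d6, d7) == d4
```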
Of course, if the disk array device knows beforehand that hard disk drive 111 has failed, the read operation for stripe 4 can be omitted.
To be able to reconstruct data correctly in degraded mode, the correct redundant data (parity) must be generated. There are two methods for keeping the data and the redundant data consistent (parity generation).
In the first method, the following operations are performed for each of the stripe groups:

(Pm-n) = (data from stripe m) XOR . . . XOR (data from stripe n)
In the second method, zeros are written to each of the hard disk drives that are in the disk array. Compared to the first method, which requires calculations to be performed, the second method simply writes predetermined data to the hard disk drive, thus simplifying the operation. As long as parity, as described above, is maintained, non-zero data can be used.
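Both initialization methods can be illustrated in a few lines. The second method works because the XOR of all-zero stripes is itself zero, so zeroing every drive, parity area included, yields a consistent group without any computation. The variable names are hypothetical.

```python
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

# Method 1: compute parity from whatever data the stripes already hold.
stripes = [b"\x01", b"\x02", b"\x04", b"\x08"]
p = reduce(xor_blocks, stripes)
assert p == b"\x0f"

# Method 2: write zeros everywhere; the zeroed parity stripe is then
# consistent by construction, with no XOR calculation required.
zeroed = [b"\x00"] * 4
assert reduce(xor_blocks, zeroed) == b"\x00"
```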
When a hard disk drive has failed, a new hard disk drive must be connected in place of the failed hard disk drive to restore normal operations. If the system is equipped with a spare hard disk drive, this spare drive can be used.
To restore data, all the data that had been in the failed hard disk drive 111 must be reconstructed on the newly connected (or spare) hard disk drive. RAID controller 105 restores data for stripe 0, stripe 4, stripe 8, and stripe 12, in that order. To restore the data for stripe 0, the following operations are performed:

(Data for stripe 0) = (P0-3) XOR (data from stripe 1) XOR (data from stripe 2) XOR (data from stripe 3)

The restored stripe 0 data is then written to the newly connected hard disk drive, thus recovering the data from stripe 0. The data from stripe 4, stripe 8,
Kaneda Yasunori
Oeda Takashi
Teraoka Tadahiro
Beausoleil Robert
Bonzo Bryce P.
Hitachi , Ltd.
Mattingly, Stanger & Malur, P.C.