Method and system for recovering lost data

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S036000, C711S114000, C711S156000

active

06192484

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a disk array system, and more particularly to a disk array system for storing data in a plurality of magnetic disk devices (hard disk drive (HDD)) using a redundancy configuration.
2. Description of the Related Art
In the past, a redundant array of inexpensive disks (RAID) Level 0, which stores data on a single HDD, was the norm. Thereafter, so as to further enhance reliability, RAID Level 1 disk array systems, which store the same data on a plurality of HDD, and RAID Levels 3, 4 and 5 disk array systems, which store data in a distributed fashion on a plurality of HDD, were utilized.
With a RAID Level 0 magnetic disk device, when data stored on an HDD was lost, it was no longer possible to use that data.
With RAID Levels 1, 3, 4 and 5 disk array systems, data is made redundant, so that even if data stored on one of the built-in HDD units is lost, when the disk array system is viewed as a whole, that data can be restored. For this reason, workstations, network servers and other equipment that requires large-capacity external storage systems have come to make use of disk array systems that utilize RAID Level 1, 3, 4 or 5 arrays.
The operation of a dual magnetic disk device, called RAID Level 1, is explained using FIG. 9. When a disk array controller 53 receives a write request from a host, the write-requested data 60 is written to both HDD 54₁ and 54₂. When, as a result of this write operation, it is possible to read the data from both HDD, by comparing HDD 54₁ data against HDD 54₂ data, a highly accurate data read is possible.
Even when it is not possible to read data from one of the HDD, the data can still be obtained by reading it from the other HDD. For example, when it is not possible to read data from HDD 54₁, the data can be obtained by reading it from HDD 54₂ alone.
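The mirrored write and the fallback read described above can be sketched as follows. This is a minimal illustration only, not the controller of FIG. 9: the class name MirroredPair, the in-memory dictionaries standing in for HDD 54₁ and 54₂, and the unreadable flags are all assumptions introduced for the example.

```python
class MirroredPair:
    """Toy RAID Level 1 pair; two dicts stand in for the two HDD (assumption)."""

    def __init__(self):
        self.disks = [{}, {}]             # block address -> data
        self.unreadable = [False, False]  # hypothetical per-drive failure flag

    def write(self, addr, data):
        # The same write-requested data goes to both drives.
        for disk in self.disks:
            disk[addr] = data

    def read(self, addr):
        # Collect the copies that can still be read.
        copies = [disk[addr]
                  for disk, bad in zip(self.disks, self.unreadable)
                  if not bad and addr in disk]
        if not copies:
            raise IOError("data cannot be read from either drive")
        # When both copies are readable, compare them before returning.
        if len(copies) == 2 and copies[0] != copies[1]:
            raise IOError("mirror copies disagree")
        return copies[0]


pair = MirroredPair()
pair.write(0, b"payload")
pair.unreadable[0] = True          # simulate losing the first drive
assert pair.read(0) == b"payload"  # data is still served by the second drive
```

Raising an error when the two copies disagree is one possible policy chosen for the sketch; the text states only that the comparison makes a more accurate read possible.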
FIG. 10 is a schematic depicting the configuration of a disk array system 51 when a redundancy configuration, called RAID Level 4, is used. When a disk array controller 53 receives a write request from a host, the write-requested data 60 is divided into sector units and written in a distributed fashion to storage regions 58₁-58₄ of HDD 54₁-54₄.
The disk array controller 53 does not simply distribute the data at this time. That is, it performs an Exclusive OR (XOR) operation on data D₁, D₂, D₃, D₄ stored in corresponding storage regions 58₁-58₄, and writes the result of this operation, parity P, to storage region 58₅ of HDD 54₅. This XOR operation provides redundancy to the data. Therefore, all of the data is stored in a format that is capable of being reconstructed on the basis of the other data and the parity.
For example, the result of an XOR operation on parity P and data D₁, D₂, D₃ is the same as data D₄. Consequently, if it is not possible to read storage region 58₄, when the data in the other HDD storage regions 58₁-58₃ and the parity are read out and subjected to an XOR operation, data D₄ can be obtained without reading storage region 58₄.
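The parity relationship can be checked with a short sketch. It assumes equally sized byte blocks for D₁ through D₄; the helper name xor_blocks and the sample byte values are illustrative and do not come from the specification.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("blocks must be non-empty in number and equal in length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Sample data standing in for D1..D4 (arbitrary values, assumption).
d1, d2, d3, d4 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"

# Parity P is the XOR of all four data blocks.
parity = xor_blocks(d1, d2, d3, d4)

# If the region holding D4 cannot be read, XOR the other data with the parity.
recovered_d4 = xor_blocks(d1, d2, d3, parity)
assert recovered_d4 == d4
```

The same call recovers any single missing block, because XOR is its own inverse: XORing the parity with the three surviving blocks cancels them out and leaves the fourth.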
Further, with a conventional disk array system, data specifying a failed HDD is stored in volatile memory. Consequently, when the system goes down as a result of a power outage or a similar event, the data specifying the failed HDD is lost.
By contrast, various disk array systems have been proposed in which data specifying a failed HDD is stored in nonvolatile memory.
For example, in Japanese Patent Publication No. A7-56694, a control method for a system, which uses nonvolatile memory to store the status of a magnetic disk device, is proposed.
FIG. 11 depicts the configuration of this conventional disk array system. The disk array system 51 comprises an interface 52, a magnetic disk controller (disk array controller) 53, a plurality of magnetic disk devices (HDD) 54₁-54₅ (five in this example), nonvolatile memory 55₁-55₅ corresponding to each HDD, and a clock 56.
The interface 52 inputs data access requests from a host 61 (either read or write requests) to the disk array controller 53.
The disk array controller 53 comprises means for performing a data read or write operation by controlling the HDD 54 in accordance with the contents of a request output from the host, and means for determining the status of each HDD based on data in nonvolatile memory 55, and data stored in a storage management data storage region 57 created on the respective disk media. The clock 56 is used to rewrite date and time data in nonvolatile memory 55 storage management data when a failure occurs.
An overview of this prior example is provided by referring to FIGS. 12 and 13. This prior example uses a variable “i”, which specifies in sequence a plurality of HDD, a parameter “N_DISK”, which specifies a failed HDD when a failed HDD exists, and a parameter “N_ERR”, in which the count value of the number of failed HDD is stored. Also, storage management data stored in each nonvolatile memory 55 depicted in FIG. 11 is stored in array A(i), and storage management data stored in the storage management data storage region 57 is stored in array B(i).
Since this prior example employs a redundancy configuration that makes data recovery possible even if one entire disk's worth of data is lost, when the value of “N_ERR”, in which the count value of the number of failed HDD is stored, is 2 or more, it treats the entire disk array system as abnormal. And, based on the parameter which specifies a failed HDD, this prior example determines whether or not the failed HDD was replaced with a new HDD, and when it determines that the device is a new HDD, it automatically performs recovery processing.
Further, with this prior example, at initialization, the disk array controller 53 stores storage management data containing date/time information in the storage management data storage region 57 of the HDD, and in the nonvolatile memory 55 provided with the pertinent HDD. Also, when an HDD fails, the disk array controller stores the date and time the failure occurred in the nonvolatile memory provided with the failed HDD. Therefore, it is possible to check whether or not the pertinent HDD is the failed disk by comparing the contents of nonvolatile memory 55 with the contents of the storage management data storage region 57.
Specifically, first, as shown in FIG. 12, “0” is set in array variables A(i), B(i), “1” is set in counter i, “0” is set in failed HDD counter N_ERR, and “0” is set in failed HDD identification parameter N_DISK (S201).
Then, the i-th nonvolatile memory is tested, and when the nonvolatile memory is not normal (S202: N), that HDD is determined to be unusable, N_ERR is incremented by “1”, i is set in N_DISK (S207) and processing proceeds to S208.
When nonvolatile memory is normal (S202: Y), the contents of that nonvolatile memory are written to array variable A(i). Next, the i-th HDD storage management data storage region is tested, and when the storage management data storage region is not normal (S204: N), processing proceeds to S207. When the storage management data storage region is normal (S204: Y), the contents of that storage management data storage region are set in B(i) (S205).
When array variables A(i) and B(i) do not match (S206: N), since the i-th HDD is not a normal HDD, N_ERR is incremented by “1”, and i is set in N_DISK (S207). Next, “1” is added to i (S208), and when i is not greater than the number of HDD (S209: Y), processing returns to S202 and the next nonvolatile memory and storage management data storage region are tested. This operation is repeated until i becomes greater than the total number of HDD.
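The scan loop of FIG. 12 described above can be sketched as follows. This is an illustration under stated assumptions, not the controller's actual firmware: the function name scan_drives is hypothetical, and each input list holds the stored management data for one HDD, with None standing in for a nonvolatile memory or storage region that fails its test.

```python
def scan_drives(nvram, regions):
    """Count failed HDD by comparing nonvolatile memory with on-disk data.

    nvram[i - 1] and regions[i - 1] hold the management data for the i-th
    HDD, or None when that memory or region is not normal (assumption).
    Both lists are assumed to have one entry per HDD.
    """
    total = len(nvram)
    a = [0] * (total + 1)   # A(i), 1-based; index 0 unused
    b = [0] * (total + 1)   # B(i), 1-based; index 0 unused
    n_err = 0               # failed HDD counter N_ERR
    n_disk = 0              # failed HDD identification parameter N_DISK
    i = 1                   # S201: initial values

    while i <= total:                          # S209: more HDD to test
        failed = True
        if nvram[i - 1] is not None:           # S202: Y, memory is normal
            a[i] = nvram[i - 1]
            if regions[i - 1] is not None:     # S204: Y, region is normal
                b[i] = regions[i - 1]          # S205
                failed = a[i] != b[i]          # S206: mismatch means failure
        if failed:                             # S207
            n_err += 1
            n_disk = i
        i += 1                                 # S208
    return n_err, n_disk, a, b


# Three HDD; the second one carries a different date/time stamp in nonvolatile memory.
n_err, n_disk, _, _ = scan_drives(["t0", "t9", "t0"], ["t0", "t0", "t0"])
assert (n_err, n_disk) == (1, 2)
```

Note that N_DISK keeps only the most recently detected failed HDD, which is sufficient because the identifier is used only when exactly one HDD has failed.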
Next, the number of failed HDD N_ERR is determined (S301). When N_ERR is “0”, all HDD are normal and the startup operation ends. When N_ERR is “2” or larger, an error message is output (S302), and the restart operation ends. When N_ERR is “1”, A(N_DISK) is compared to “0”, and when A(N_DISK) is “0” (S303: Y), nonvolatile memory is det
