Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
1999-08-31
2002-06-18
Beausoleil, Robert (Department: 2785)
C714S042000, C714S054000
active
06408406
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to hard disk drives. More particularly, the present invention relates to a functional test for determining whether a hard disk drive has experienced an early-life failure, or may fail in the near future.
2. Description of the Related Art
Hard disk drives store large volumes of data on one or more disks mounted on a spindle assembly. Disk drives employ a disk control system for interfacing with a host (e.g., a computer) to control the reading and writing of data on a disk. Each disk includes at least one disk surface which is capable of storing data. On each disk surface, user data is stored in concentric circular tracks between an outside diameter and an inside diameter of the disk.
As a result of the manufacturing process, defective data sites may exist on the disk surfaces of the disk drive. These defective data sites are termed “prior defects”. A defect discovery procedure is performed to locate these defects and mark them out as defective locations on the disk surface which are not available for use. A typical defect discovery procedure includes writing a known data pattern to the disk surface and subsequently reading the data pattern from the disk surface. Defective data sites are identified by comparing the data pattern read from the disk surface with the known data pattern written to the disk surface.
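The write-then-read-back comparison described above can be sketched as follows. This is a minimal illustration against an in-memory stand-in for a disk surface, not the drive firmware's actual procedure; `SimulatedSurface`, `discover_defects`, the site count, and the `0xA5` pattern are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Iterable, List


class SimulatedSurface:
    """Toy in-memory surface (hypothetical). Sites listed in `bad_sites`
    corrupt the first byte of whatever is written to them."""

    def __init__(self, num_sites: int, bad_sites: Iterable[int]) -> None:
        self.num_sites = num_sites
        self.bad_sites = set(bad_sites)
        self._data = {}

    def write(self, site: int, data: bytes) -> None:
        if site in self.bad_sites:
            # Simulate a media defect by flipping every bit of the first byte.
            data = bytes([data[0] ^ 0xFF]) + data[1:]
        self._data[site] = data

    def read(self, site: int) -> bytes:
        return self._data[site]


def discover_defects(
    write_site: Callable[[int, bytes], None],
    read_site: Callable[[int], bytes],
    num_sites: int,
    pattern: bytes,
) -> List[int]:
    """Write a known pattern to every site, read it back, and return the
    sites whose read-back data differs from the pattern."""
    defects = []
    for site in range(num_sites):
        write_site(site, pattern)
        if read_site(site) != pattern:
            defects.append(site)
    return defects


surface = SimulatedSurface(num_sites=8, bad_sites=[2, 5])
prior_defect_list = discover_defects(
    surface.write, surface.read, surface.num_sites, b"\xA5" * 4
)
print(prior_defect_list)  # [2, 5]
```

The returned list plays the role of the prior defect list in this toy model; a real drive would record physical locations rather than simple integer indices.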
Following the defect discovery procedure, defective data sites are put in a prior defect list which is stored in a table. The prior defect list is used during formatting of the disk surface to generate a defect management table. Within the defect management table, the defective data sites may be mapped to data sector locations (cylinder number, head number, and data sector number). Once identified in the defect management table, the defective data sectors may not be used for storing data.
Defective data sites encountered after formatting the disk surface are known as “grown defects”. Grown defects often occur in locations adjacent to defective data sites found during defect discovery. Grown defects are also listed in a table, similar to the one used for “prior defects”. The number of sites marked out on a disk drive as “defective data sites” is used as a measure of the quality of the disk drive. Upon interrogation by a host, the disk drive will report the defect list generated in the defect management table.
Defects such as “prior defects” and “grown defects” are known as hard sector errors. A hard sector error is essentially permanent in nature; thus, the sector cannot be recovered. A disk may also contain transient or “soft” errors. A transient error is defined as an error or defect which clears over a period of time. For example, a transient error may occur due to a thermal asperity on the disk surface. A retry mode may be entered, wherein the command is retried a number of times, allowing sufficient time to pass for the transient error to clear. Transient errors are also logged on the drive as they occur.
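The retry mode described above can be sketched as a bounded retry loop that counts each transient error as it occurs. This is an illustrative sketch only: `TransientError`, `read_with_retries`, `make_flaky_read`, the default retry count, and the choice to raise an error once retries are exhausted are assumptions made for the example, not details taken from the patent.

```python
import time
from typing import Callable, Tuple


class TransientError(Exception):
    """Hypothetical signal for a recoverable ("soft") error."""


def read_with_retries(
    read_op: Callable[[], bytes],
    max_retries: int = 3,
    delay_s: float = 0.0,
) -> Tuple[bytes, int]:
    """Run `read_op`, retrying on TransientError.

    Returns the data plus the number of soft errors seen along the way.
    Raises IOError if the command still fails after `max_retries` retries.
    """
    soft_errors = 0
    for _ in range(max_retries + 1):  # one initial attempt plus the retries
        try:
            return read_op(), soft_errors
        except TransientError:
            soft_errors += 1     # a real drive would log this occurrence
            time.sleep(delay_s)  # give the transient condition time to clear
    raise IOError(f"read not recovered after {max_retries} retries")


def make_flaky_read(failures: int, payload: bytes) -> Callable[[], bytes]:
    """Build a simulated read that fails `failures` times, then succeeds."""
    remaining = [failures]

    def read_op() -> bytes:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError("simulated thermal asperity")
        return payload

    return read_op


data, soft_errors = read_with_retries(make_flaky_read(2, b"ok"))
print(data, soft_errors)  # b'ok' 2
```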
A common problem encountered by disk drive manufacturers is the improper diagnosis of disk drive failures in customer systems. In many instances, functional disk drives improperly diagnosed as defective by customers are unnecessarily returned to the manufacturer, resulting in down time for the customer as well as extra expense to the manufacturer to diagnose the disk. The problem of improper diagnosis of disk failures is particularly acute in drives that are relatively new (e.g., fewer than 600 power-on hours).
Test suites presently exist for testing the condition of a disk drive. These test suites exhaustively test all locations on the surface of the disk for failures. Unfortunately, these test suites require extensive run time (often 30 minutes or longer), and often require special expertise to activate proprietary modes within the disk drive. Therefore, such test suites are rarely used by end customers to diagnose drive problems. Additionally, since such test suites read, write, and verify essentially all storage sites on the disk, such tests will become even slower as disk capacities increase in the future. Finally, test suites may affect customer data stored on the disk drive.
It is desirable to have a functional test for customers to quickly and easily diagnose early-life disk drives in customer systems when a disk failure is suspected, in order to prevent the customer return of properly functioning disk drives. The time required for performing the functional test should be independent of the capacity of the hard disk drive. The functional test should utilize both historical performance parameters (such as “hard” and “soft” errors and “prior” and “grown” defects) continually logged on the disk, and active read/write/verify operations to the most susceptible/critical data sites on the disk in order to determine the operating condition of the disk. Finally, the functional test should not disturb customer data while testing the surface of the disk.
SUMMARY OF THE INVENTION
The present invention provides a method of functionally testing a potentially defective disk drive having data sites on a disk for recording data thereon. During the operation of the disk drive, the disk drive stores a plurality of historical performance parameters for continuously logging operational problems.
The method begins by performing an analysis of the stored historical performance parameters. A set of performance thresholds associated with each of the plurality of stored historical performance parameters is defined. Next, the stored plurality of historical performance parameters is retrieved, and each of the plurality of historical performance parameters is compared against its associated performance threshold. If the value of any historical performance parameter exceeds its associated performance threshold, the disk drive is marked as a failed disk drive.
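The threshold analysis above amounts to a per-parameter comparison. The sketch below is a minimal illustration; the parameter names and the threshold values are hypothetical, since this summary does not give concrete limits.

```python
from typing import Dict, List


def exceeded_parameters(
    logged: Dict[str, int], thresholds: Dict[str, int]
) -> List[str]:
    """Return the names of logged parameters whose value exceeds
    the threshold associated with that parameter."""
    return [name for name, value in logged.items() if value > thresholds[name]]


# Hypothetical logged values and limits, for illustration only.
logged = {"grown_defects": 12, "reassignments": 1, "uncorrected_read_errors": 0}
thresholds = {"grown_defects": 10, "reassignments": 5, "uncorrected_read_errors": 0}

failed_on = exceeded_parameters(logged, thresholds)
drive_failed = bool(failed_on)  # any exceeded threshold marks the drive failed
print(drive_failed, failed_on)  # True ['grown_defects']
```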
If none of the performance thresholds are exceeded, the method next performs a set of non-destructive read/write tests to selected regions of the disk. A set of performance thresholds associated with each of the set of non-destructive read/write tests is defined. Next, the set of non-destructive read/write tests is run, generating a set of results. The results of each of the non-destructive read/write tests are then compared against the associated performance threshold. If the results of the non-destructive read/write tests exceed the associated performance threshold, the disk drive is marked as a failed disk drive.
In one embodiment of the present invention, the functional testing method retrieves a power-on time parameter value from the disk drive and compares it against a user defined threshold value; if the power-on time parameter value exceeds the user defined threshold value, the functional test is terminated. The threshold for the power-on time parameter value is set to 600 hours in a preferred embodiment of the present invention.
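The ordering of the stages described in this summary (power-on time gate, then the historical parameter check, then the non-destructive read/write tests) can be sketched as below. This is an assumption-laden outline rather than the patented implementation: `Outcome`, `functional_test`, the dictionary-based parameters, and the example values are hypothetical; only the 600-hour default comes from the text.

```python
from enum import Enum
from typing import Callable, Dict


class Outcome(Enum):
    SKIPPED = "power-on time exceeds the limit; test terminated"
    FAILED = "a performance threshold was exceeded"
    PASSED = "no threshold exceeded"


def functional_test(
    power_on_hours: float,
    history: Dict[str, int],
    history_limits: Dict[str, int],
    run_tests: Callable[[], Dict[str, int]],
    test_limits: Dict[str, int],
    max_power_on_hours: float = 600,  # preferred-embodiment value from the text
) -> Outcome:
    # Stage 0: the test targets early-life drives only.
    if power_on_hours > max_power_on_hours:
        return Outcome.SKIPPED
    # Stage 1: compare each logged historical parameter with its threshold.
    if any(value > history_limits[name] for name, value in history.items()):
        return Outcome.FAILED
    # Stage 2: run the read/write tests only if stage 1 found nothing.
    results = run_tests()
    if any(value > test_limits[name] for name, value in results.items()):
        return Outcome.FAILED
    return Outcome.PASSED


outcome = functional_test(
    power_on_hours=120,
    history={"grown_defects": 2},
    history_limits={"grown_defects": 10},
    run_tests=lambda: {"verify_errors": 0},
    test_limits={"verify_errors": 0},
)
print(outcome.name)  # PASSED
```

Passing the read/write tests in as a callable keeps the sketch independent of how a real host would issue commands to the drive, and ensures they are not run when an earlier stage already decides the outcome.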
The functional testing method of the present invention issues commands to the disk drive which are compliant with SCSI-3 specifications. The plurality of historical parameters used by the present invention includes: counts of soft error rates and reassignments, counts of corrected and uncorrected errors encountered during read, write, and verify operations to the disk drive, and the number of entries found in the grown defect list (GLIST).
The set of non-destructive read/write tests includes: a read/write test of a known pattern to all disk drive heads in a non-customer data area, a verification test of the first 100 megabytes of data on the disk drive, and a series of random inner diameter zone region/outer diameter zone region read and seek tests. The series of random inner diameter zone region/outer diameter zone region read and seek tests includes a random read operation to an inner diameter zone region logical block address (LBA), followed by a random read operation of an outer diameter zone region LBA, followed by a seek to a random inner diameter zone region LBA, followed by a seek to a random ou
Duncan Marc
Shara, Esq. Milad G
Western Digital Technologies Inc.