Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2002-01-07
2004-02-03
Sparks, Donald (Department: 2187)
C711S156000, C714S048000, C714S718000, C714S746000, C714S799000
active
06687791
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to error detection in storage systems.
2. Description of the Related Art
Many storage arrays provide protection against data loss by storing redundant data. Such redundant data may include parity information (e.g., in systems using striping) or additional copies of data (e.g., in systems providing mirroring). A storage system's ability to reconstruct lost data may depend on how many failures occur before the attempted reconstruction. For example, some RAID (Redundant Array of Independent/Inexpensive Disks) systems may only be able to tolerate a single disk failure or error. Once a single disk fails or loses data through an error, such systems are said to be operating in a degraded mode because if additional disks fail before the lost data on the failed or erroneous disk has been reconstructed, it may no longer be possible to reconstruct the lost data. The longer a storage array operates in a degraded mode, the more likely it is that an additional failure will occur. As a result, it is desirable to detect and repair disk failures or other anomalies so that a storage array is not operating in a degraded mode.
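As an illustration of the single-failure tolerance described above, the following minimal sketch computes byte-wise XOR parity over a stripe and rebuilds one lost block from the survivors. XOR parity is assumed here as one common form of redundant data; the names `xor_parity` and `reconstruct` and the sample stripe are hypothetical and do not come from the patent.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical three-disk stripe plus one parity block.
stripe = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]
parity = xor_parity(stripe)

# Simulate losing the second block, then rebuild it.
rebuilt = reconstruct([stripe[0], stripe[2]], parity)
```

If a second block were lost before `reconstruct` ran, the XOR of the remaining blocks would no longer determine either missing block, which is the degraded-mode risk the paragraph describes.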
Errors that may cause a storage system to operate in a degraded mode include transmission errors, total disk failures, and disk errors. Transmission and disk errors may cause less data vulnerability or data loss than failures, but they may be more difficult to detect. For example, disk drives may occasionally corrupt data, and this corruption may not be detected by the storage system until the data is read from the disk. The corruptions may occur for various different reasons. For example, bugs in a disk drive controller's firmware may cause bits in a sector to be modified or may cause blocks to be written to the wrong address. Such bugs may cause storage drives to write the wrong data, to write the correct data to the wrong place, or to not write any data at all. Another source of errors may be a drive's write cache. Many disk drives use write caches to quickly accept write requests so that the host or array controller can continue with other commands. The data is later copied from the write cache to the disk media. However, write cache errors may cause some acknowledged writes to never reach the disk media. The end result of such bugs or errors is that the data at a given block may be corrupted or stale. Errors such as drive errors and transmission errors may be “silent” in the sense that no error messages are generated when such errors occur.
In general, it is desirable to detect errors soon after they occur so that a storage system is not operating in a degraded mode for an extended time. However, error detection mechanisms are often expensive to implement (e.g., if they require a user to purchase additional or more expensive hardware and/or software) and/or have a detrimental impact on storage system performance. Thus, it is desirable to allow users to select whether to purchase the error detection mechanism independently of the overall system and/or to allow users to be able to independently enable and disable the error detection mechanism.
SUMMARY
Various embodiments of a method and system for sharing a cache are disclosed. In one embodiment, a processing device includes a shared cache and a plurality of processors that are each coupled to the shared cache and each configured to store a result in the shared cache. The processors generate their results by performing the same data integrity operation (e.g., a parity calculation) on the same data. The shared cache may be included on the same semiconductor substrate as a first processor. Because the results are stored in the shared cache, the first processor may quickly access and operate on the results. In one embodiment, the first processor may perform a comparison operation or voting operation on the results stored in the shared cache.
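The comparison and voting operations mentioned above can be sketched as follows. This is only an illustration under assumptions: the results are modeled as Python `bytes` values, a strict majority rule is assumed for voting, and the function names `compare_results` and `vote` are hypothetical rather than taken from the patent.

```python
from collections import Counter


def compare_results(results):
    """Comparison operation: True only if every result is identical."""
    return all(r == results[0] for r in results)


def vote(results):
    """Voting operation: return (winning_result, indices_of_dissenters).

    Assumes a strict majority is required; raises ValueError otherwise.
    """
    if not results:
        raise ValueError("no results to vote on")
    winner, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise ValueError("no strict majority among results")
    dissenters = [i for i, r in enumerate(results) if r != winner]
    return winner, dissenters


# Hypothetical results from three processors; the third one disagrees.
results = [b"\xee\x22\x3c", b"\xee\x22\x3c", b"\x00\x22\x3c"]
all_agree = compare_results(results)
winner, dissenters = vote(results)
```

With only two processors, a comparison can reveal a mismatch but cannot say which result is wrong; voting needs at least three results to identify a dissenter.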
In one embodiment, the shared cache may be multi-ported and each of the shared cache's ports may correspond to a respective one of the processors. Each processor may have a dedicated connection between itself and a respective one of the shared cache's ports. In other embodiments, the processors may be coupled to the shared cache by a bus.
In some embodiments, the shared cache may be the first processor's L1 (level 1) cache. The plurality of processors may be integrated onto the same semiconductor substrate as the first processor. In some embodiments, the first processor may not be included in the plurality of processors that are each storing a result in the shared cache.
In several embodiments, each of the plurality of processors may include its own cache, and each of the plurality of processors may be configured to operate on data and instructions stored in its own cache in order to generate the result. In an alternative embodiment, each of the plurality of processors may be configured to operate on data and instructions stored in the shared cache in order to generate the result. In one embodiment, each of the plurality of processors may only be able to access the shared cache when in a first mode (e.g., a data integrity mode).
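A rough software model of the port-per-processor arrangement and the mode restriction described above might look like the sketch below. The class `SharedCache`, its methods, and the mode constants are hypothetical stand-ins chosen for illustration; the patent describes hardware, and nothing here reflects its actual interfaces.

```python
DATA_INTEGRITY_MODE = "data_integrity"  # hypothetical mode names
NORMAL_MODE = "normal"


class SharedCache:
    """Toy model: one result slot per port, one port per processor."""

    def __init__(self, num_ports):
        self._slots = [None] * num_ports

    def store(self, port, result, mode):
        """Store a result through a port; allowed only in data integrity mode."""
        if mode != DATA_INTEGRITY_MODE:
            raise PermissionError("shared cache is accessible only in data integrity mode")
        self._slots[port] = result

    def results(self):
        """Return a copy of the stored results for the first processor to read."""
        return list(self._slots)


cache = SharedCache(num_ports=2)
cache.store(0, 0x07, mode=DATA_INTEGRITY_MODE)
```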
In one embodiment, the processing device may be included in a data processing system that includes a host system, an interconnect, and a storage array.
In one embodiment, a method of sharing a cache between multiple processors involves a plurality of processors each performing the same data integrity operation on the same data to generate a result, the plurality of processors storing their results in the shared cache, and the first processor accessing the results in the shared cache.
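The method above can be approximated in software as a minimal sketch, assuming threads stand in for the processors and a plain list stands in for the shared cache. The names `parity_byte` and `run_redundant` and the sample data are hypothetical and chosen only for illustration.

```python
import threading


def parity_byte(data):
    """Data integrity operation: XOR of all bytes in the input."""
    p = 0
    for b in data:
        p ^= b
    return p


def run_redundant(data, num_workers=3, operation=parity_byte):
    """Have each worker run the same operation on the same data.

    Each worker writes to its own slot in a shared list, which stands in
    for the shared cache; the caller plays the role of the first processor.
    """
    shared_results = [None] * num_workers

    def worker(slot):
        shared_results[slot] = operation(data)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every result has been stored
    return shared_results


results = run_redundant(b"\x01\x02\x04")
results_agree = len(set(results)) == 1
```

Giving each worker its own slot avoids write contention, loosely mirroring the embodiment in which each processor has a dedicated port into the shared cache.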
In one embodiment, a processing device may include a plurality of means for processing data (e.g., processors such as those shown in FIGS. 10-12) and means for storing data (e.g., a shared cache like those shown in FIGS. 10-12). The means for storing data may be integrated on the same semiconductor substrate as at least one of the means for processing data. Each of the means for processing data is coupled to the means for storing data and configured to store a result in the means for storing data. Each of the means for processing data may generate its result by performing the same data integrity operation on the same data as each of the other means for processing data.
Chace Christian
Kowert Robert C.
Meyertons Hood Kivlin Kowert & Goetzel P.C.
Sparks Donald
Sun Microsystems Inc.