Cache-failure-tolerant data storage system storing data...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C711S113000

active

06502108

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to data storage systems that utilize tape or other base storage along with high speed cache. More particularly, the invention concerns a data storage system that stores data objects with encapsulated metadata tokens in cache and/or base storage to protect against recalling stale data from base storage in the event of a cache failure.
2. Description of the Related Art
Many data processing systems require a large amount of data storage, for use in efficiently accessing, modifying, and re-storing data. Data storage is typically separated into several different levels, each level exhibiting a different data access time or data storage cost. A first, or highest level of data storage involves electronic memory, usually dynamic or static random access memory (DRAM or SRAM). Electronic memories take the form of semiconductor integrated circuits where millions of bytes of data can be stored on each circuit, with access to such bytes of data measured in nanoseconds. The electronic memory provides the fastest access to data since access is entirely electronic.
A second level of data storage usually involves direct access storage devices (DASD). DASD storage, for example, includes magnetic and/or optical disks. Data bits are stored as micrometer-sized magnetically or optically altered spots on a disk surface, representing the “ones” and “zeros” that comprise the binary value of the data bits. Magnetic DASD includes one or more disks that are coated with remnant magnetic material. The disks are rotatably mounted within a protected environment. Each disk is divided into many concentric tracks, or closely spaced circles. The data is stored serially, bit by bit, along each track. An access mechanism, known as a head disk assembly (HDA), typically includes one or more read/write heads, and is provided in each DASD for moving across the tracks to transfer the data to and from the surface of the disks as the disks are rotated past the read/write heads. DASDs can store gigabytes of data, and the access to such data is typically measured in milliseconds (orders of magnitude slower than electronic memory). Access to data stored on DASD is slower than access to electronic memory due to the need to physically position the disk and HDA to the desired data storage location.
A third or lower level of data storage includes tapes, tape libraries, and optical disk libraries. Access to library data is much slower than access to electronic or DASD storage because a robot or human is necessary to select and load the needed data storage medium. An advantage of these storage systems is the reduced cost for very large data storage capabilities, on the order of terabytes of data. Tape storage is often used for backup purposes. That is, data stored at the higher levels of the data storage hierarchy is reproduced for safekeeping on magnetic tape. Access to data stored on tape and/or in a library is presently on the order of seconds.
Data storage, then, can be conducted using different types of storage, where each type exhibits a different data access time or data storage cost. Rather than using one storage type to the exclusion of others, many data storage systems include several different types of storage together, and enjoy the diverse benefits of the various storage types. For example, one popular arrangement employs an inexpensive medium such as tape to store the bulk of data, while using a fast-access storage such as DASD to cache the most frequently or recently used data.
During normal operations, synchronization between cache and tape is not critical. If a data object is used frequently, it is stored in cache and that copy is used exclusively to satisfy host read requests, regardless of whether the data also resides on tape. Synchronization can be problematic, however, if the cache and tape copies of a data object diverge over time and the data storage system suffers a disaster. In this case, the cache and tape contain different versions of the data object, with one version being current and the other being outdated. It may then be unclear which version of the data object is current. At worst, a stale or “down-level” version of a data object may be mistaken for (and subsequently used as) the current version. Thus, in the event of cache failure, data integrity may be questionable and there is some risk of the data storage system incorrectly executing future host read requests by recalling a stale version of the data.
SUMMARY OF THE INVENTION
Broadly, the present invention concerns a cache-equipped data storage system that stores data objects with encapsulated metadata tokens to protect against recalling stale data from base storage in the event of a cache failure. The storage system includes a controller coupled to a cache, base storage, and token database. The controller may be coupled to a hierarchically superior director or host.
When a data object is received for storage, the controller assigns a version code for the data object if the data object is new to the system; if the data object already exists, the controller advances the data object's version code. A “token,” made up of various items of metadata including the version code, is encapsulated for storage with its corresponding data object. The controller then stores the encapsulated token along with its data object and updates the token database to cross-reference the data object with its token. Thus, the token database always lists the most recent version code for each data object in the system.
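The storage sequence just described can be illustrated with a minimal sketch. It is not the patented implementation: the names Token, Controller, and store are hypothetical, plain dictionaries stand in for the cache and the token database, and the version code is assumed to be an integer that starts at 1 and advances by 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token layout: metadata stored with a data object."""
    object_id: str
    version: int  # assumed integer version code


class Controller:
    """Toy controller; dicts stand in for the cache and token database."""

    def __init__(self):
        self.cache = {}     # object_id -> (Token, data)
        self.token_db = {}  # object_id -> most recent Token

    def store(self, object_id, data):
        previous = self.token_db.get(object_id)
        # New object: assign an initial version code.
        # Existing object: advance the version code.
        version = 1 if previous is None else previous.version + 1
        token = Token(object_id, version)
        # Store the token together with its data object, then update the
        # token database so it lists the most recent version code.
        self.cache[object_id] = (token, data)
        self.token_db[object_id] = token
        return token


controller = Controller()
first = controller.store("VOL001", b"alpha")   # new object
second = controller.store("VOL001", b"beta")   # same object, rewritten
print(first.version, second.version)
```

In this sketch the token database is updated on every store, which is what allows it to serve later as the reference for the most recent version code.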
The data object may be copied from cache to base storage automatically, de-staged from cache to base storage based on lack of frequent or recent use, or according to another desired schedule. Whenever the controller experiences a cache miss, there is danger in blindly retrieving the data object from base storage. In particular, the cache miss may have occurred due to failure of part or all of the cache, and at the time of cache failure the base storage might have contained a down-level version of the data object. The present invention solves this problem by comparing the version code of the data object from base storage to the version code of the data object in the token database. Only if the compared version codes match is the data object read from storage and provided as output. Otherwise, an error message is generated since the data object is stale.
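The cache-miss check described above can be sketched as follows, under simplifying assumptions: dictionaries stand in for the cache, base storage, and token database, the version code is a plain integer, and the names read_object and StaleDataError are hypothetical rather than taken from the patent.

```python
class StaleDataError(Exception):
    """Hypothetical error raised when base storage holds a down-level copy."""


def read_object(object_id, cache, base_storage, token_db):
    """Return the data for object_id, refusing stale base-storage copies."""
    if object_id in cache:
        # Cache hit: the cached copy satisfies the read.
        return cache[object_id][1]
    # Cache miss: compare the version code stored with the object in base
    # storage against the version code listed in the token database.
    stored_version, data = base_storage[object_id]
    if stored_version != token_db[object_id]:
        raise StaleDataError(
            f"{object_id}: base storage has version {stored_version}, "
            f"token database lists {token_db[object_id]}"
        )
    return data


# Simulated state after a cache failure: the cache is empty.
cache = {}
token_db = {"A": 2, "B": 1}                          # most recent version codes
base_storage = {"A": (1, b"old"), "B": (1, b"ok")}   # "A" is down-level

print(read_object("B", cache, base_storage, token_db))
try:
    read_object("A", cache, base_storage, token_db)
except StaleDataError as err:
    print("error:", err)
```

Here object "B" is returned because its version codes match, while object "A" produces an error instead of silently returning the outdated copy.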
As a further enhancement, the invention may utilize a “split” version code, where the version code has a data subpart and properties subpart. The data subpart is advanced solely to track changes to the data, while the properties subpart is advanced according to changes in attributes of the data object other than the data itself. In this embodiment, when the data object's version code from base storage is examined after a cache miss, the data subpart is reviewed without regard to the properties subpart. This avoids the situation where, although the base storage contains a current version of data, this data object would be regarded as stale because a non-split version code that does not make any data/properties differentiation has been advanced due to a change in the data object's properties not affecting the data itself. Accordingly, with this feature, data objects from base storage are more frequently available to satisfy cache misses.
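A minimal sketch of the split version code follows. The class name SplitVersion, the function is_current, and the use of two integers for the subparts are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitVersion:
    """Hypothetical split version code with two integer subparts."""
    data: int        # advanced only when the data itself changes
    properties: int  # advanced when other attributes of the object change


def is_current(base_version, db_version):
    """After a cache miss, compare only the data subparts."""
    return base_version.data == db_version.data


db_version = SplitVersion(data=3, properties=5)       # token database entry
properties_only = SplitVersion(data=3, properties=4)  # properties changed later
data_changed = SplitVersion(data=2, properties=5)     # data changed later

print(is_current(properties_only, db_version))  # data subparts match
print(is_current(data_changed, db_version))     # data subparts differ
print(properties_only == db_version)            # whole-code comparison
```

The last comparison shows what a non-split check would do: the whole codes differ, so the base storage copy would be rejected even though its data is current.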
Accordingly, as discussed above, one embodiment of the invention involves a method of operating a cache-equipped data storage system. In another embodiment, the invention may be implemented to provide an apparatus, such as a data storage system configured as discussed herein. In still another embodiment, the invention may be implemented to provide a signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital data processing apparatus to perform operations for operating a data storage system. Another embodiment concerns logic circuitry having multiple interconnected electrically conductive elements
