Cache management system utilizing cascading tokens

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate


Details

Type: Reexamination Certificate
Status: active
Patent number: 06691137


BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to data storage systems that employ base storage along with a high speed cache. More particularly, the invention concerns a data storage system that assigns tokens to data objects stored in cache or base storage. For each data object, a token database tracks “cascading” tokens that include an “anywhere” token and a “base” token. The data storage system uses these cascading tokens to support functions such as grooming the cache, de-staging data from cache to base storage, and processing cache miss events.
2. Description of the Related Art
Many data processing systems require a large amount of data storage, for use in efficiently accessing, modifying, and re-storing data. Data storage is typically separated into several different levels, each level exhibiting a different data access time or data storage cost. A first, or highest, level of data storage involves electronic memory, usually dynamic or static random access memory (DRAM or SRAM). Electronic memories take the form of semiconductor integrated circuits where millions of bytes of data can be stored on each circuit, with access to such bytes of data measured in nanoseconds. The electronic memory provides the fastest access to data since access is entirely electronic.
A second level of data storage usually involves direct access storage devices (DASD). DASD storage, for example, includes magnetic and/or optical disks. Data bits are stored as micrometer-sized magnetically or optically altered spots on a disk surface, representing the “ones” and “zeros” that comprise the binary value of the data bits. Magnetic DASD includes one or more disks that are coated with remnant magnetic material. The disks are rotatably mounted within a protected environment. Each disk is divided into many concentric tracks, or closely spaced circles. The data is stored serially, bit by bit, along each track. An access mechanism, known as a head disk assembly (HDA), typically includes one or more read/write heads, and is provided in each DASD for moving across the tracks to transfer the data to and from the surface of the disks as the disks are rotated past the read/write heads. DASDs can store gigabytes of data, and the access to such data is typically measured in milliseconds (orders of magnitude slower than electronic memory). Access to data stored on DASD is slower than electronic memory due to the need to physically position the disk and HDA at the desired data storage location.
A third or lower level of data storage includes tapes, tape libraries, and optical disk libraries. Access to library data is much slower than electronic or DASD storage because a robot or human is necessary to select and load the needed data storage medium. An advantage of these storage systems is the reduced cost for very large data storage capabilities, on the order of terabytes of data. Tape storage is often used for backup purposes. That is, data stored at the higher levels of the data storage hierarchy is reproduced for safekeeping on magnetic tape. Access to data stored on tape and/or in a library is presently on the order of seconds.
Data storage, then, can be conducted using different types of storage, where each type exhibits a different data access time or data storage cost. Rather than using one storage type to the exclusion of others, many data storage systems include several different types of storage together, and enjoy the diverse benefits of the various storage types. For example, one popular arrangement employs an inexpensive medium such as tape to store the bulk of data, while using a fast-access storage such as DASD to cache the most frequently or recently used data.
During normal operations, synchronization between cache and tape is not critical. If a data object is used frequently, it is stored in cache and that copy is used exclusively to satisfy host read requests, regardless of whether the data also resides on tape. Synchronization can be problematic, however, if the cache and tape copies of a data object diverge over time and the data storage system suffers a disaster. In this case, the cache and tape contain different versions of the data object, one current and the other outdated, and it may be unclear which is which. At worst, a stale or “down-level” version of a data object may be mistaken for (and subsequently used as) the current version. Thus, in the event of a cache failure, data integrity may be questionable and there is some risk of the data storage system incorrectly satisfying future host read requests by recalling a stale version of the data.
SUMMARY OF THE INVENTION
Broadly, the present invention concerns a data storage system that employs base storage along with a high speed cache. Whenever a data object is stored in the cache or base storage, it is assigned (and optionally encapsulated with) an anywhere token. The anywhere token contains a code indicating the data object's version. Whenever the data object is stored in base storage, the data object is also assigned a base token with the same value as its current anywhere token. Thus, the base token likewise contains the data object's latest version code at the time the data object is written to base storage. However, the base token is frozen in time, because future cache-only updates of the data object change the anywhere token without affecting the base token. The anywhere and base tokens of each data object constitute its cascading tokens. These cascading tokens are available for use by the data storage system to support functions such as grooming the cache, de-staging data to base storage, and processing cache miss events. All tokens are stored in a token database.
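As a rough illustration of the cascading-token relationship described above, the following Python sketch models a per-object token record whose anywhere field advances on every update while its base field stays frozen until the object is next written to base storage. All names and types here are hypothetical; the patent does not specify a version-code format.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class TokenRecord:
        """Cascading tokens for one data object (hypothetical model)."""
        anywhere: int               # version code of the newest copy, wherever it resides
        base: Optional[int] = None  # version code of the copy in base storage, if any

    # A cache-only update bumps the anywhere token but leaves the base token frozen.
    rec = TokenRecord(anywhere=1, base=1)        # object written to both cache and base storage
    rec.anywhere += 1                            # object subsequently updated in cache only
    assert rec.anywhere == 2 and rec.base == 1   # base token is now "down-level"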
In more specific terms, the data storage system of this invention includes a controller, cache, base storage, and various organizational data such as a token database, cache directory, base-storage-written list, etc. For each data object, the token database is capable of listing an anywhere token and a base token. When a data object is received for storage, the controller assigns an anywhere token to the data object. The anywhere token contains the latest metadata for the data object, including at least a version code. Optionally, the controller may encapsulate the data object with the version code and some or all of the remaining metadata of the data object's anywhere token. The controller proceeds to store the data object in the cache, base storage, or both. The controller also stores the anywhere token in the token database, cross-referenced against the data object. Whenever the data object is written to base storage, the controller updates the token database by copying the anywhere token into the base token field for that data object. Contents of the token database are written out to base storage in pieces of suitable size, such as tokens of individual data objects, parts of the token database, or the entire token database as a whole.
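The write path just described might be sketched as follows, assuming a hypothetical Controller in which simple in-memory dictionaries stand in for the cache, base storage, and token database; the real system's interfaces, data formats, and version codes are not specified in the text, so the details are illustrative only.

    import itertools
    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass
    class TokenRecord:
        anywhere: int
        base: Optional[int] = None

    class Controller:
        """Hypothetical controller mirroring the write path in the summary."""

        def __init__(self) -> None:
            self._versions = itertools.count(1)              # monotonically increasing version codes
            self.token_db: Dict[str, TokenRecord] = {}       # anywhere/base tokens per data object
            self.cache: Dict[str, bytes] = {}                # stand-in for the high speed cache
            self.base: Dict[str, bytes] = {}                 # stand-in for base storage
            self.base_token_db: Dict[str, TokenRecord] = {}  # token-database excerpts kept in base storage

        def store(self, name: str, data: bytes, to_base: bool = False) -> None:
            # Assign a new anywhere token (latest version code) to the data object.
            version = next(self._versions)
            rec = self.token_db.get(name)
            if rec is None:
                rec = TokenRecord(anywhere=version)
                self.token_db[name] = rec
            else:
                rec.anywhere = version
            self.cache[name] = data
            if to_base:
                self.destage(name)

        def destage(self, name: str) -> None:
            # Writing the object to base storage copies its anywhere token into its base token.
            self.base[name] = self.cache[name]
            rec = self.token_db[name]
            rec.base = rec.anywhere
            # Token-database contents are also written out to base storage, here one token at a time.
            self.base_token_db[name] = TokenRecord(rec.anywhere, rec.base)

In this sketch, a later store() call without to_base=True advances only the anywhere token and leaves the base token at its old value, which is the "frozen in time" behavior noted above.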
If the storage system experiences a cache failure, normal storage operations are halted until the cache is repaired. Data objects lost from cache can be copied back into cache from base storage. The controller then builds a replacement token database. Namely, the controller accesses the token database excerpts in base storage to retrieve the base tokens of all data objects that were lost from cache but still exist in base storage. With these base tokens, the controller populates a replacement token database, using each base token as both the anywhere and base token of the corresponding data object lost from cache. The replacement token database is then used to the exclusion of the previous token database. This avoids any danger of unknowingly recalling down-level data objects from base storage whose newer counterparts had been stored in cache but were lost in the cache failure. Also, the cache may be repopulated with the lost data objects all at once, or as needed.
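The recovery step might look like the following sketch, again with hypothetical names: after a cache failure, the base tokens retrieved from the token-database excerpts in base storage serve as both the anywhere and base tokens of the replacement token database.

    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass
    class TokenRecord:
        anywhere: int
        base: Optional[int] = None

    def build_replacement_token_db(saved_tokens: Dict[str, TokenRecord]) -> Dict[str, TokenRecord]:
        """Rebuild the token database from the token excerpts saved in base storage.

        Each surviving base token becomes both the anywhere and the base token of
        the replacement database, so no newer version lost with the cache can be
        mistaken for current.
        """
        return {
            name: TokenRecord(anywhere=rec.base, base=rec.base)
            for name, rec in saved_tokens.items()
            if rec.base is not None   # only objects that still exist in base storage
        }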
