System and method for data protection with multidimensional...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C711S114000

active

06826711

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates, in general, to network data storage, and, more particularly, to software, systems and methods for high availability, high reliability data storage using parity data protection having an arbitrary dimensionality.
2. Relevant Background
Economic, political, and social power are increasingly managed by data. Transactions and wealth are represented by data. Political power is analyzed and modified based on data. Human interactions and relationships are defined by data exchanges. Hence, the efficient distribution, storage, and management of data is expected to play an increasingly vital role in human society.
The quantity of data that must be managed, in the form of computer programs, databases, files, and the like, increases exponentially. As computer processing power increases, operating system and application software becomes larger. Moreover, the desire to access larger data sets such as those comprising multimedia files and large databases further increases the quantity of data that is managed. This increasingly large data load must be transported between computing devices and stored in an accessible fashion. The exponential growth rate of data is expected to outpace improvements in communication bandwidth and storage capacity, making the need to handle data management tasks using conventional methods even more urgent.
High reliability and high availability are increasingly important characteristics of data storage systems as data users become increasingly intolerant of lost, damaged, and unavailable data. Data storage mechanisms, ranging from volatile random access memory (RAM) and non-volatile RAM to magnetic hard disk and tape storage, are subject to component failure. Moreover, the communication systems that link users to the storage mechanisms are subject to failure, making the data stored behind the systems temporarily or permanently unavailable. Varying levels of reliability and availability are achieved by techniques generally referred to as “parity”.
Parity storage, as used herein, refers to a variety of techniques that are utilized to store redundant information, error correcting code (ECC), and/or actual parity information (collectively referred to as “parity information”) in addition to primary data (i.e., the data set to be protected). The parity information is used to access or reconstruct primary data when the storage devices in which the primary data is held fail or become unavailable.
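As an illustration of how parity information can reconstruct primary data, the following minimal sketch uses byte-wise XOR parity across equal-length blocks. XOR is only one of the techniques the paragraph above covers, and the function names `xor_parity` and `reconstruct` are hypothetical, chosen for this example rather than taken from the patent.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    parity = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity.

    XOR-ing the parity with every surviving block cancels their
    contributions and leaves only the missing block.
    """
    return xor_parity(list(surviving_blocks) + [parity])


# Three primary data blocks, each imagined to live on a separate device.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)

# Suppose the device holding data[1] fails; recover it from the rest.
recovered = reconstruct([data[0], data[2]], parity)
```

A single XOR parity block tolerates the loss of exactly one block in the group; losing two blocks at once leaves the missing data unrecoverable under this scheme.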
Parity may be implemented within single storage devices, such as a hard disk, to allow recovery of data in the event a portion of the device fails. For example, when a sector of a hard disk fails, parity enables the information stored in the failed sector to be recreated and stored at a non-failed sector. Some RAM implementations use ECC to correct memory contents as they are written and read from memory.
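The text does not say which error correcting code such RAM implementations use. As a hedged illustration only, the sketch below uses the textbook Hamming(7,4) code, which stores three check bits alongside four data bits and can correct any single flipped bit on read. The function names `hamming_encode` and `hamming_decode` are hypothetical.

```python
def hamming_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Codeword positions 1..7 hold: p1, p2, d1, p4, d2, d3, d4.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Write a word, flip one stored bit to simulate a fault, then read it back.
word = [1, 0, 1, 1]
stored = hamming_encode(word)
stored[4] ^= 1
read_back = hamming_decode(stored)
```

Production memory ECC typically uses wider codes that also detect double-bit errors; this small code is chosen only because it fits in a few lines.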
Redundant array of independent disks (RAID) technology has developed in recent years as a means for improving storage reliability and availability. The concept, as initially conceived, contemplated the clustering of small inexpensive hard disks into an array such that the array would appear to the system as a single large disk. Simple arrays, however, actually reduced the reliability of the system to that of the weakest member. In response, a variety of methods (i.e., RAID technology) for storing data throughout the array in manners that provided redundancy and/or parity were developed to provide varying levels of data protection.
Conventional RAID (redundant array of independent disks) systems provide a way to store the same data in different places (thus, redundantly) on multiple storage devices such as hard disk drives. By placing data on multiple disks, input/output (I/O) operations can overlap in a balanced way, distributing the load across disks in the array and thereby improving performance. Since using multiple disks in this manner increases the mean time between failures (MTBF) for the system as a whole with respect to data availability, storing data redundantly also increases fault tolerance. A RAID system relies on a hardware or software controller to hide the complexities of the actual data management so that RAID systems appear to an operating system to be a single logical volume. However, RAID systems are difficult to scale because of physical limitations on the cabling and controllers. Also, RAID systems are highly dependent on the controllers so that when a controller fails, the data stored behind the controller becomes unavailable. Moreover, RAID systems require specialized, rather than commodity, hardware, and so tend to be expensive solutions.
RAID solutions are also relatively expensive to maintain, as well as difficult and time-consuming to properly configure. RAID systems are designed to enable recreation of data on a failed disk or controller, but the failed disk must be replaced to restore high availability and high reliability functionality. Until replacement occurs, the system is vulnerable to additional device failures. The condition of the system hardware must be continually monitored and maintenance performed as needed to maintain functionality. Hence, RAID systems must be physically situated so that they are accessible to trained technicians who can perform required maintenance. Not only are the man-hours required to configure and maintain a RAID system expensive, but since most data losses are due to human error, the requirement for continual human monitoring and intervention decreases the overall reliability of such a system. This limitation also makes it difficult to set up a RAID system at a remote location or in a foreign country where suitable technicians would have to be found and/or transported to the locale in which the RAID equipment is installed to perform maintenance functions.
RAID systems (levels 0-5) cannot be expanded in minimal increments (e.g., adding a single storage element) while the system is in operation. The addition of a storage element requires that the entire system be brought down, parity recalculated, and then data restored. Hence, expanding the capacity addressed by RAID systems may result in data unavailability for indefinite amounts of time.
Moreover, RAID systems cannot scope levels of parity protection differently for arbitrarily small subsets of data within the overall data set protected. A RAID controller is configured to provide one type of parity protection at a time on a fixed, known set of storage devices. However, different types of data have very different and highly varied protection requirements. Mission critical data may need an extremely high level of protection, whereas data such as program files and seldom used documents may need little or no protection at all. Currently, users must either implement multiple systems to provide varying levels of protection to different types of data, or compromise their data protection needs by either paying too much to protect non-critical data, or by providing less than desired protection for critical data.
Current RAID systems do not provide a practical method by which parity data can be used not only to reconstruct primary data but also to serve data requests in lieu of or in addition to serving those requests directly from the primary data itself. With the exception of mirrored data protection systems, parity information is generally used in the event of a catastrophe to serve requests for lost data only while the primary data is being reconstructed from this parity information. After reconstruction of the primary data, data is once again served from the reconstructed primary only, not the parity information. This increases the effective overhead cost of parity data, as parity information is only passively stored by the storage system rather than actively being used to improve performance during normal operation.
NAS (network-attached storage) refers to hard disk storage that is set up with its own network address rather than being attached to an application server.
