Electrical computers and digital processing systems: memory – Storage accessing and control – Control technique
Reexamination Certificate
2000-03-31
2003-12-02
Bragdon, Reginald G. (Department: 2188)
C711S161000, C707S793000
active
06658540
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to data consistency in data storage systems, and more specifically, to a method for pipelining a number of write commands between a sending site and a receiving site while providing command ordering during controller-based synchronous or asynchronous copy operations in a remote data replication system using a Storage Area Network.
BACKGROUND OF THE INVENTION AND PROBLEM
It is desirable to provide the ability for rapid recovery of user data from a disaster or significant error event at a data processing facility. This type of capability is often termed ‘disaster tolerance’. In a data storage environment, disaster tolerance requirements include providing for replicated data and redundant storage to support recovery after the event. In order to provide a safe physical distance between the original data and the data to be backed up, the data must be migrated from one storage subsystem or physical site to another subsystem or site. It is also desirable for user applications to continue to run while data replication proceeds in the background. Data warehousing, ‘continuous computing’, and Enterprise Applications all require remote copy capabilities.
Storage controllers are commonly utilized in computer systems to off-load from the host computer certain lower level processing functions relating to I/O operations, and to serve as an interface between the host computer and the physical storage media. Given the critical role played by the storage controller with respect to computer system I/O performance, it is desirable to minimize the potential for interrupted I/O service due to storage controller malfunction. Thus, prior workers in the art have developed various system design approaches in an attempt to achieve some degree of fault tolerance in the storage control function. One such prior approach requires that all system functions be “mirrored”. While this type of approach is most effective in reducing interruption of I/O operations and lends itself to value-added fault isolation techniques, it has previously been costly to implement and heretofore has placed a heavy processing burden on the host computer.
One prior method of providing storage system fault tolerance accomplishes failover through the use of two controllers coupled in an active/passive configuration. During failover, the passive controller takes over for the active (failing) controller. A drawback of this type of dual configuration is that it cannot support load balancing to increase overall system performance, because only one controller is active, and thus utilized, at any given time. Furthermore, the passive controller represents an inefficient use of system resources.
Another approach to storage controller fault tolerance is based on a process called ‘failover’. Failover is known in the art as a process by which a first storage controller, coupled to a second controller, assumes the responsibilities of the second controller when the second controller fails. ‘Failback’ is the reverse operation, wherein the second controller, having been either repaired or replaced, recovers control over its originally-attached storage devices. Since each controller is capable of accessing the storage devices attached to the other controller as a result of the failover, there is no need to store and maintain a duplicate copy of the data, i.e., one set stored on the first controller's attached devices and a second (redundant) copy on the second controller's devices.
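The failover and failback operations described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems; the class name `ControllerPair`, the controller labels "A" and "B", and the device names are all hypothetical, and the model only tracks which controller currently serves each storage device.

```python
class ControllerPair:
    """Hypothetical model of two coupled controllers and the devices each serves."""

    def __init__(self, devices_a, devices_b):
        # 'home' records the controller each device was originally attached to.
        self.home = {dev: "A" for dev in devices_a}
        self.home.update({dev: "B" for dev in devices_b})
        # 'owner' records the controller currently serving each device.
        self.owner = dict(self.home)
        self.failed = set()

    def failover(self, failed):
        """The surviving controller assumes the failed controller's devices."""
        survivor = "B" if failed == "A" else "A"
        if survivor in self.failed:
            raise RuntimeError("both controllers have failed")
        self.failed.add(failed)
        for dev, ctrl in self.owner.items():
            if ctrl == failed:
                self.owner[dev] = survivor

    def failback(self, restored):
        """A repaired or replaced controller recovers its original devices."""
        self.failed.discard(restored)
        for dev, home in self.home.items():
            if home == restored:
                self.owner[dev] = restored


pair = ControllerPair(["disk0", "disk1"], ["disk2"])
pair.failover("B")   # controller A now serves disk2 as well
pair.failback("B")   # controller B takes disk2 back
```

Because the surviving controller reaches the same physical devices, the sketch moves only ownership and never copies data, which mirrors the point made above about not needing a duplicate copy.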
U.S. Pat. No. 5,274,645 (Dec. 28, 1993), to Idleman et al. discloses a dual-active configuration of storage controllers capable of performing failover without the direct involvement of the host. However, the direction taken by Idleman requires a multi-level storage controller implementation. Each controller in the dual-redundant pair includes a two-level hierarchy of controllers. When the first level or host-interface controller of the first controller detects the failure of the second level or device interface controller of the second controller, it re-configures the data path such that the data is directed to the functioning second level controller of the second controller. In conjunction, a switching circuit re-configures the controller-device interconnections, thereby permitting the host to access the storage devices originally connected to the failed second level controller through the operating second level controller of the second controller. Thus, the presence of the first level controllers serves to isolate the host computer from the failover operation, but this isolation is obtained at added controller cost and complexity.
Other known failover techniques are based on proprietary buses. These techniques utilize existing host interconnect “hand-shaking” protocols, whereby the host and controller act in cooperative effort to effect a failover operation. Unfortunately, the “hooks” for this and other types of host-assisted failover mechanisms are not compatible with more recently developed, industry-standard interconnection protocols, such as SCSI, which were not developed with failover capability in mind. Consequently, support for dual-active failover in these proprietary bus techniques must be built into the host firmware via the host device drivers. Because SCSI, for example, is a popular industry standard interconnect, and there is a commercial need to support platforms not using proprietary buses, compatibility with industry standards such as SCSI is essential. Therefore, a vendor-unique device driver in the host is not a desirable option.
However, none of the above references discloses a disaster tolerant data storage system having a remote backup site connected to a host site via a dual fabric link, where the system replication and error recovery functions are controller-based. Furthermore, none of the above systems allows a number of write commands to be ‘pipelined’ (in transit and unacknowledged) between local and remote sites while ensuring the proper ordering of commands on remote media during synchronous or asynchronous operation. In addition, the prior technology fails to provide a mechanism for ‘tuning’ of links based on distance and performance requirements.
Therefore, there is a clearly felt need in the art for a disaster tolerant data replication system capable of optimally tunable inter-site performance, and which allows commands to be pipelined during operation, where the data replication functions are performed without the direct involvement of the host computer.
SOLUTION TO THE PROBLEM
Accordingly, the above problems are solved, and an advance in the field is accomplished by the system of the present invention which provides a completely redundant configuration including dual Fibre Channel fabric links interconnecting each of the components of two data storage sites, wherein each site comprises a host computer and associated data storage array, with redundant array controllers and adapters. The present system is unique in that each array controller is capable of performing all of the data replication functions including the handling of failover functions.
In the situation wherein an array controller fails during an asynchronous copy operation, the partner array controller uses a ‘micro log’ stored in mirrored cache memory to recover transactions which were ‘missed’ by the backup storage array when the array controller failure occurred. The present system provides rapid and accurate recovery of backup data at the remote site by sending all logged commands and data from the logging site over the link to the backup site in order, while avoiding the overhead of a full copy operation.
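As a rough illustration of replaying logged writes in order rather than performing a full copy, consider the following minimal sketch. It assumes, hypothetically, that each logged write carries a monotonically increasing sequence number and that the backup site knows the last sequence number it applied; the names `LogEntry` and `replay_log` are illustrative and do not describe the actual log format of the present system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    seq: int     # hypothetical sequence number assigned at the logging site
    block: int   # target block address on the backup media
    data: bytes  # payload of the logged write


def replay_log(log, remote_media, last_applied_seq):
    """Apply, in sequence order, only the writes the backup site missed.

    Returns the new last-applied sequence number and the number of
    entries replayed.
    """
    replayed = 0
    for entry in sorted(log, key=lambda e: e.seq):
        if entry.seq <= last_applied_seq:
            continue  # already present on the backup media
        remote_media[entry.block] = entry.data
        last_applied_seq = entry.seq
        replayed += 1
    return last_applied_seq, replayed


log = [
    LogEntry(1, 7, b"old"),
    LogEntry(2, 9, b"x"),
    LogEntry(3, 7, b"new"),
    LogEntry(4, 5, b"y"),
]
remote = {7: b"old", 9: b"x"}          # backup site applied entries 1 and 2
last_seq, count = replay_log(log, remote, last_applied_seq=2)
```

Replaying in sequence order matters when the same block is written more than once: the later write must land last so that the backup media ends in the same state as the original.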
An important aspect of the present invention is the concept of a ‘look-ahead limit’, which, as implemented, allows a large number of ‘outstanding’ commands to be concurrently in transit between sites at any given time, while guaranteeing in-order operation of the data replication function. In addition, the present system automatically calculates an average transit/response time for each specific link, which is
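The general idea of bounding the number of outstanding commands while preserving order can be sketched as follows. This is a generic windowed-pipeline illustration, not the mechanism of the present system: the classes `PipelinedSender` and `OrderedReceiver`, the use of per-command sequence numbers, and the receiver-side holding buffer are all assumptions made for the example.

```python
from collections import deque


class PipelinedSender:
    """Keeps at most `look_ahead_limit` commands sent but unacknowledged."""

    def __init__(self, look_ahead_limit):
        self.limit = look_ahead_limit
        self.next_seq = 0
        self.outstanding = set()  # sequence numbers in transit
        self.pending = deque()    # commands waiting for a free slot

    def submit(self, payload):
        self.pending.append(payload)
        return self._drain()

    def acknowledge(self, seq):
        self.outstanding.discard(seq)
        return self._drain()

    def _drain(self):
        # Send queued commands while the window has room.
        sent = []
        while self.pending and len(self.outstanding) < self.limit:
            seq = self.next_seq
            self.next_seq += 1
            self.outstanding.add(seq)
            sent.append((seq, self.pending.popleft()))
        return sent


class OrderedReceiver:
    """Applies commands strictly in sequence order, holding early arrivals."""

    def __init__(self):
        self.expected = 0
        self.held = {}
        self.applied = []

    def receive(self, seq, payload):
        self.held[seq] = payload
        while self.expected in self.held:
            self.applied.append(self.held.pop(self.expected))
            self.expected += 1


sender = PipelinedSender(look_ahead_limit=2)
receiver = OrderedReceiver()
first = sender.submit("write-a")    # sent immediately as sequence 0
second = sender.submit("write-b")   # sent immediately as sequence 1
third = sender.submit("write-c")    # held back: the window is full
receiver.receive(*second[0])        # arrives early, so it is held
receiver.receive(*first[0])         # releases both writes in order
released = sender.acknowledge(0)    # frees a slot, so write-c is sent
```

A larger limit lets more commands overlap on a long link, at the cost of more state to track at both ends.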
Elkington Susan G.
Lary Richard F.
Sicola Stephen J.
Walker Michael D.
Bragdon Reginald G.
Hewlett-Packard Development Company, L.P.
Inoa Midys