Reexamination Certificate
2000-09-29
2004-02-17
Baderman, Scott (Department: 2184)
Error detection/correction and fault detection/recovery
Data processing system error or fault handling
Reliability and availability
C714S006130, C714S015000, C707S793000
active
06694447
ABSTRACT:
FIELD OF THE INVENTION
The invention relates generally to the field of remote storage replication. More particularly, the invention relates to a method and apparatus for increasing application availability during a disaster fail-back.
BACKGROUND OF THE INVENTION
Application downtime following a server failure is a growing problem, driven by the widespread use of computerized applications and the ever-expanding electronic commerce-driven economy. To increase data availability and reduce application downtime, customers typically build a disaster recovery site that can take over during a disaster or server failure. A significant amount of time and planning goes into ensuring that, following a disaster, fail-over occurs as rapidly as possible. To this end, many vendors provide methods to reduce this downtime.
Remote storage replication is a technique used primarily for disaster protection. Vendors optimize their replication processes to expedite fail-over from a primary site to a secondary, or disaster recovery, site. A problem that is examined less frequently, however, is the time needed to return operations to the primary site once the problems that caused the fail-over have been resolved.
Customer Service Level Agreements (SLAs) define the level of service provided to customers and typically state the amount of downtime acceptable when a disaster strikes. However, customer SLAs rarely specify requirements for returning operations to the original site once the problem has been resolved. With this in mind, most companies offer solutions that minimize the fail-over process but pay less attention to the requirements of restoring the state of the primary site (fail-back). In addition, conventional fail-back processes are rarely, if ever, tested under real, live conditions. During customer testing, the fail-over process is likely to be well documented. Scripts are generally written and seem to work as efficiently as possible. However, when fail-over testing concludes, operations simply continue on the primary site, with no need to restore the data that was replicated to the secondary site, so the fail-back path goes unexercised.
Conventional fail-back processes are lengthy and often require repopulating the entire data set from the secondary site back to the primary site. These processes, if ever required, can be time-consuming and involve a fair amount of application unavailability, or downtime. Because they often require complete resynchronization of the data to the primary server, they can take a very long time depending on the size of the data, adding to the application downtime.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus for increasing availability of an application during fail-back from a secondary site to a primary site following a failure at the primary site. The method includes copying data from active storage volumes to secondary storage volumes on the secondary site while the application runs on the secondary site and updates the active storage volumes. Once the secondary storage volumes of the secondary site are updated, the data is resynchronized from the secondary storage volumes on the secondary site to the primary storage volumes of the primary site. The steps of copying the data and resynchronizing the data are repeated for data updated by the application during the resynchronization, until the time required to complete the resynchronization step for the updated data is within an acceptable downtime for the application. Once this condition is met, the application is failed back to the primary site by bringing up the application at the primary site. Application availability is therefore increased by limiting the application downtime to the acceptable downtime.
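The repeat-until-small-enough loop described in the summary can be illustrated with a minimal sketch. This is not the patented implementation; it is a simplified model under stated assumptions: the function name `plan_fail_back`, the `FailBackPlan` type, and all parameters are hypothetical, the copy and resynchronization steps are lumped into a single pass at one constant rate, and the application is assumed to update data at a constant rate.

```python
from dataclasses import dataclass


@dataclass
class FailBackPlan:
    passes: int              # copy + resync passes run while the application stays up
    final_downtime_s: float  # estimated duration of the last resync, run with the application stopped


def plan_fail_back(initial_backlog_mb, write_rate_mb_s, resync_rate_mb_s,
                   acceptable_downtime_s, max_passes=100):
    """Estimate how many passes are needed before fail-back (hypothetical model).

    Assumes constant rates and treats copy + resync as one pass.
    """
    if resync_rate_mb_s <= write_rate_mb_s:
        # In this model the backlog only shrinks if resync outpaces new updates.
        raise ValueError("resync rate must exceed the application's update rate")

    backlog_mb = initial_backlog_mb
    passes = 0
    # Repeat until the remaining resync fits within the acceptable downtime.
    while backlog_mb / resync_rate_mb_s > acceptable_downtime_s:
        if passes >= max_passes:
            raise RuntimeError("backlog did not converge within max_passes")
        pass_duration_s = backlog_mb / resync_rate_mb_s
        # While this pass runs, the application on the secondary site keeps
        # updating the active volumes; those updates form the next backlog.
        backlog_mb = write_rate_mb_s * pass_duration_s
        passes += 1

    return FailBackPlan(passes=passes,
                        final_downtime_s=backlog_mb / resync_rate_mb_s)


# Illustrative numbers only: 100,000 MB to resync, 10 MB/s of updates,
# 100 MB/s resync throughput, 60 s of acceptable downtime.
plan = plan_fail_back(100000, 10, 100, 60)
print(plan.passes, plan.final_downtime_s)  # 2 10.0
```

With these illustrative figures, the first pass takes 1000 s and leaves 10,000 MB of new updates, the second takes 100 s and leaves 1,000 MB, and the final resync of that remainder takes about 10 s, which is within the 60 s limit. In this model the application is stopped only for that last pass, rather than for a full resynchronization.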
REFERENCES:
patent: 5555371 (1996-09-01), Duyanovich et al.
patent: 5592618 (1997-01-01), Micka et al.
patent: 6078932 (2000-06-01), Haye et al.
patent: 6212531 (2001-04-01), Blea et al.
patent: 6223304 (2001-04-01), Kling et al.
patent: 6292905 (2001-09-01), Wallach et al.
patent: 6363497 (2002-03-01), Chrabaszcz
patent: 6408310 (2002-06-01), Hart
patent: 6516394 (2003-02-01), Don et al.
patent: 6549921 (2003-04-01), Ofek
Crane Philip J.
Leach Judith G.
Baderman Scott
Blakely, Sokoloff, Taylor & Zafman
Lohn Joshua
Sun Microsystems Inc.