Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-09-21
2004-11-23
Amsbury, Wayne (Department: 2171)
C707S793000
active
06823349
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to computer storage systems, and more particularly to remote mirroring in distributed computer storage systems.
2. Description of the Background
In a common computer system architecture, a host computer is coupled to a network that includes storage devices which provide non-volatile storage for the host computer. This is typically known as a computer storage system. The computer storage system includes, among other things, a number of interconnected storage units, each of which includes a number of physical or logical storage media (for example, a disk array). For convenience, a group of one or more physical disks that are logically connected to form a single virtual disk is referred to hereinafter as a “Logical Unit” (LU). Data from the host computer is stored in the computer storage system, and specifically in the various storage units within the computer storage system.
One problem in a computer storage system is data loss or unavailability caused, for example, by maintenance, repair, or outright failure of one or more storage units. In order to prevent such data loss or unavailability, a copy of the host data is often stored in multiple storage units that are operated at physically separate locations. For convenience, the practice of storing multiple copies of the host data in physically separate storage units is referred to as “remote mirroring.” Remote mirroring permits the host data to be readily retrieved from one of the storage units when the host data at another storage unit is unavailable or destroyed.
Therefore, in order to reduce the possibility of data loss or unavailability in a computer storage system, a “remote mirror” (or simply a “mirror”) is established to manage multiple images. Each image consists of one or more LUs, which are referred to hereinafter collectively as an “LU Array Set.” It should be noted that the computer storage system may maintain multiple mirrors simultaneously, where each mirror manages a different set of images.
Within a particular mirror, one image on one storage system is designated as a primary image, while each other image on one storage system within the mirror is designated as a secondary image. For convenience, the storage unit that maintains the primary image is referred to hereinafter as the “primary storage unit,” while a storage unit that maintains a secondary image is referred to hereinafter as a “secondary storage unit.” It should be noted that a storage unit that supports multiple mirrors may operate as the primary storage unit for one mirror and the secondary storage unit for another mirror.
A mirror must provide data availability such that the host data can be readily retrieved from one of the secondary storage units when the host data at the primary storage unit is unavailable or destroyed. In order to do so, it is imperative that all of the secondary images be synchronized with the primary image such that all of the secondary images contain the same information as the primary image. Synchronization of the secondary images is coordinated by the primary storage unit.
Under normal operating conditions, the host, i.e., a server running an operating system and an assortment of programs, writes host data to the primary storage unit. The primary storage unit stores the host data in the primary image and also coordinates all data storage operations for writing a copy of the host data to each secondary storage unit in the mirror and verifying that each secondary storage unit receives and stores the host data in its secondary image.
Today, data storage operations for writing the copy of the host data to each secondary storage unit in the mirror can be handled in either a synchronous manner or an asynchronous manner. In conventional synchronous remote mirroring, the primary storage unit ensures that the host data has been successfully written to all secondary storage units in the mirror before sending an acknowledgment to the host, which results in relatively high latency, but ensures that all secondary storage units are updated before informing the host that the write operation is complete. In asynchronous remote mirroring, the primary storage unit sends an acknowledgment message to the host before ensuring that the host data has been successfully written to all secondary storage units in the mirror, which results in relatively low latency, but does not ensure that all secondary storage units are updated before informing the host that the write operation is complete.
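The difference between the two modes comes down to when the acknowledgment is returned relative to the secondary writes. The following Python fragment is a minimal sketch of that ordering only; the class names, the `host_write` and `drain` methods, and the in-memory dictionaries standing in for images are all hypothetical and are not taken from the patent.

```python
from collections import deque


class SecondaryUnit:
    """Hypothetical stand-in for a secondary storage unit holding one image."""

    def __init__(self):
        self.image = {}

    def write(self, block, data):
        self.image[block] = data


class PrimaryUnit:
    """Hypothetical primary storage unit that coordinates writes to the mirror."""

    def __init__(self, secondaries, synchronous):
        self.image = {}
        self.secondaries = secondaries
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet propagated (asynchronous mode only)

    def host_write(self, block, data):
        self.image[block] = data
        if self.synchronous:
            # Synchronous: update every secondary before acknowledging the host.
            for unit in self.secondaries:
                unit.write(block, data)
        else:
            # Asynchronous: queue the write and acknowledge immediately.
            self.pending.append((block, data))
        return "ack"

    def drain(self):
        """Propagate queued writes to the secondaries (asynchronous mode)."""
        while self.pending:
            block, data = self.pending.popleft()
            for unit in self.secondaries:
                unit.write(block, data)
```

In the synchronous case the secondary already holds the data when `host_write` returns; in the asynchronous case it holds the data only after `drain` has run, which is the window in which the host may believe a write is complete while a secondary image is still stale.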
In both synchronous and asynchronous remote mirroring, it is possible for a number of failures to occur between receiving a write request from the host and updating the primary image and all of the secondary images. One such failure may involve writing to the primary storage unit, but being unable to write to the secondary storage unit due to an actual hardware or software failure between the primary storage unit and the secondary storage unit. Another possible cause of an inability to write is a failure of the secondary storage unit. If the primary storage unit was in the process of completing one or more write operations at the time of the failure, the primary storage unit may have updated the primary image, but may not have updated any secondary image.
After the failure, it may not be possible for the primary storage unit to determine the status of each secondary image, and specifically whether a particular secondary image matches the primary image. Therefore, the primary storage unit will resynchronize all of the secondary images by copying the primary image block-by-block to each of the secondary storage units.
Unfortunately, copying the entire primary image to all the secondary storage units can take a significant amount of time depending on the image size, the number of secondary storage units, and other factors. It is not uncommon for such a resynchronization to take hours to complete, especially for very large images.
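To make the cost of this conventional approach concrete, the sketch below models a full block-by-block resynchronization. It assumes, purely for illustration, that each image is an equal-length Python list of blocks; the function name `full_resync` is hypothetical.

```python
def full_resync(primary_image, secondary_images):
    """Copy every block of the primary image to each secondary image.

    Images are modeled as equal-length lists of blocks (an assumption of
    this sketch). Returns the number of block copies performed.
    """
    copies = 0
    for secondary in secondary_images:
        for index, block in enumerate(primary_image):
            # Every block is copied, whether or not it actually differs.
            secondary[index] = block
            copies += 1
    return copies
```

The number of copies is always the image size multiplied by the number of secondary images, regardless of how few blocks actually changed during the failure.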
Thus, there is a need for a system and method for quickly resynchronizing primary and secondary images following a failure.
SUMMARY OF THE INVENTION
In one aspect there is provided a method for synchronizing a plurality of data images in a computer system. The plurality of data images includes a primary image and at least one secondary image. In accordance with the method, a write request is received from a host computer at a primary image site. A write operation is conducted on the primary image at the primary image site, and attempted on at least one secondary image at at least one secondary image site. If the attempt to write to the at least one secondary image at the at least one secondary image site fails, a fracture log is created at the primary image site, which is representative of changed regions in the primary image at the primary image site, whereby the log can be used to synchronize the primary image and the secondary image once it becomes possible to write to the at least one secondary image.
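The method just summarized can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the class names, the `REGION_SIZE` granularity, the use of `ConnectionError` to signal a failed secondary write, and the representation of the log as one flag per region are all hypothetical choices made for this example.

```python
REGION_SIZE = 4  # blocks per tracked region; a hypothetical granularity


class Secondary:
    """Hypothetical secondary image site that can become unreachable."""

    def __init__(self, num_blocks):
        self.image = [None] * num_blocks
        self.reachable = True

    def write(self, block, data):
        if not self.reachable:
            raise ConnectionError("secondary image site unreachable")
        self.image[block] = data


class Primary:
    """Hypothetical primary image site that keeps a fracture log on failure."""

    def __init__(self, num_blocks, secondary):
        self.image = [None] * num_blocks
        self.secondary = secondary
        self.fracture_log = None  # created only after a failed secondary write

    def host_write(self, block, data):
        self.image[block] = data
        if self.fracture_log is None:
            try:
                self.secondary.write(block, data)
                return
            except ConnectionError:
                # First failure: create the log, one flag per region.
                regions = -(-len(self.image) // REGION_SIZE)  # ceiling division
                self.fracture_log = [False] * regions
        # While fractured, only record which region of the primary changed.
        self.fracture_log[block // REGION_SIZE] = True

    def resync(self):
        """Copy only the regions marked in the fracture log; return blocks copied."""
        if self.fracture_log is None:
            return 0
        copied = 0
        for region, dirty in enumerate(self.fracture_log):
            if not dirty:
                continue
            start = region * REGION_SIZE
            for block in range(start, min(start + REGION_SIZE, len(self.image))):
                self.secondary.write(block, self.image[block])
                copied += 1
        self.fracture_log = None  # images match again; the log is discarded
        return copied
```

With a 16-block image and two writes landing in the same region while the secondary is unreachable, `resync` copies only the 4 blocks of that region rather than all 16, which is the saving the fracture log is meant to provide over a full copy.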
In a more specific aspect, the fracture log, which is maintained only in the event of a failure, is a bitmap of the changed regions that have been affected on at least one LU as a result of the write request. In a yet still more specific aspect, the primary image at the primary image site is updated at the same time that the at least one secondary image is updated at the at least one secondary image site in response to the write request. After the updates are made, specifically in the case of synchronous mirrors, the primary image site communicates to the host that the update to both sites is complete. Yet more specifically, if the write request to the at least one secondary image site fails, the fracture log, which is representative of changed regions in the image at the primary image site, is created at the primary image site and is used to effect writing to the at least one secondary image at the at least one secondary image site when it becomes possible to write to the at least one secondary image, thereby
Hayman Kenneth John
Hotle William Paul
Norris Christopher Adam
Taylor Alan Lee
Amsbury Wayne
Cortina A. Jose
Daniels Daniels & Verdonik, P.A.
EMC Corporation
Nguyen Cindy