Client-server computer system with application recovery of...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C709S241000, C709S241000, C711S161000, C714S016000

active

06182086

ABSTRACT:

TECHNICAL FIELD
This invention relates to client-server computer systems. More particularly, this invention relates to methods for recovering from system crashes in a manner that ensures that the applications running on the clients and servers persist across the crash.
BACKGROUND
Computer systems occasionally crash. A “system crash” is an event in which the computer quits operating the way it is supposed to operate. Common causes of system crashes include power outage, application operating error, and computer goblins (i.e., unknown and often unexplained malfunctions that tend to plague even the best-devised systems and applications). System crashes are unpredictable, and hence, essentially impossible to anticipate and prevent.
A system crash is at the very least annoying, and may result in serious or irreparable damage. For standalone computers or client workstations, a local system crash typically results in loss of work product since the last save interval. The user is inconvenienced by having to reboot the computer and redo the lost work. For servers and larger computer systems, a system crash can have a devastating impact on many users, including both a company's employees and its customers.
Being unable to prevent system crashes, computer system designers attempt to limit the effect of system crashes. The field of study concerning how computers recover from system crashes is known as “recovery.” Recovery from system crashes has been the subject of much research and development.
In general, the goal of redo recovery is to return the computer system after a crash to a previous and presumed correct state in which the computer system was operating immediately prior to the crash. Then, transactions whose continuations are impossible can be aborted. Much of the recovery research focuses on database recovery for database computer systems, such as network database servers or mainframe database systems. Imagine the problems caused when a large database system having many clients crashes in the midst of many simultaneous operations involving the retrieval, update, and storage of data records. Database system designers attempt to design database recovery techniques that minimize the amount of data lost in a system crash, minimize the amount of work needed following the crash to recover to the pre-crash operating state, and minimize the performance impact of recovery on the database system during normal operation.
FIG. 1 shows a database computer system 20 having a computing unit 22 with processing and computational capabilities 24 and a volatile main memory 26. The volatile main memory 26 is not persistent across crashes and hence is presumed to lose all of its data in the event of a crash. The computer system also has a non-volatile or stable database 28 and a stable log 30, both of which are contained on stable memory devices, e.g. magnetic disks, tapes, etc., connected to the computing unit 22. The stable database 28 and log 30 are presumed to persist across a system crash. The persistent database 28 and log 30 can be combined in the same storage, although they are illustrated separately for discussion purposes.
The volatile memory 26 stores one or more applications 32 and a resource manager 34. The resource manager 34 includes a volatile cache 36, which temporarily stores data destined for the stable database 28. The data is typically stored in the stable database and volatile cache in individual units, such as “pages.” A cache manager 38 executes on the processor 24 to manage movement of data pages between the volatile cache 36 and the stable database 28. In particular, the cache manager 38 is responsible for deciding which data pages should be moved to the stable database 28 and when the data pages are moved. Data pages that are moved from the cache to the stable database are said to be “flushed” to the stable state. In other words, the cache manager 38 periodically flushes the cached state of a data page to the stable database 28 to produce a stable state of that data page which persists in the event of a crash, making recovery possible.
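The flushing behavior described above can be illustrated with a minimal sketch. It is not the patented implementation: the class name `CacheManager`, its methods, and the use of plain dictionaries to stand in for the volatile cache 36 and the stable database 28 are all assumptions made for illustration.

```python
class CacheManager:
    """Toy cache manager (hypothetical names, not from the patent)."""

    def __init__(self, stable_db):
        self.stable_db = stable_db  # dict standing in for the stable database 28
        self.cache = {}             # dict standing in for the volatile cache 36
        self.dirty = set()          # pages changed in cache but not yet flushed

    def read(self, page_id):
        # Bring the page into the cache on demand from the stable database.
        if page_id not in self.cache:
            self.cache[page_id] = self.stable_db.get(page_id)
        return self.cache[page_id]

    def write(self, page_id, value):
        # Updates touch only the volatile cache until the page is flushed.
        self.cache[page_id] = value
        self.dirty.add(page_id)

    def flush(self, page_id):
        # Copy the cached state of the page to the stable database.
        if page_id in self.dirty:
            self.stable_db[page_id] = self.cache[page_id]
            self.dirty.discard(page_id)

    def crash(self):
        # A crash loses everything volatile; the stable database survives.
        self.cache.clear()
        self.dirty.clear()
```

In this sketch, a page written but never flushed is lost after `crash()`, while a flushed page keeps its last flushed value in `stable_db`.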
The resource manager 34 also has a volatile log 40 that temporarily stores log records for operations, which are to be moved into the stable log 30. A log manager 42 executes on the processor 24 to manage when the operations are moved from the volatile log 40 to the stable log 30. The transfer of an operation from the volatile log to the stable log is known as a log flush.
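A log flush can be sketched in a few lines. This is an illustrative toy under stated assumptions: the class `LogManager`, its method names, and the choice to flush the whole volatile log at once are hypothetical and are not taken from the disclosure.

```python
class LogManager:
    """Toy log manager (hypothetical names, not from the patent)."""

    def __init__(self):
        self.volatile_log = []  # stands in for the volatile log 40
        self.stable_log = []    # stands in for the stable log 30

    def append(self, record):
        # New log records are buffered in volatile memory first.
        self.volatile_log.append(record)

    def flush(self):
        # Log flush: move buffered records to the stable log, preserving order.
        moved = len(self.volatile_log)
        self.stable_log.extend(self.volatile_log)
        self.volatile_log.clear()
        return moved

    def crash(self):
        # Records that were never flushed vanish with volatile memory.
        self.volatile_log.clear()
```

Only records that reached `stable_log` before `crash()` remain available afterward.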
During normal operation, an application 32 executes on the processor 24. The resource manager receives requests to perform operations on data from the application. As a result, data pages are transferred to the volatile cache 36 on demand from the stable database 28 for use by the application. During execution, the resource manager 34 reads, processes, and writes data to and from the volatile cache 36 on behalf of the application. The cache manager 38 determines, independently of the application, when the cached data state is flushed to the stable database 28.
Concurrently, the operations being performed by the resource manager on behalf of the application are being recorded in the volatile log 40. The log manager 42 determines, as guided by the cache manager and the transactional requirements imposed by the application, when the operations are posted as log records on the stable log 30. A logged operation is said to be “installed” when the versions of the pages containing the changes made by the operation have been flushed to the stable database.
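One way to test whether a logged operation is installed is sketched below. The bookkeeping is an assumption and does not come from this passage: each operation is given a log sequence number (LSN), and each page records the LSN of the latest operation reflected in its flushed version. The names `LoggedOp`, `flushed_lsn`, and `is_installed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedOp:
    lsn: int          # assumed log sequence number of the operation
    pages: frozenset  # ids of the pages the operation changed


def is_installed(op, flushed_lsn):
    """Return True if every page changed by `op` has been flushed with its changes.

    `flushed_lsn` maps a page id to the LSN of the latest operation reflected
    in that page's stable version (0 if the page was never flushed).
    """
    return all(flushed_lsn.get(page, 0) >= op.lsn for page in op.pages)
```

Under this scheme an operation that changed two pages is not installed until both pages have been flushed with versions that include its changes.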
When a crash occurs, the application state (i.e., address space) of any executing application 32, the data pages in volatile cache 36, and the operations in volatile log 40 all vanish. The computer system 20 invokes a recovery manager. It begins at the last flushed state on the stable database 28 and replays the operations posted to the stable log 30 to restore the database of the computer system to the state as of the last stably logged operation just prior to the crash.
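The replay step can be sketched as follows. This is a simplified illustration, not the recovery manager of the disclosure: it assumes each stable log record is a hypothetical `(page_id, value)` pair that overwrites a whole page, so replaying a record whose effect was already flushed is harmless.

```python
def recover(stable_db, stable_log):
    """Rebuild the database state after a crash (illustrative sketch).

    Starts from the last flushed state in `stable_db` and replays every
    record in `stable_log` in order. Records are assumed to be
    (page_id, value) pairs that overwrite the page, so redo is idempotent.
    """
    state = dict(stable_db)  # begin at the last flushed state
    for page_id, value in stable_log:
        state[page_id] = value  # redo the logged operation
    return state
```

For example, `recover({"A": 1}, [("A", 2), ("B", 3)])` yields a state in which page `A` holds 2 and page `B` holds 3, matching the last stably logged operations.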
While database recovery techniques are helpful for recovering data, the database techniques offer no help in recovering an application from a system crash. Usually all active applications using the database are wiped out during a crash. Any state in an executing application is erased and cannot usually be continued across a crash.
There has been some work in designing recovery procedures that preserve applications across a system crash. One preferred approach is an application recovery system developed by David Lomet, an inventor in this invention. The application recovery system is described in a series of patent applications:
1. U.S. Ser. No. 08/814,808, entitled “Database Computer System With Application Recovery”, filed Mar. 10, 1997;
2. U.S. Ser. No. 08/813,982, entitled “Database Computer System With Application Recovery And Dependency Handling Read Cache”, filed Mar. 10, 1997;
3. U.S. Ser. No. 08/832,870, entitled “Database Computer System With Application Recovery And Dependency Handling Write Cache”, filed Apr. 4, 1997; and
4. U.S. Ser. No. 08/826,610, entitled “Database Computer System With Application Recovery And Recovery Log Sequence Numbers To Optimize Recovery”, filed Apr. 4, 1997.
All of these patent applications are assigned to Microsoft Corporation and are incorporated by reference. These applications are collectively referred to as the “Lomet applications” throughout this disclosure.
Another approach is to make the application “stateless.” Between transactions, the application is in its initial state or a state internally derived from the initial state without reference to the persistent state of the database or to other input. If the application fails between transactions, there is nothing about the application state that cannot be re-created based on the static state of the stored form of the application. Should the transaction abort, the application is replayed, thereby re-executing the transaction as if the transaction executed somewha
