Reexamination Certificate
1999-02-08
2002-03-05
Powell, Mark R. (Department: 2762)
Data processing: software development, installation, and managem
Software program development tool
Translation of code
active
06353924
ABSTRACT:
BACKGROUND OF THE INVENTION
Computers are known to terminate abnormally, or crash, during program execution for many reasons, including accessing invalid memory locations, going into an infinite loop, running out of memory, accessing an invalid device, and so on. Although modern software engineering methodologies attempt to minimize the possibility of crashes, they have not been able to eliminate them.
When a computer runs an important aspect of a business, it is critical that the system be able to recover from a crash as quickly as possible, and that the cause of the crash be identified and fixed to prevent further crash occurrences, and even more importantly, to prevent the problem that caused the crash from causing other damage such as data corruption.
The first step in fixing the problem that causes a crash is to find it. Finding the problem when a computer crashes in production is particularly difficult because the computer provides little information on the events leading to the crash. In modern mainframe computer environments, for example, tools exist that provide information about (1) the last instruction which executed when the computer crashed, and (2) data stored in registers and memory at the instant the crash occurred. Some of these tools also provide limited information on the sequence of subprogram calls that eventually led to the crash.
Systems such as Abend-Aid(tm) from Compuware Corp. provide only the last instruction before a crash. Abend-Aid also provides information on the state of the system when it crashed. The state includes the final values of registers and memory locations.
Where multiple programs run on a computer system and call each other, some crash-analysis systems also provide information on the call sequence. In other words, the user can obtain the sequence of inter-program calls preceding the crash.
Several packages have existed for nearly two decades that provide address traces of programs. Examples include Henry, “Tracer-Address and Instruction Tracing for the VAX Architecture,” Unpublished Memo, University of California, Berkeley, November, 1984; Agarwal, Sites, and Horowitz, “ATUM: A New Technique for Capturing Address Traces Using Microcode,” In Proceedings of the 13th Annual Symposium on Computer Architecture, Pages 119-127, June 1986; and Ball and Larus, “Optimally Profiling and Tracing Programs,” TR #1031, September 1991, Computer Sciences Department, University of Wisconsin-Madison. These address tracing packages focus on creating address traces of complete program runs or of sampled intervals of program runs.
These tracing packages do not use a computer crash as the trigger for producing a backtrace. Because their main goal is to collect complete address traces, they are not designed to limit the storage space that the trace information requires, whether in memory or on disk, or to remain active while application programs run in production. Nor do they provide an integrated mechanism to correlate and display traced addresses with source-level statements to facilitate debugging of computer crashes.
Isolating the reason for a crash is somewhat easier when the crash happens during program development because the program can be compiled in debug mode and executed within a debugger. Within a debugger, the program is run slowly and more information is collected than during a normal production run, so that when the program crashes the user has more information with which to diagnose the problem.
Unfortunately, it is often difficult to reproduce a crash in debug mode, because of the difficulty of faithfully reproducing within a debug environment the set of events that led to a production run crash.
Within a debugger such as “gdb,” a user can stop the program at any point during its execution. Debuggers provide information on system state, such as program variable values at the halt point. By asking for a stack dump, the user can also obtain the sequence of function calls (if any) that led to the specific function within which the program is halted.
SUMMARY OF THE INVENTION
Unfortunately, existing technologies do not provide information on the specific sequence of instructions that were executed prior to the instruction that crashed or faulted. Discovering the exact sequence of instructions that executed prior to a crash is a difficult problem, made even harder when a program crashes in a production environment, because execution speed cannot be reduced significantly.
The present invention is a method for producing such a sequence of instructions, or a crash instruction trace. A crash instruction trace includes the instruction that crashed and some or all of the instructions that preceded it. If the crash instruction trace contains all of the instructions executed from the start of the program to the crash point, then this sequence of instructions is called the complete crash instruction trace.
The crash instruction trace can also contain information on the specific times at which each instruction was last executed, in which case the trace is called a time-stamped crash instruction trace. The availability of a crash instruction trace can facilitate isolating the problem that caused a crash, thereby speeding up the process of crash recovery or system stabilization.
A complete crash instruction trace can become very large. For example, a computer running 100 million instructions per second will produce 100 million trace entries per second, all of which must be recorded in a complete trace. Therefore, it is sometimes preferable to store a last instruction trace.
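To make this growth concrete, the following back-of-the-envelope sketch estimates the storage a complete trace would need. The 4-byte record size and the name complete_trace_bytes are assumptions made only for illustration; the text does not say how a trace entry is encoded.

```python
# Rough storage estimate for a complete crash instruction trace.
INSTRUCTIONS_PER_SECOND = 100_000_000  # rate given in the text
BYTES_PER_RECORD = 4  # ASSUMPTION: one 4-byte address per executed instruction


def complete_trace_bytes(seconds: int) -> int:
    """Bytes needed to record every instruction executed over `seconds`."""
    return INSTRUCTIONS_PER_SECOND * BYTES_PER_RECORD * seconds


print(complete_trace_bytes(1))     # one second of execution
print(complete_trace_bytes(3600))  # one hour of execution
```

Under that assumed record size, one second of execution needs 400 million bytes and one hour needs about 1.44 trillion bytes.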
A last instruction trace is a sequence of instructions sorted by the last time at which an instruction was executed. A last instruction trace contains each instruction at most once. Accordingly, the maximum size of the last instruction trace is bounded by the size of the program itself.
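The definition above can be sketched in a few lines of Python. This is only an illustration of the ordering rule, not the recording mechanism of the invention; the function name and the sample sequence are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def last_instruction_trace(executed: Iterable[Hashable]) -> List[Hashable]:
    """Return each executed instruction once, ordered by its last execution."""
    last_seen: Dict[Hashable, None] = {}
    for instruction in executed:
        # Drop any earlier entry so that re-insertion moves the
        # instruction to the end of the insertion-ordered dict.
        last_seen.pop(instruction, None)
        last_seen[instruction] = None
    return list(last_seen)


# Hypothetical execution sequence, used only to exercise the function.
print(last_instruction_trace(["X", "Y", "X", "Z", "Y"]))
```

The dictionary never holds more entries than there are distinct instructions, which mirrors the bound on trace size stated above.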
As an example, suppose a program contains the following eight instructions, each represented as a letter: A,B,C,D,E,F,G,H. Further suppose that during a successful execution of the program the execution sequence is A, B, C, F, G, F, G, F, G, F, G, B, C, F, G, F, G, F, G, F, G, H. For the purpose of the example, assume that the program starts at precisely 1 AM and that each instruction executes in 1 microsecond (μsec).
Now, suppose the program crashes at the last execution of the statement G. Then, the trace A, B, C, F, G, F, G, F, G, F, G, B, C, F, G, F, G, F, G, F, G is the complete crash instruction trace. B, C, F, G, F, G, F, G, F, G is a partial crash instruction trace. The corresponding last crash instruction trace is A, B, C, F, G.
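The partial trace in this example is the final ten entries of the complete trace. One way to obtain such a window is a fixed-size ring buffer, sketched below. The ring buffer is an assumption chosen for illustration; the text does not state how a partial trace is collected, and the name partial_trace is hypothetical.

```python
from collections import deque

# Execution sequence from the example, ending at the crashing execution of G.
COMPLETE_TRACE = "A B C F G F G F G F G B C F G F G F G F G".split()


def partial_trace(executed, capacity):
    """Keep only the most recent `capacity` instructions.

    ASSUMPTION: a fixed-size ring buffer; the source does not specify this.
    """
    window = deque(maxlen=capacity)
    for instruction in executed:
        # Once the buffer is full, appending discards the oldest entry.
        window.append(instruction)
    return list(window)


print(partial_trace(COMPLETE_TRACE, 10))
```

With a capacity of 10, the result is B, C, F, G, F, G, F, G, F, G, which matches the partial crash instruction trace given above.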
The time-stamped crash instruction trace is:
Inst.   Timestamp
A       1AM
B       1AM + 1 μsec
C       1AM + 2 μsecs
F       1AM + 3 μsecs
G       1AM + 4 μsecs
F       1AM + 5 μsecs
G       1AM + 6 μsecs
F       1AM + 7 μsecs
G       1AM + 8 μsecs
F       1AM + 9 μsecs
G       1AM + 10 μsecs
B       1AM + 11 μsecs
C       1AM + 12 μsecs
F       1AM + 13 μsecs
G       1AM + 14 μsecs
F       1AM + 15 μsecs
G       1AM + 16 μsecs
F       1AM + 17 μsecs
G       1AM + 18 μsecs
F       1AM + 19 μsecs
G       1AM + 20 μsecs
The last time-stamped crash instruction trace is:
Inst.   Timestamp
A       1AM
B       1AM + 11 μsecs
C       1AM + 12 μsecs
F       1AM + 19 μsecs
G       1AM + 20 μsecs
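Both time-stamped traces can be derived mechanically from the complete trace. The sketch below does so under the assumptions of the example (one instruction per microsecond, starting at 1 AM). The function names are hypothetical and the code illustrates the definitions only, not the recording mechanism of the invention.

```python
# Execution sequence from the example, ending at the crashing execution of G.
COMPLETE_TRACE = "A B C F G F G F G F G B C F G F G F G F G".split()


def time_stamped_trace(executed):
    """Pair each executed instruction with its offset in microseconds from program start."""
    return [(instruction, offset) for offset, instruction in enumerate(executed)]


def last_time_stamped_trace(executed):
    """Keep one entry per instruction, holding the offset of its last execution."""
    last_offset = {}
    for instruction, offset in time_stamped_trace(executed):
        last_offset[instruction] = offset  # later executions overwrite earlier ones
    # Order the surviving entries by the time of last execution.
    return sorted(last_offset.items(), key=lambda entry: entry[1])


for instruction, offset in last_time_stamped_trace(COMPLETE_TRACE):
    print(f"{instruction}: 1AM + {offset} microseconds")
```

The offsets produced for A, B, C, F, and G are 0, 11, 12, 19, and 20 microseconds, in agreement with the last time-stamped crash instruction trace listed above.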
Other types of traces, such as a first instruction trace, can also be stored. Like the last instruction trace, the first instruction trace contains only one reference to each instruction. However, unlike the last instruction trace, it stores the sequence of instructions in the order in which they were first referenced.
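A first instruction trace can be sketched the same way. The function name and the sample sequence below are hypothetical; the sequence is chosen so that the first and last orderings differ.

```python
def first_instruction_trace(executed):
    """Return each executed instruction once, ordered by its first execution."""
    # dict.fromkeys keeps the first occurrence of every key in insertion order.
    return list(dict.fromkeys(executed))


# Hypothetical execution sequence, used only to exercise the function.
sequence = ["A", "B", "C", "B", "A"]
print(first_instruction_trace(sequence))
```

For this sequence the first instruction trace is A, B, C, whereas ordering the same instructions by last execution would give C, B, A.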
Instruction traces can be important for purposes other than crash recovery, such as performance tuning and debugging, in which case some system event or program event or termination condition can trigger the writing out of an instruction trace. The present invention applies to all of these event types. In this more general case, the instruction tr
Agarwal Anant
Ayers Andrew E.
Schooler Richard
Hamilton Brook Smith & Reynolds P.C.
Holmes Michael B.
Incert Software Corporation
Powell Mark R.