Method for identifying and correcting errors in a central...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate


Details

C714S047300, C714S704000, C710S266000

active

06202174

ABSTRACT:

FIELD OF THE INVENTION
This invention relates to a method for identifying errors in a programmed digital computer and for correcting the identified errors. In particular, this invention relates to a method for monitoring instructions and data that cause errors, for analyzing the monitored instructions and data to predict errors, and for preventing future errors from occurring, for example by inserting corrective software.
BACKGROUND OF THE INVENTION
MICROSOFT Corporation's Dr. Watson is a debugging tool that logs information regarding internal operations of the operating system “WINDOWS” into a failure report. Dr. Watson logs the information after any application software (typically called just an “application”) encounters an error that MICROSOFT calls an “unrecoverable application error (UAE).” See, for example, “An Annotated Dr. Watson Log File,” KB:Windows SDK KBase, Microsoft Development Library, MICROSOFT Corporation, One Microsoft Way, Redmond, Wash.; “Postmortem Debugging,” Matt Pietrek, Dr. Dobb's Journal, September 1992; and “Exception Handlers and Windows Applications,” Joseph Hlavaty, Dr. Dobb's Journal, September 1994; all of which are incorporated by reference herein in their entirety.
Briefly, a Dr. Watson failure report contains information on (1) the name of the application that failed, (2) the error encountered, such as “Exceed Segment Bounds (Read),” (3) the address of the instruction at which the failure occurred, (4) the instruction that caused the failure, (5) the contents of various registers, such as CPU registers, instruction pointer (also called “program counter”), stack pointer, base pointer, code segment selector, stack segment selector, data segment selector, extra segment selector, 32-bit registers and flag bits (e.g. Overflow bit, Direction bit, Sign bit, Zero bit, Carry bit, Interrupt bit, Auxcarry bit and Parity bit), (6) WINDOWS installation and environment information, (7) stack frame information, such as disassembled instructions surrounding the failed instruction and several levels of nested function calls leading to the failed instruction, (8) the names of all tasks present when the failure occurred, and (9) the user response typed into a “Dr. Watson's Clues” dialog box.
MICROSOFT Corporation recommends that a user exit WINDOWS after a UAE occurs and, if exiting is not possible, restart the personal computer. See “The DrWatson and MSD Diagnostics,” KB:Windows 3.x KBase, Microsoft Development Library, MICROSOFT Corporation, One Microsoft Way, Redmond, Wash., also incorporated by reference herein in its entirety. MICROSOFT Corporation further recommends that after a UAE occurs, the user run MICROSOFT DIAGNOSTICS (MSD), which identifies system configuration information, such as the BIOS, video card type, manufacturer, installed processor(s), I/O port status, operating system version, environment settings, hardware devices attached, and additional software running concurrently with MSD. Id. All of these actions can result in the loss of valuable data, as well as of valuable time before a user can continue using the application.
MICROSOFT Corporation also recommends that after logging several UAEs, the user send the log to MICROSOFT Corporation, although MICROSOFT Corporation cannot respond to log contributors. Id. Therefore, the user receives no assistance in identifying the problem that caused the UAE or in fixing the application to avoid that particular UAE in the future. Moreover, Dr. Watson appears to log only an application's UAE failures, and cannot be used for debugging other errors, such as errors in the operating system or errors in hardware.
Errors in hardware can be debugged using a built-in “debug” port of the type present in INTEL's P6 (also called “Pentium Pro”) microprocessor. INTEL recommends the P6's debug port as an aid for designing a system board on which the CPU is mounted. See, for example, “Intel equips its P6 with test and debug features,” Electronic Engineering Times, Oct. 16, 1995, n870, pages 1-2, which is incorporated by reference herein in its entirety.
Briefly, the P6 debug port is typically connected to an “in-target probe” (ITP) via a 30-pin connector, and allows access to boundary-scan (JTAG) and built-in-self-test (BIST) structures on the P6 microprocessor. Through an ITP such as the ICE-16 available from, for example, American Arium, Tustin, Calif., board designers can control program execution, set breakpoints, and monitor the P6's accesses to registers, memory and input-output devices.
However, a typical user has neither access to an ITP nor the expertise needed to use one. Therefore, the user is still unable to identify the problem that causes a UAE and unable to fix the application to avoid known UAEs in the future.
SUMMARY
In accordance with the invention, a central processing unit (CPU) repeatedly interrupts execution of software to save the CPU state, i.e. the contents of various storage elements internal to the CPU, until an error occurs during the execution. On occurrence of the error, the CPU saves the state once again and only then passes control to a handler in the software for handling the error. Each time, the CPU state is saved at memory locations different from those used the previous time, so that a sequence of CPU states has been saved by the time control passes to the handler. The storage elements whose contents are saved can be of two types: (1) accessible, and (2) inaccessible to the executing software, such as an operating system or an application. Moreover, the above-described state saving steps can be implemented, in different embodiments of the invention, in hardware (e.g. as a state machine) or in software (e.g. in the basic-input-output-system (BIOS), in an operating system, as a device driver, or as a utility). In one specific embodiment, the state saving steps are implemented in a computer process by use of x86 instructions.
The x86 instructions are instructions executable by microprocessors compatible with microprocessors in the 8086, 80286, 80386, 80486, Pentium and Pentium Pro (P6) families of microprocessors available from Intel Corporation, Santa Clara, Calif.
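The state saving sequence described in this Summary can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the patented implementation: the names `CpuState`, `StateRecorder`, `run`, `inc` and `fault` are hypothetical, the "CPU" is a Python dictionary of two made-up registers, and the bounded history (oldest snapshot discarded when full) is an assumption, since the text only requires that each save go to a different memory location.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass(frozen=True)
class CpuState:
    """Hypothetical snapshot of the simulated registers at one step."""
    step: int
    registers: Dict[str, int]


class StateRecorder:
    """Keeps the most recent snapshots, each in its own slot (assumed bounded)."""

    def __init__(self, capacity: int = 4) -> None:
        self.history: Deque[CpuState] = deque(maxlen=capacity)

    def save(self, step: int, registers: Dict[str, int]) -> None:
        # Copy the registers so later execution cannot alter the snapshot.
        self.history.append(CpuState(step, dict(registers)))


def run(program: List[Callable[[Dict[str, int]], None]],
        recorder: StateRecorder,
        handler: Callable[[Exception, List[CpuState]], None],
        save_every: int = 2) -> None:
    """Execute operations, saving state periodically and once more on error."""
    registers = {"ip": 0, "acc": 0}
    for step, op in enumerate(program):
        if step % save_every == 0:
            recorder.save(step, registers)      # periodic save
        try:
            op(registers)
        except Exception as exc:
            recorder.save(step, registers)      # final save, before the handler
            handler(exc, list(recorder.history))
            return
        registers["ip"] += 1


def inc(registers: Dict[str, int]) -> None:
    """Example operation: increment the accumulator."""
    registers["acc"] += 1


def fault(registers: Dict[str, int]) -> None:
    """Example operation that simulates an error during execution."""
    raise ZeroDivisionError("simulated fault")
```

Calling `run([inc, inc, inc, fault], StateRecorder(), handler)` hands the handler the snapshots taken before the fault plus the one taken at the fault itself, which is the sequence a developer would replay off-line.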
In one embodiment, errors are debugged off-line in a development system, for example, by use of an in-circuit emulator to load the saved CPU states sequentially into the development system, thereby to recreate the error condition. If the saved CPU states are spaced too coarsely to find the source of the error, the CPU states can be saved more frequently, e.g. after shorter time periods, on every jump instruction, on every input-output instruction, on every function-call instruction, or on some combination of these events, depending on one or more flags. The flags can be set, for example, in a configuration file that is checked at the startup of the computer process. The sequence of saved CPU states allows recreation of error conditions that was not possible in the prior art. Moreover, the CPU states are saved transparently to the software, thereby allowing recreation of errors in an operating system as well as errors from interaction between the operating system and an application, neither of which was possible in the prior art.
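The flag-controlled save frequency described above can be sketched as follows. This is an illustration only: the INI-style file layout, the section name `state_saving`, the option names, the default period of 100 ms, and the functions `load_flags` and `should_save` are all assumptions made for the example, since the text does not specify a configuration format.

```python
import configparser
from dataclasses import dataclass


@dataclass
class SaveFlags:
    """Hypothetical flags selecting which events trigger a state save."""
    timer_period_ms: int = 100
    on_jump: bool = False
    on_io: bool = False
    on_call: bool = False


def load_flags(config_text: str) -> SaveFlags:
    """Parse flags from configuration text, as would be done once at startup."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = "state_saving"
    # The fallback values apply when the section or option is absent.
    return SaveFlags(
        timer_period_ms=parser.getint(section, "timer_period_ms", fallback=100),
        on_jump=parser.getboolean(section, "on_jump", fallback=False),
        on_io=parser.getboolean(section, "on_io", fallback=False),
        on_call=parser.getboolean(section, "on_call", fallback=False),
    )


def should_save(flags: SaveFlags, event: str) -> bool:
    """Decide whether a given event kind triggers a state save."""
    triggers = {
        "timer": True,            # timer expiry always saves
        "jump": flags.on_jump,
        "io": flags.on_io,
        "call": flags.on_call,
    }
    return triggers.get(event, False)
```

Shortening `timer_period_ms` or enabling `on_jump` in such a file would make the saved sequence finer grained at the cost of more frequent interruptions, which is the trade-off the paragraph describes.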
In accordance with the invention, an error can also be debugged proactively by a computer process, even before the error occurs, by use of a number of known-to-be-erroneous instructions and fix instructions corresponding to the known-to-be-erroneous instructions. In one embodiment, the CPU compares instructions to be executed with each of the known-to-be-erroneous instructions and, on finding a match, injects the corresponding fix instructions into the to-be-executed instructions. In this embodiment, these proactive error debugging steps are executed by the state saving process, optionally depending on a flag that is set or cleared, for example, in a configuration file. In another embodiment, the proactive error debugging steps are implemented in a different process that executes independently of the state saving process, i.e. does not save CPU states.
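The compare-and-inject step can be sketched as a table lookup over the instructions about to be executed. The sketch below makes several assumptions that are not in the text: instructions are represented as plain strings, the mnemonics are placeholders rather than real x86 opcodes, a matched instruction is replaced by its fix sequence (the text says only that fix instructions are injected), and the function name `apply_fixes` is hypothetical.

```python
from typing import Dict, List, Sequence


def apply_fixes(instructions: Sequence[str],
                known_fixes: Dict[str, List[str]],
                enabled: bool = True) -> List[str]:
    """Return the instruction stream with fix instructions injected.

    `known_fixes` maps each known-to-be-erroneous instruction to its fix
    sequence. `enabled` models the optional flag mentioned in the text.
    """
    if not enabled:
        return list(instructions)
    patched: List[str] = []
    for instruction in instructions:
        # On a match, emit the fix sequence; otherwise pass through unchanged.
        patched.extend(known_fixes.get(instruction, [instruction]))
    return patched


# Placeholder mnemonics, for illustration only.
EXAMPLE_FIXES = {"BAD_OP": ["SAFE_OP_A", "SAFE_OP_B"]}
EXAMPLE_STREAM = ["LOAD", "BAD_OP", "STORE"]
```

With these placeholders, `apply_fixes(EXAMPLE_STREAM, EXAMPLE_FIXES)` yields `["LOAD", "SAFE_OP_A", "SAFE_OP_B", "STORE"]`, while passing `enabled=False` returns the stream untouched.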
Therefore, well-known errors, e.g. the 80286 jump bug or the PENTIUM
