System and method for computer operating system protection

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S039000, C714S042000, C712S244000, C711S163000, C711S147000

active

06240531

ABSTRACT:

BACKGROUND OF THE INVENTION
The invention relates to a system and method for protecting a computer operating system from unexpected errors, and more particularly to a system and method for improving application stability under the Microsoft WINDOWS operating system.
Multitasking, graphics-based operating systems such as Microsoft WINDOWS 95 demand a high degree of expertise from an application programmer. The difficulties inherent in writing synchronized program code in an event-driven, multitasking environment, coupled with a vast and changing system application program interface (“API”) consisting of thousands of functions, inevitably result in software programs that contain errors, or “bugs,” at several points. Even if an application program is tested relatively thoroughly, some portions of the program code may not be sufficiently exercised to locate the errors. And even if the erroneous portion is executed during testing, it may cause seemingly benign errors that pass undetected.
User input to software, through the keyboard, mouse, etc., is frequently unpredictable. Because of this, an application may attempt to process a combination of parameters that was not anticipated by the programmer. In this case, too, the program may respond in a benign manner, or in some circumstances may cause certain regions of memory to be inadvertently altered, or “corrupted.” Those memory regions might “belong” to the program being executed, or might belong to the operating system or another loaded program. Similarly, the corrupted regions might include important data, or they might be unallocated storage. It is generally not possible to determine, in advance, what regions of memory a defective program might attempt to access.
In some circumstances, a programming error may trigger a CPU exception if the program attempts to perform an illegal operation. A CPU exception is the central processing unit's response to an error condition, whether expected or unexpected. For example, an attempt to perform an undefined mathematical operation (such as dividing by zero), an attempt to access a memory location that does not exist, or an attempt to execute code that does not satisfy the CPU's syntax requirements, will typically result in a CPU exception. However, not all CPU exceptions result in a “crash” of the system. A CPU exception will cause a software interrupt. That is, when a CPU exception is encountered, processing immediately stops and is transferred to another program location.
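The transfer of control described above can be pictured with a toy model. The following Python sketch is illustrative only and is not taken from the patent: the names ToyCpu, CpuException, install_handler and report_divide_error are hypothetical, and a real processor dispatches through hardware-defined tables rather than a Python dictionary.

```python
from enum import Enum, auto


class CpuException(Enum):
    """Hypothetical exception kinds, named after the examples in the text."""
    DIVIDE_ERROR = auto()      # undefined mathematical operation
    INVALID_ADDRESS = auto()   # access to a memory location that does not exist
    INVALID_OPCODE = auto()    # code the CPU cannot decode


class ToyCpu:
    """Toy model: a table maps each exception kind to a handler routine."""

    def __init__(self):
        self.vector_table = {}  # exception kind -> handler function
        self.log = []           # messages recorded by handlers

    def install_handler(self, kind, handler):
        self.vector_table[kind] = handler

    def raise_exception(self, kind, faulting_pc):
        # Normal processing stops here and control moves to the handler.
        handler = self.vector_table.get(kind)
        if handler is None:
            self.log.append(f"unhandled {kind.name} at {faulting_pc:#x}")
            return "crash"
        return handler(self, faulting_pc)

    def divide(self, pc, a, b):
        # Dividing by zero triggers an exception instead of producing a value.
        if b == 0:
            return self.raise_exception(CpuException.DIVIDE_ERROR, pc)
        return a // b


def report_divide_error(cpu, pc):
    """Example handler: record an error message rather than crash."""
    cpu.log.append(f"divide error at {pc:#x}")
    return "error reported"
```

In this model, calling divide with a zero divisor before any handler is installed returns "crash", while the same call after install_handler(CpuException.DIVIDE_ERROR, report_divide_error) returns "error reported". This mirrors the point that not every CPU exception results in a crash.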
That other program location can contain a segment of program code designed to take whatever action is intended by an operating system programmer. For example, an error message can be presented to the operator. Alternatively, if the CPU exception was expected, then other processing can be performed. Such an exception-handling scheme is used in Microsoft WINDOWS and other operating systems to handle “virtual memory,” in which disk storage is used to virtually increase the amount of system memory. Some of the contents of system memory are “swapped out” to disk and removed from memory. Upon a later attempt to access those contents, a CPU exception will occur because the contents sought do not exist within system memory. The operating system will then handle that expected CPU exception condition, bring the contents back into system memory, and allow the operation to proceed.
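The virtual memory scheme described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the mechanism used by WINDOWS or claimed in the patent: ToyVirtualMemory, PageFault and UnrecoverableFault are hypothetical names, pages are modeled as dictionary entries, and the "CPU exception" is modeled as a Python exception.

```python
class PageFault(Exception):
    """Stands in for the CPU exception raised when contents are not in memory."""

    def __init__(self, page):
        super().__init__(f"page {page} not resident")
        self.page = page


class UnrecoverableFault(Exception):
    """Stands in for an exception that reflects a genuine error."""


class ToyVirtualMemory:
    def __init__(self):
        self.resident = {}  # page number -> contents held in system memory
        self.on_disk = {}   # page number -> contents swapped out to disk
        self.faults = 0     # number of expected faults that were serviced

    def swap_out(self, page):
        # Move the contents to disk and remove them from memory.
        self.on_disk[page] = self.resident.pop(page)

    def _raw_read(self, page):
        # A direct access fails if the page is not currently resident.
        if page not in self.resident:
            raise PageFault(page)
        return self.resident[page]

    def _handle_fault(self, page):
        # Expected case: the contents exist on disk, so bring them back.
        if page not in self.on_disk:
            raise UnrecoverableFault(f"page {page} was never allocated")
        self.resident[page] = self.on_disk.pop(page)
        self.faults += 1

    def read(self, page):
        try:
            return self._raw_read(page)
        except PageFault as fault:
            self._handle_fault(fault.page)
            return self._raw_read(page)  # retry the original operation
```

A read of a swapped-out page succeeds transparently after one serviced fault, whereas a read of a page that was never allocated raises UnrecoverableFault. That second case corresponds to an exception that reflects an actual malfunction.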
Most complex operating systems, including Microsoft WINDOWS 95, use CPU exception handling techniques in performing a wide variety of operations. Even so, in many cases, a CPU exception will reflect an error or malfunction. In such cases, the operating system will typically not be able to correct the malfunction, and can only present an error message (typically a cryptic one, useless to all but the most experienced and knowledgeable programmers) to the computer operator.
Depending on the nature of the malfunction, and the action, if any, that the operating system takes in an attempt to block or remedy the malfunction, the offending program can behave in one of several ways. The system may stop executing and appear to be deadlocked. The application may continue executing despite the possibility that important data has been corrupted. The application may be shut down by the operating system, or may so adversely affect the operating system itself that the computer must be restarted with an accompanying loss of data.
One goal of operating system design is to minimize the possibility of data loss, and the general trend for the most advanced operating systems, such as Microsoft WINDOWS NT, has been to shield (as far as possible) the memory regions containing the operating system's code and data from the reach of an application program. In other words, an application program can alter itself and its own data, but would be entirely unable to affect any other portion of the system, including other application programs and the operating system itself.
However, a rigorous implementation of this architecture may not be feasible in a mass-market operating system which is designed to operate on lower-cost systems, which typically have slower CPUs and tighter system memory constraints. Therefore, the Microsoft WINDOWS 95 operating system, which substantially retains the memory architecture of earlier versions of WINDOWS, remains highly susceptible to many types of program errors. In fact, it is relatively easy to write code that will crash the operating system.
One program of this kind is discussed in Schulman, Unauthorized Windows 95 (IDG Books 1994), and is available from //ftp.ora.com/pub/examples/windows/win95.update/unauthw.html. This program, RANDRW, purports to measure the susceptibility of various operating systems to serious program errors. According to its author, RANDRW makes random memory accesses across the memory range of the system. An access is deemed a “hit” if it is allowed to proceed without being blocked by the operating system. In the WINDOWS 95 environment, Schulman reported a hit rate of approximately 1.5%, indicating that improper accesses were being allowed to occur. It should be noted that the 4 gigabyte address space in which WINDOWS 95 runs is generally about 90% unused and uncommitted, so that the 1.5% hit rate within the 4 gigabyte range translates into a much larger percentage of wrongful memory access and data corruption.
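The scaling argument in the preceding paragraph can be made concrete with a short calculation. The sketch below is an illustration, not a figure from Schulman or the patent: it assumes that every hit lands in committed memory (accesses to uncommitted addresses are always blocked) and that the random accesses are spread uniformly over the address space. The function name hit_rate_within_committed is hypothetical.

```python
def hit_rate_within_committed(overall_hit_rate, unused_fraction):
    """Rescale a hit rate measured over the whole address space so that it
    is expressed relative to the committed portion only.

    Assumes uniform random accesses and that no access to unused,
    uncommitted memory is ever allowed to proceed.
    """
    if not 0.0 <= overall_hit_rate <= 1.0:
        raise ValueError("overall_hit_rate must be between 0 and 1")
    committed_fraction = 1.0 - unused_fraction
    if committed_fraction <= 0.0 or committed_fraction > 1.0:
        raise ValueError("unused_fraction must be at least 0 and below 1")
    return overall_hit_rate / committed_fraction


# Figures quoted in the text: about 1.5% hits, about 90% of the space unused.
rate = hit_rate_within_committed(0.015, 0.90)
print(f"{rate:.0%}")  # prints 15%
```

Under these assumptions, roughly 15% of the random accesses that fall within committed memory are allowed to proceed, which is ten times the headline 1.5% figure.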
A breakdown of RANDRW memory accesses by address has shown that almost all of the core WINDOWS system components are susceptible to being corrupted in this way. The ease with which a 32-bit application program can affect critical system memory is especially alarming because the entire address range of the processor, including the address ranges occupied by critical system components, is within the accidental reach of the program. Older 16-bit programs are able to reach a narrower extent of system resources, but are still able to cause serious damage.
Unfortunately, it is practically impossible to predict the manner of a malfunction. When one occurs, it is correspondingly difficult to remedy the malfunction so that the program that caused it is able to proceed. If there is an isolated stray access, it may be possible to block the access with no appreciable effect on the program. More likely, an application program was attempting to perform a certain operation when it went awry, and its failure to accomplish the operation will affect further operations. Hence, one fault results in another, and the entire course of the program is altered. In certain circumstances, the CPU context of the program may become damaged. For example, an unbalanced stack may cause the stack pointer to be reset, thereby making continued execution of the program impossible and a haphazard restoration of the CPU context unavailing. A side effect of this latter kind of error is that fault handlers built into the program (even those outside of the application program but executing at the same CPU privilege level as the program) will probably also be unable to execute or will themselves malfunction in the
