High availability computer system and methods related thereto

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Patent

Details

714/3, 714/710, G06F 11/00

Patent

active

061227560

DESCRIPTION:

BRIEF SUMMARY
FIELD OF INVENTION

The present invention relates to computer systems and more particularly to a high availability computer system that automatically senses, diagnoses and de-configures/re-configures a faulted computer system to improve availability as well as related methods for providing high availability.


BACKGROUND OF THE INVENTION

When procuring a computer system in a business environment, an important factor considered is the availability of the computer to perform/operate. This can affect profitability as well as work/job performance. There are four basic design concepts used alone or in combination to improve availability.
One design technique is commonly referred to as "fault tolerant." A computer system employing this technique is designed to withstand a hard fault that could shut down another type of computer system. Such a design typically involves replicating hardware and software so that an application program runs simultaneously on multiple processors. In this way, if a hard fault occurs in one processor or subsystem, the application program running in the other processor(s)/subsystem(s) still provides an output. Thus, as far as the user is concerned, the computer system has performed its designated task. In addition to multiple processors, a voting scheme can be implemented, whereby the outputs from the multiple processors are compared to determine the correct output.
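By way of illustration only, the voting scheme mentioned above can be sketched as a strict-majority comparison over the outputs of the replicated processors. This is a minimal sketch, not a description of any particular fault tolerant system; the function names `majority_vote` and `faulty_replicas` are hypothetical and do not come from the source.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the output produced by a strict majority of replicas.

    Returns None when no single output is shared by more than half of
    the replicas, i.e. when the vote cannot determine a correct output.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None


def faulty_replicas(outputs: Sequence[Hashable], agreed: Hashable) -> List[int]:
    """Return the indices of replicas whose output differs from the agreed one."""
    return [i for i, out in enumerate(outputs) if out != agreed]


if __name__ == "__main__":
    # Three replicas ran the same computation; replica 2 suffered a fault.
    results = [42, 42, 7]
    agreed = majority_vote(results)
    print(agreed, faulty_replicas(results, agreed))
```

With three replicas, a single faulted processor is outvoted by the other two. If all three disagree, the sketch returns `None` rather than guessing.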
Fault tolerant systems are complex, essentially require multiple independent processing systems and, as such, are very expensive. Further, although the system is fault tolerant, once a fault occurs a service representative must still arrive on site to diagnose and repair the faulted path/sub-system. This makes maintenance expensive.
Another technique involves designing components such that they are highly reliable and, therefore, unlikely to fail during an operational cycle. This technique is common for space, military and aviation applications where size and weight limitations of the intended use (e.g., a satellite) typically restrict the available design techniques. Highly reliable components are typically expensive, and the maintenance activities needed to preserve these design characteristics are also expensive.
Such expenses may make a computer system commercially unacceptable for a given application. In any event, once a system has a failure, a service representative must be dispatched to diagnose and repair the failed system. When dealing with military/aviation applications, the vehicle/item housing the failed component must be brought to a repair facility. Until the system is repaired, however, it is unavailable. This increases maintenance costs and makes such repair/replacement activities critical path issues.
A third technique involves clustering multiple independent computer systems together such that when one computer system fails, its work is performed by any one of the other systems in the cluster. This technique is limited to those applications where there are, or there is a need for, a number of independent systems. It is not usable for a stand-alone system. Also, in order for this type of system to work, each independent computer system must be capable of accessing the data and application program of any of the systems in the cluster. For example, a central data storage device (e.g., a hard drive) is provided that can be accessed by any of the computer systems. In addition to its limited applicability, the foregoing is complex, expensive and raises data security issues.
A fourth technique involves providing redundant power supplies and blowers. Thus, the failure of a blower or power supply does not result in shutdown of the computer system. However, providing redundancy for other computer system components is not viable because a service representative must be brought in to diagnose the cause of failure so the machine can be repaired and returned to operability.
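By way of illustration only, the clustering behavior described above can be sketched as reassigning the work of a failed system to the surviving systems. This is a minimal sketch under stated assumptions: the `Node` class and `fail_over` function are hypothetical, jobs are represented by names only, and every node is assumed to be able to reach the shared data and application programs (the central storage device mentioned above). The least-loaded placement rule is likewise an assumption, not something stated in the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One independent computer system in the cluster (hypothetical model)."""
    name: str
    healthy: bool = True
    jobs: List[str] = field(default_factory=list)


def fail_over(cluster: List[Node], failed_name: str) -> Dict[str, List[str]]:
    """Mark a node as failed and move its jobs to the least-loaded survivors.

    Assumes all nodes can access the shared storage, so any survivor can
    pick up any job. Returns the job assignment of the surviving nodes.
    """
    failed = next(n for n in cluster if n.name == failed_name)
    failed.healthy = False
    survivors = [n for n in cluster if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy node left in the cluster")
    orphaned, failed.jobs = failed.jobs, []
    for job in orphaned:
        # Assumed policy: give each orphaned job to the least-loaded survivor.
        target = min(survivors, key=lambda n: len(n.jobs))
        target.jobs.append(job)
    return {n.name: list(n.jobs) for n in survivors}


if __name__ == "__main__":
    cluster = [Node("a", jobs=["j1", "j2"]), Node("b"), Node("c", jobs=["j3"])]
    print(fail_over(cluster, "a"))
```

The sketch also shows why the technique does not apply to a stand-alone system: with no surviving node, there is nowhere to move the work, and the function raises an error.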
The fourth technique also has included providing a computer system with a mechanism to automatically re-boot the system following a system crash or hang.
