Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2001-03-01
2004-12-14
Beausoliel, Robert (Department: 2184)
C714S048000
active
06832342
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and apparatus for managing data in a configurable data processing system. Still more particularly, the present invention provides a method and apparatus for reducing the amount of data collected for analyzing errors in a configurable data processing system.
2. Description of Related Art
A logical partitioning option (LPAR) within a data processing system (platform) allows multiple copies of a single operating system (OS) or multiple heterogeneous operating systems to run simultaneously on a single data processing system platform. A partition, within which an operating system image runs, is assigned a non-overlapping subset of the platform's resources. These platform allocable resources include one or more architecturally distinct processors with their interrupt management area, regions of system memory, and I/O adapter bus slots. The partition's resources are represented to the OS image by the partition's own open firmware device tree.
Each distinct OS or image of an OS running within the platform is protected from the others, such that software errors on one logical partition cannot affect the correct operation of any of the other partitions. This protection is provided by allocating a disjoint set of platform resources to be directly managed by each OS image and by providing mechanisms for ensuring that an image cannot control any resources that have not been allocated to it. Furthermore, software errors in the control of an OS's allocated resources are prevented from affecting the resources of any other image. Thus, each image of the OS (or each different OS) directly controls a distinct set of allocable resources within the platform.
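The disjoint allocation described above can be illustrated with a minimal sketch. It is not taken from the patent: the function names `assign_resources` and `may_control`, and the partition and resource labels, are hypothetical and chosen only for illustration.

```python
def assign_resources(partitions):
    """Build a resource -> owning-partition map, rejecting any overlap.

    `partitions` maps a (hypothetical) partition name to the set of
    resource names allocated to it.
    """
    owner = {}
    for partition, resources in partitions.items():
        for resource in resources:
            if resource in owner:
                # A resource may belong to at most one partition.
                raise ValueError(
                    f"{resource} already assigned to {owner[resource]}"
                )
            owner[resource] = partition
    return owner


def may_control(owner, partition, resource):
    """An image may control a resource only if it was allocated to it."""
    return owner.get(resource) == partition


# Illustrative allocation: processors, memory regions, and I/O slots.
owner = assign_resources({
    "lpar1": {"cpu0", "mem0", "slot0"},
    "lpar2": {"cpu1", "mem1", "slot1"},
})
```

In this sketch an overlapping allocation is rejected when the map is built, and every later control request is checked against that map. A real platform would enforce both in firmware or hardware.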
Each of these partitions has one or more processors associated with it. When an error, such as a system checkstop, occurs, a common service processor (CSP) function is employed to perform what is called a scan dump routine. When invoked, this routine collects data such as all possible scan rings, array data, and trace arrays. This data is stored in a nonvolatile random access memory (NVRAM) for later analysis. As systems become more complex, more data is needed for error analysis. As a result, room in the NVRAM is used up quickly. To gain more space, persistent storage, such as a hard drive, may be employed. Sometimes, even that space is insufficient. Another problem is that the amount of time needed to collect the information increases.
Therefore, it would be advantageous to have an improved method, apparatus, and computer implemented instructions for collecting data used in error analysis.
SUMMARY OF THE INVENTION
The present invention solves these problems by providing a method, apparatus, and computer implemented instructions for processing an error in a multiprocessor data processing system. An error is detected within the data processing system. The chip causing the error is identified from among a plurality of chips to form an identified chip. Data is collected from the identified chip and from hardware associated with the identified chip.
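The steps in this summary can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the `Chip` class, its fields (`error_latched`, `scan_data`, `associated`), and the sample chip names are all hypothetical. The summary does not say how the failing chip is identified or how associated hardware is determined, so a single error flag and a list of names stand in for both.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chip:
    """Hypothetical model of one chip; every field is illustrative."""
    name: str
    error_latched: bool = False   # stands in for an error indicator
    scan_data: bytes = b""        # stands in for scan rings, arrays, traces
    associated: List[str] = field(default_factory=list)  # related hardware


def identify_failing_chip(chips):
    """Return the first chip whose error indicator is set, or None."""
    for chip in chips.values():
        if chip.error_latched:
            return chip
    return None


def full_scan_dump(chips):
    """Baseline behavior: collect data from every chip."""
    return {name: chip.scan_data for name, chip in chips.items()}


def reduced_scan_dump(chips):
    """Collect data from the identified chip and its associated hardware."""
    failing = identify_failing_chip(chips)
    if failing is None:
        return {}
    targets = [failing.name] + [n for n in failing.associated if n in chips]
    return {name: chips[name].scan_data for name in targets}


# Illustrative system: two processors, each with an associated cache chip.
chips = {
    "cpu0": Chip("cpu0", scan_data=b"\x00" * 64, associated=["l2_0"]),
    "cpu1": Chip("cpu1", error_latched=True, scan_data=b"\x01" * 64,
                 associated=["l2_1"]),
    "l2_0": Chip("l2_0", scan_data=b"\x02" * 32),
    "l2_1": Chip("l2_1", scan_data=b"\x03" * 32),
}

full_size = sum(len(v) for v in full_scan_dump(chips).values())
reduced = reduced_scan_dump(chips)
reduced_size = sum(len(v) for v in reduced.values())
```

With this sample data the full dump holds 192 bytes and the reduced dump holds 96 bytes, because only `cpu1` and `l2_1` are read. The sizes are arbitrary; the point is that the amount of data collected scales with the failing chip's neighborhood rather than with the whole system.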
Fields, Jr. James Stephen
Lim Michael Youhour
Reick Kevin F.
Chu Gabriel L
McBurney Mark E.
Walder, Jr. Stephen J.
Yee Duke W.
Method and apparatus for reducing hardware scan dump data