Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Reexamination Certificate
2000-03-22
2003-06-24
Iqbal, Nadeem (Department: 2184)
active
06584586
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an apparatus and method that facilitates the capture and transfer of the internal activity of a computer system. More specifically, the present invention relates to an expansion card that, when installed in a computer system, tracks such internal activities as, e.g., process flow, memory state, and bus activity, and exports a record of such internal activities to a separate system.
2. Description of the Related Art
Evolution of Computer Architecture. Early computer systems included a processor (or CPU), random access memory (RAM), and certain peripheral devices such as a floppy drive, a keyboard and a display. These components were typically coupled together using a network of address, data and control lines, commonly referred to as a “bus.” As computer technology evolved, it became common to connect additional peripheral devices to the computer through ports (such as a parallel port or a serial port), or by attaching the peripheral devices (e.g. an expansion card) to sockets on the main system circuit board (or “motherboard”) which were connected to the system bus. One early bus that still is in use today is the Industry Standard Architecture (ISA) bus. The ISA bus, as the name implies, was a bus standard adopted by computer manufacturers to permit the manufacturers of peripheral devices to design devices that would be compatible with most computer systems. The ISA bus includes 16 data lines and 24 address lines and operates at a clock speed of 8 MHz. A large number of peripheral components have been developed over the years to operate with the ISA protocol.
The components which connect to a given bus receive data from the other components on the same bus via the bus signal lines. Selected components may operate as “bus masters” to initiate data transfers over the bus. Each component on the bus circuit operates according to a bus protocol that defines the purpose of each bus signal and regulates such parameters as bus speed and arbitration between components requesting bus mastership. A bus protocol also determines the proper sequence of bus signals for transferring data over the bus. As computer systems have continued to evolve, new bus circuits offering heightened functionality have replaced older bus circuits, allowing existing components to transfer data more effectively.
One way in which the system bus has been made more effective is to permit data to be exchanged in a computer system without the assistance of the CPU. To implement this design, a new bus architecture called Extended Industry Standard Architecture (EISA) was developed. The EISA bus protocol permits system components residing on the EISA bus to obtain mastership of the bus and to run cycles on the bus independently of the CPU. Another bus that has become increasingly popular is the Peripheral Component Interconnect (PCI) bus. Like the EISA bus, the PCI bus provides bus master capabilities to devices connected to the PCI bus. The PCI bus also operates at clock speeds of 33 MHz or faster. Current designs contemplate implementing a 100 MHz PCI bus.
To ensure that existing components continue to remain compatible with future generations of computer systems, new computer designs often include many different types of buses. Because different buses operate according to different protocols, the computer design uses bridge devices to interface, or bridge, the different buses. Such a scheme permits components coupled to one bus to exchange data with components coupled to another bus.
System Functionality Testing. A typical computer system includes a large number of functional components that are designed and tested separately to verify their functionality. After the functional components are combined, however, the system as a whole must be tested to verify its functionality. Because of the level of complexity of the individual components and the system as a whole, this system level test often reveals malfunctions not identified in the component-level tests.
In the computer industry, simply knowing of the existence of a system-level malfunction is rarely enough. The malfunction must also be corrected. This presents a challenge, because many system-level malfunctions are transient and difficult to reproduce. Without such reproducibility, the causes of malfunctions are difficult to locate precisely.
One method for precisely locating the cause of most malfunctions is to generate a history of the operations performed by the system. Then, when a malfunction occurs, one can identify the state of the system when the malfunction was recognized and “trace backwards”, using the history to identify the source of the malfunction.
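The "trace backwards" idea can be illustrated with a small software model. This is a minimal sketch, not the patented apparatus: the `Operation` record, the `History` class, and the choice of a bounded history that keeps only the most recent operations are all assumptions made for illustration.

```python
from collections import deque
from typing import NamedTuple, Optional


class Operation(NamedTuple):
    """One recorded system operation (hypothetical record layout)."""
    cycle: int    # when the operation happened
    kind: str     # "read" or "write"
    address: int  # memory address touched
    value: int    # data read or written


class History:
    """Bounded history: once full, the oldest operations are discarded."""

    def __init__(self, capacity: int) -> None:
        self._ops: deque = deque(maxlen=capacity)

    def record(self, op: Operation) -> None:
        self._ops.append(op)

    def last_write_to(self, address: int) -> Optional[Operation]:
        """Walk the history backwards to find the most recent write to `address`."""
        for op in reversed(self._ops):
            if op.kind == "write" and op.address == address:
                return op
        return None


history = History(capacity=4)
history.record(Operation(1, "write", 0x10, 7))
history.record(Operation(2, "write", 0x20, 1))
history.record(Operation(3, "write", 0x10, 99))  # the faulty write
history.record(Operation(4, "read", 0x10, 99))   # malfunction recognized here

# Starting from the point of recognition, trace backwards to the source.
culprit = history.last_write_to(0x10)
print(culprit)
```

Starting from the read at cycle 4, the backward scan lands on the write at cycle 3, which is the operation that placed the bad value in memory.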
There are some obstacles to this approach which may not be immediately apparent. Many processors today can perform nearly 10^9 operations per second, and existing computer buses can operate at 100 MHz to transfer eight bytes of data per clock cycle. It may require hours or even days for a transient malfunction to manifest itself. This is a mind-boggling amount of history to record. Further, the operations that occur internal to the processor are not normally available for recording.
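The scale of the problem can be worked out from the bus figures given above. The following is only back-of-the-envelope arithmetic, assuming the bus transfers data on every clock cycle (a worst case that the text does not claim); the function name is illustrative.

```python
BUS_CLOCK_HZ = 100_000_000  # 100 MHz bus, as cited in the text
BYTES_PER_CYCLE = 8         # eight bytes transferred per clock cycle


def bus_bytes_recorded(seconds: int) -> int:
    """Bytes of bus traffic produced in `seconds`, assuming a saturated bus."""
    return BUS_CLOCK_HZ * BYTES_PER_CYCLE * seconds


per_second = bus_bytes_recorded(1)
per_hour = bus_bytes_recorded(3600)
per_day = bus_bytes_recorded(24 * 3600)

print(f"per second: {per_second / 1e6:.0f} MB")   # 800 MB
print(f"per hour:   {per_hour / 1e12:.2f} TB")    # 2.88 TB
print(f"per day:    {per_day / 1e12:.2f} TB")     # 69.12 TB
```

Even under this simplified assumption, a malfunction that takes a day to appear would require tens of terabytes of bus history alone, before counting any processor-internal operations.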
SUMMARY OF THE INVENTION
The problems described above are at least in part addressed by the apparatus and method for capturing and transferring internal system activity disclosed herein. In one embodiment, the apparatus includes a bus interface, a memory, an external interface, and circuitry coupling the three together. The bus interface connects to an internal system bus of the system under test. The memory stores information indicative of internal system activity. The external interface couples to an external monitoring system. The circuitry partitions the memory into at least two banks, each having multiple buffers. One of the multiple buffers in each bank is a trace buffer that receives instruction trace information from the processor of the system under test. The multiple buffers may further include a system memory image buffer, a processor data buffer, and a bus activity buffer. When any one of the buffers in a given bank of the memory becomes full, a bank switch occurs. Immediately prior to the bank switch, the contents of system memory are copied to the system memory image buffer, and the internal settings of the processor are similarly copied to the processor data buffer. Advantageously, if any errors are detected at this time, the previous memory bank still contains a pre-error snapshot of the processor contents and memory contents. Furthermore, the previous memory bank has an extensive record of pre-error bus activity and trace history. The external interface provides a means for transporting the memory bank contents to an external system continually, or alternatively, whenever the memory bank contents are desired (e.g. when a fault is detected).
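The bank-switching behavior described in this summary can be modeled in software. The sketch below is a simplified illustration under stated assumptions, not the circuitry of the invention: the class names, the buffer names, the per-buffer entry capacity, and the `read_memory` and `read_processor` callbacks are all hypothetical stand-ins for hardware mechanisms the summary does not detail.

```python
from typing import Any, Callable, Dict, List

STREAM_BUFFERS = ("trace", "bus_activity")            # filled continuously
SNAPSHOT_BUFFERS = ("memory_image", "processor_data")  # filled at bank switch


class Bank:
    """One memory bank holding the four buffers named in the summary."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # assumed: max entries per streaming buffer
        self.buffers: Dict[str, List[Any]] = {
            name: [] for name in STREAM_BUFFERS + SNAPSHOT_BUFFERS
        }

    def clear(self) -> None:
        for buf in self.buffers.values():
            buf.clear()

    def is_full(self, name: str) -> bool:
        return len(self.buffers[name]) >= self.capacity


class CaptureMemory:
    """Capture memory partitioned into banks, with a switch when a buffer fills."""

    def __init__(self, capacity: int,
                 read_memory: Callable[[], Any],
                 read_processor: Callable[[], Any],
                 num_banks: int = 2) -> None:
        self.banks = [Bank(capacity) for _ in range(num_banks)]
        self.active = 0
        self._read_memory = read_memory        # hypothetical memory snapshot hook
        self._read_processor = read_processor  # hypothetical processor state hook

    def record(self, name: str, entry: Any) -> None:
        """Append an entry to a streaming buffer of the active bank."""
        if name not in STREAM_BUFFERS:
            raise ValueError(f"{name!r} is not a streaming buffer")
        bank = self.banks[self.active]
        bank.buffers[name].append(entry)
        if bank.is_full(name):
            self._switch_bank()

    def _switch_bank(self) -> None:
        bank = self.banks[self.active]
        # Immediately prior to the switch, snapshot memory and processor state.
        bank.buffers["memory_image"].append(self._read_memory())
        bank.buffers["processor_data"].append(self._read_processor())
        self.active = (self.active + 1) % len(self.banks)
        self.banks[self.active].clear()  # the oldest bank is reused

    def previous_bank(self) -> Bank:
        return self.banks[(self.active - 1) % len(self.banks)]


system_memory = {0x10: 7}
capture = CaptureMemory(capacity=3,
                        read_memory=lambda: dict(system_memory),
                        read_processor=lambda: {"pc": 0x400})
for i in range(3):
    capture.record("trace", f"insn {i}")  # third entry fills the trace buffer

system_memory[0x10] = 99                  # a later (possibly erroneous) change
capture.record("trace", "insn 3")

print(capture.previous_bank().buffers["memory_image"])
```

After the switch, the previous bank still holds the memory image taken before the later change, alongside the trace that led up to it. This mirrors the pre-error snapshot the summary describes.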
Bonura Tim
Iqbal Nadeem
Kivlin B. Noël
Meyertons Hood Kivlin Kowert & Goetzel P.C.