Apparatus and method for monitoring a computer system to...

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C712S227000

Reexamination Certificate

active

06374367

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to monitoring the performance of computer systems, and more particularly to gathering performance statistics by sampling, and analyzing the sampled statistics to guide system optimization.
BACKGROUND OF THE INVENTION
Computer systems are getting more sophisticated and faster, yet software application performance is not keeping pace. For example, in a typical four-way issue processor only about one in twelve of the available issue slots is being put to good use. It is important to understand why the software execution flow cannot take full advantage of the increased computing power available for processing instructions. Similar issues arise in other devices in computer systems, including graphics controllers, memory systems, input/output controllers, and network interfaces: actual performance is often less than the peak potential performance, and it is important to understand why.
For processors, it is common to blame such problems on memory latencies. In fact, many software applications spend many cycles waiting for data transfers to complete. Modern memories are typically arranged in a multi-level hierarchy. There, the data flow is complex and difficult to determine, especially when multiple contexts are concurrently competing for the same memory resource, such as cache blocks. Other problems, such as branch mispredicts and cache misses, also waste processor cycles and consume memory bandwidth for needlessly referenced data.
Input/Output interfaces and network controllers of computer systems are also becoming more sophisticated. In many implementations, the interfaces and controllers include microprocessors and buffer memories whose dynamic behavior is becoming more difficult to measure and understand as complexity increases.
Independent of the general causes, system architects and hardware and software engineers need to know which transactions are stalling, what data are bottlenecked, and why, in order to improve the performance of modern computer systems.
Typically, this is done by generating a “profile” of the behavior of a computer system while it is operating. A profile is a record of performance data. Frequently, the profile is presented graphically or statistically so that performance bottlenecks can readily be identified.
Profiling can be done by instrumentation and simulation. With instrumentation, additional code is added to executing programs to monitor specific events. Instrumentation can be used only for processor pipelines, not for other devices. Simulation attempts to emulate the behavior of the entire system in an artificial environment rather than executing the program in the real system.
Each of these two methods has its drawbacks. Instrumentation perturbs the system's true behavior due to the added instructions and extra data references. In other words, on large-scale and complex systems, instrumentation fails in two respects: the system is slowed down, and the performance data are unreliable or, at best, sketchy.
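The perturbation caused by instrumentation can be illustrated with a minimal sketch. The function names and the `counters` dictionary below are hypothetical and are not taken from the patent; the sketch only shows that the instrumented routine executes extra instructions and extra data references on every loop iteration while computing the same result.

```python
def sum_of_squares(values):
    """Original, uninstrumented routine."""
    total = 0
    for v in values:
        total += v * v
    return total


def instrumented_sum_of_squares(values, counters):
    """Same routine with monitoring code added (hypothetical instrumentation)."""
    counters["calls"] += 1                 # added instruction and data reference
    total = 0
    for v in values:
        counters["loop_iterations"] += 1   # added work on every iteration
        total += v * v
    return total


counters = {"calls": 0, "loop_iterations": 0}
plain_result = sum_of_squares([1, 2, 3])
instrumented_result = instrumented_sum_of_squares([1, 2, 3], counters)
```

Both calls return the same value, but the instrumented version touches the `counters` structure four times for a three-element input, which is the kind of added work that distorts timing and cache behavior on a real system.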
Simulation avoids perturbation and overhead. However, simulation only works for small, well-defined problems that can readily be modeled. It is extremely difficult, if not impossible, to simulate a large-scale system, with thousands of users connected via fiber optic links to network controllers, accessing terabytes of data using dozens of multi-issue processors. Imagine modeling a Web search engine, such as Digital's Alta Vista, that responds to tens of millions of hits each day from all over the world, each hit perhaps offering up hundreds of Web pages as search results.
Hardware implemented event sampling has been used to provide profile information for processors. Hardware sampling has a number of advantages over simulation and instrumentation: it does not require modifying software programs to measure their performance. Sampling works on complete systems, with a relatively low overhead. Indeed, recently it has been shown that low-overhead sampling-based profiling can be used to acquire detailed instruction-level information about pipeline stalls and their causes. However, many hardware sampling techniques lack flexibility because they are designed to measure specific events in isolation.
It is desired to provide a generalized method and apparatus for monitoring the performance of operating computer systems. The method should be able to monitor processors, memory sub-systems, I/O interfaces, graphics controllers, network controllers, or any other component that manipulates digital signals.
The monitoring should be able to sample arbitrary transactions and record relevant information about each. In contrast with event-based systems, arbitrary transaction monitoring should allow one to monitor not only discrete events, but also events in any combination. It should also be possible to relate the sampled events to individual transactions, such as instructions or memory references, or to the contexts in which the transactions arose. In addition, it should be possible to relate the sampled data to multiple concurrent transactions in order to gain a true understanding of the system. All this should be possible without perturbing the operation of the system, other than the time required to read the desired performance data.
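One way to picture monitoring events in any combination is a per-transaction record that carries several event fields at once, so that combinations can be queried after the fact. The following is a minimal sketch under that assumption; the `Sample` record, its field names, and the `matching` helper are hypothetical illustrations, not structures defined by the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical record of one sampled transaction."""
    transaction_id: int
    context: str            # context in which the transaction arose
    cache_miss: bool
    branch_mispredict: bool
    latency_cycles: int


def matching(samples, **conditions):
    """Return the samples whose fields equal every given condition."""
    return [
        s for s in samples
        if all(getattr(s, name) == value for name, value in conditions.items())
    ]


samples = [
    Sample(1, "proc_a", True, False, 40),
    Sample(2, "proc_a", True, True, 95),
    Sample(3, "proc_b", False, True, 12),
]

# Transactions that suffered both events, a combination that
# counters for each event in isolation could not identify.
both_events = matching(samples, cache_miss=True, branch_mispredict=True)
from_proc_a = matching(samples, context="proc_a")
```

Because each record is tied to one transaction and its context, the same data can answer questions about single events, event combinations, or the contexts that produced them.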
SUMMARY OF THE INVENTION
Provided is a method and apparatus for monitoring a computer system including a plurality of functional units, such as processors, memories, I/O interfaces, and network controllers.
Transactions to be processed by a particular functional unit of the computer system are selected for monitoring. The transactions can be selected randomly, or concurrently. State information is stored while the selected transactions are processed by the functional unit. The state information is analyzed to guide optimization.
In one aspect, multiple different functional units can concurrently be sampled.
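The select, store, and analyze steps summarized above can be sketched in software, purely as an illustration. The patent describes an apparatus, so the functions below, the transaction tuples, and the sampling rate are all assumptions made for this sketch rather than the claimed implementation.

```python
import random
from collections import defaultdict


def sample_transactions(transactions, rate, rng):
    """Randomly select transactions and store state for the selected ones.

    Each transaction is a (unit, address, latency_cycles) tuple and is
    selected independently with probability `rate`.
    """
    records = []
    for unit, address, latency in transactions:
        if rng.random() < rate:
            records.append({"unit": unit, "address": address, "latency": latency})
    return records


def mean_latency_by_unit(records):
    """Analyze the stored state: mean latency per functional unit."""
    latencies = defaultdict(list)
    for record in records:
        latencies[record["unit"]].append(record["latency"])
    return {unit: sum(vals) / len(vals) for unit, vals in latencies.items()}


# Hypothetical transactions from two functional units sampled concurrently.
transactions = [
    ("memory", 0x100, 30),
    ("memory", 0x140, 50),
    ("io", 0x200, 200),
]

# A rate of 1.0 selects every transaction, which makes the example deterministic;
# a real profile would use a much lower rate to keep overhead small.
all_records = sample_transactions(transactions, 1.0, random.Random(0))
summary = mean_latency_by_unit(all_records)
no_records = sample_transactions(transactions, 0.0, random.Random(0))
```

With a low sampling rate, the recorded subset stays small while the per-unit statistics still approximate those of the full transaction stream, which is the usual argument for sampling over exhaustive tracing.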


