Apparatus and method for monitoring the performance of a...

Data processing: measuring, calibrating, or testing – Measurement system in a specific environment – Electrical signal parameter measurement system

Reexamination Certificate

Details

C702S180000, C702S186000, C702S187000, C712S227000, C714S039000, C714S030000, C714S047300

active

06233531

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer systems, and more particularly to monitoring the performance of a microprocessor.
2. Description of the Relevant Art
Most computer systems include a microprocessor which functions as a central processing unit (CPU). Modern microprocessors, including the Intel Pentium™ processor, have hardware dedicated for measuring and monitoring various parameters which contribute to the performance of the microprocessor. In the case of the Pentium™ processor, the dedicated hardware includes several model specific registers (MSRs): a 64-bit time stamp counter (TSC) incremented every clock cycle, a control & event select register (CESR), and two 40-bit performance monitor counters (CTRs). The TSC, CESR, and the two CTRs are addressable registers, and their contents may be read or changed by software instructions. Each CTR may be individually programmed, via values stored within the CESR, to count the total number (or duration in clock cycles) of specific “events” occurring within the microprocessor during operation. Such events include memory accesses (e.g., data/code reads and data writes), data/code cache misses, pipeline flushes, and locked bus cycles. The information provided by the dedicated hardware may be used to improve the overall performance of the computer system by “tuning” the memory system or software programs generated by compilers.
Several problems limit the usefulness of the existing performance monitoring hardware. First, there are only two CTRs, so at most two events may be monitored at any given time. The CTRs are programmed by values stored within the CESR, and there is a fixed number of events to choose from. For example, there are 38 documented events from which to choose for the Pentium™ processor. In order to obtain counts for all events which may be monitored, it is necessary to execute a test program 19 times, gathering counts for two events during each execution.
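The figure of 19 executions follows from dividing the 38 documented events among the two available counters. The following is a minimal sketch of that arithmetic, not part of the patent; the function names `runs_needed` and `schedule` are hypothetical and chosen only for illustration.

```python
def runs_needed(num_events: int, num_counters: int) -> int:
    """Executions of a test program needed to count every event once,
    when at most num_counters events can be counted per execution."""
    if num_events < 0 or num_counters < 1:
        raise ValueError("need num_events >= 0 and num_counters >= 1")
    # Ceiling division using integer arithmetic only.
    return (num_events + num_counters - 1) // num_counters


def schedule(events, num_counters):
    """Split the event list into one group per execution of the test program."""
    return [events[i:i + num_counters]
            for i in range(0, len(events), num_counters)]


# 38 documented events, 2 counters (the figures given in the text above).
print(runs_needed(38, 2))                    # 19
print(len(schedule(list(range(38)), 2)))     # 19
```

Under the same arithmetic, hardware able to monitor n events at once would need only the ceiling of 38/n executions.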
Second, and most importantly, there is no way to correlate the occurrence of an event with the time at which the event occurred. In cases where several factors affect a given aspect of system performance, the total number of events may indicate the presence or absence of a problem, but may not be particularly useful in determining the best solution to a problem. In some cases, a graph of the frequency distribution of an event is much more useful than the total number of events which occurred during execution of a test program.
A histogram is a bar graph of a frequency distribution in which the height of each bar represents the total number of events occurring within a corresponding time interval. Forming a histogram involves dividing a time period of interest into time intervals of equal length, and counting the total number of events occurring within each time interval. As a practical matter, summing numbers of events occurring within time intervals reduces the data storage requirements of a data acquisition system performing the counting operation while still providing useful event frequency information.
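The counting step described above can be sketched in software as follows. This is an illustrative model only, not the hardware of the invention; it assumes event times are given as integer clock-cycle counts, and the function name `event_histogram` and its parameters are hypothetical.

```python
def event_histogram(timestamps, period_length, interval_length):
    """Count events falling in each equal-length time interval.

    timestamps      -- event times in clock cycles (integers, assumed)
    period_length   -- length of the time period of interest, in cycles
    interval_length -- length of each histogram time interval, in cycles
    """
    if period_length <= 0 or interval_length <= 0:
        raise ValueError("lengths must be positive")
    # One counter per interval; the last interval may be partial.
    num_intervals = (period_length + interval_length - 1) // interval_length
    counts = [0] * num_intervals
    for t in timestamps:
        if 0 <= t < period_length:          # ignore events outside the period
            counts[t // interval_length] += 1
    return counts


counts = event_histogram([3, 7, 12, 15, 18, 41],
                         period_length=50, interval_length=10)
print(counts)  # [2, 3, 0, 0, 1]
```

Only one counter per interval is stored, so the storage needed depends on the number of intervals rather than on the number of events recorded.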
A good example illustrating the utility of a graph of the frequency distribution of an event is cache misses occurring during execution of a test program. FIGS. 1 and 2 will now be used to illustrate how such a graph may suggest which of several factors is the most likely cause of a problem. As described above, a desired data acquisition time is divided into time intervals (i.e., histogram time periods) of length t, and the total number of cache misses occurring within each histogram time period t is counted and graphed.
FIG. 1 is a histogram showing the frequency of cache misses occurring within a first memory system during execution of the test program. In the first memory system, the frequency of cache misses follows a trend. The frequency of cache misses is initially high as the empty cache is filled, decreases relatively quickly at an initial rate 10, then continues to decrease as more needed instructions are located within the cache. Eventually a lowest number of cache misses “M1” is achieved by the first memory system. Sudden increases or “spikes” (e.g., spike 12) in the frequency of cache misses occur when new sections of program code are loaded into memory and executed.
FIG. 2 is a histogram showing the frequency of cache misses occurring within a second memory system during execution of the same test program. As in the first memory system, the frequency of cache misses within the second memory system is initially high as the empty cache is filled, and decreases with time as more needed instructions are found within the cache. The initial rate of the decrease 14 is not as great as that of the first memory system, however, and the lowest number of cache misses M2 achieved by the second memory system is substantially greater than M1. Spike 16 corresponds to spike 12, and occurs as the same section of program code is loaded into memory and executed. Spike 16 occurs later in time than spike 12 as the second memory system is less efficient than the first.
Key factors which affect the frequency of cache misses within a memory system include cache size and the technique used to select information stored within the cache for replacement by “newer” data (i.e., the cache replacement algorithm). FIG. 1 indicates the cache replacement algorithm of the first memory system is adequate. The best way to reduce the frequency of cache misses and thereby improve the performance of the first memory system is to increase the size of the cache. On the other hand, FIG. 2 indicates the cache replacement algorithm of the second memory system is probably not working well. Increasing the size of the cache would not be the best way to improve the performance of the second memory system; improving the cache replacement algorithm would probably be more effective.
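How cache size and replacement policy shape a histogram of cache misses can be explored with a toy software model. The sketch below is not taken from the patent and does not reproduce FIGS. 1 and 2; it assumes a fully associative cache with least-recently-used (LRU) replacement, measures time in memory accesses rather than clock cycles, and uses a hypothetical function name `miss_histogram`.

```python
from collections import OrderedDict


def miss_histogram(trace, cache_size, interval):
    """Cache misses per time interval for a fully associative LRU cache.

    trace      -- sequence of accessed addresses, one per time step (assumed)
    cache_size -- number of entries the cache can hold
    interval   -- histogram time period, in accesses
    """
    if cache_size < 1 or interval < 1:
        raise ValueError("cache_size and interval must be >= 1")
    cache = OrderedDict()                       # oldest entry first
    counts = [0] * ((len(trace) + interval - 1) // interval)
    for i, addr in enumerate(trace):
        if addr in cache:
            cache.move_to_end(addr)             # hit: mark most recently used
        else:
            counts[i // interval] += 1          # miss: count it in its interval
            if len(cache) >= cache_size:
                cache.popitem(last=False)       # evict least recently used
            cache[addr] = True
    return counts


# A loop over 8 addresses, repeated 4 times (hypothetical access pattern).
trace = [i % 8 for i in range(32)]
print(miss_histogram(trace, cache_size=8, interval=8))  # [8, 0, 0, 0]
print(miss_histogram(trace, cache_size=4, interval=8))  # [8, 8, 8, 8]
```

In this model the larger cache shows misses only while the empty cache is filled, while the smaller cache misses on every access because LRU evicts each address just before it is needed again. Whether a larger cache or a different replacement policy is the better remedy is visible from the shape of the histogram, not from the miss total alone.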
It would be beneficial to have a microprocessor which includes performance monitoring hardware allowing more than two events to be monitored at any given time and correlating the occurrence of an event with the time at which the event occurred. Such a microprocessor would reduce the number of times a test program must be executed in order to gather performance monitoring information. Such a microprocessor would also allow graphs of numbers of events versus time to be created, greatly enhancing the ability to increase the overall performance of the computer system by “tuning” the memory system or instruction sequences generated by compilers.
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by an apparatus and method for monitoring the performance of a microprocessor. The apparatus includes performance monitoring hardware incorporated within the microprocessor. The performance monitoring hardware includes a memory unit for storing performance data relating to operations performed by the microprocessor. The memory unit includes multiple memory locations, each memory location being accessed by a unique set of address signals. The performance monitoring hardware further includes circuitry coupled to the memory unit for producing address signals. The apparatus and method center around gathering performance data in order to generate event histograms.
In one embodiment, the performance monitoring hardware further includes an event select register array, a control register, a bus monitor unit, circuitry coupled to the memory unit for producing a set of high order (i.e., most significant) address signals, and a control unit. The event select register array includes n event select registers, where n≧1, and preferably n≧2. Each event select register may contain a binary code corresponding to a selected event. The event select register array allows the performance monitoring hardware to monitor up to n selected events within the microprocessor.
The control register enables and disables a performance data acquisition mode of the performance monitoring hardware. The control register also includes
