Electrical computers and digital data processing systems: input/ – Access arbitrating – Access prioritizing
Reexamination Certificate
1998-11-02
2001-05-15
Kim, Kenneth S. (Department: 2183)
Electrical computers and digital data processing systems: input/
Access arbitrating
Access prioritizing
C710S116000, C711S213000, C712S207000
Reexamination Certificate
active
06233645
ABSTRACT:
BACKGROUND OF THE INVENTION
Modem microprocessors comprise many resources such as registers and memory for which various requesters contend. For example, a typical microprocessor has various layers of memory including local instruction and data caches, lower level backup caches (B-cache), main memory, storage devices such as disk drives, and storage available over a network. Generally, these layers form a hierarchical set of memories where at one extreme, the local caches are very fast but tend to be small, whereas at the other extreme, main memory, disk storage, and network storage tend to have very large capacities, but access times are also magnitudes larger than for local memory.
To reduce the impact of the long latency between processor and main memory or other long latency storage, hardware prefetching techniques are utilized. The CPU predicts which memory blocks it will utilize next and requests those blocks from the memory hierarchy before they are actually needed.
Because it cannot be predicted with certainty which branch a branch instruction in the instruction stream will take, these prefetched instructions are said to be speculative. Unfortunately, these prefetches can consume bandwidth to the lower level caches, main memory, etc., reducing the amount of available bandwidth for memory transactions that are needed as soon as possible, for example, for demand misses and cache-victim processing.
SUMMARY OF THE INVENTION
The present invention addresses this problem by providing a mechanism which dynamically detects when a high percentage of memory-access bandwidth is needed for high-priority operations such as demand misses, and which, based on this detection, shuts down low priority requesters such as hardware prefetchers until the demand for the bandwidth is reduced.
A preferred embodiment of the present invention realizes this mechanism by counting the number of high-priority utilizations from the last N opportunities in which the high-priority utilizations could have been requested. When the count exceeds some threshold, indicating high usage by the high-priority requester, a portion of low-priority requests are throttled or disabled. Priority may be determined, for example, by measuring the degree of speculative use of the resource. Thus, speculative prefetches have a lower priority than demand accesses.
In other words, the present invention works on the principle that if bandwidth utilization due to high-priority accesses has been high in the recent past, then it will continue to be high due to high-priority accesses in the near future. To that end, the present invention prohibits hardware prefetchers, which have a lower priority, from making requests during high bandwidth utilization periods to increase overall performance.
Accordingly, a method for arbitrating among a plurality of requesters of resources in a computer system comprises selecting a first requester based on a history of program execution. Other requesters which compete with the first requester are inhibited.
Preferably, the history is based on past utilization of the resources. Preferably, memory, including but not limited to main memory and cache memory, and associated data paths, are resources from which a demand fetcher and a speculative prefetcher request instructions or data. The demand fetcher is the first requester, while the speculative prefetcher is a competing requester.
In a preferred embodiment of the present invention, low-priority utilization of a resource in a digital processor, such as speculative prefetching, is limited in favor of high-priority utilization of the resource such as demand fetching, by determining a value which is predictive of high-priority utilization of the resource. If the predictive value is greater than some threshold, low-priority utilization of the resource is inhibited, or throttled. If the predictive value is less than or equal to the threshold, low-priority utilization of the resource is allowed. Preferably, the threshold is a predefined constant.
Furthermore, the predictive value is preferably determined by counting the number of opportunities in which the resource could have been utilized for a high-priority need, regardless of whether it actually was so utilized. The number of actual high-priority utilizations of the resource is also counted, and divided by the opportunity count, yielding the predictive value.
In a preferred embodiment, some actual high-priority utilizations are given more weight than others. Preferably, recent high-priority utilizations are weighted more heavily than less recent high-priority utilizations.
In another preferred embodiment, the number m of actual high-priority uses from the last N use opportunities is first determined, where N is predefined. The value m is compared with some threshold. Speculative accesses are inhibited if m is greater than a threshold, but are allowed otherwise.
In a preferred embodiment, m is determined by providing an N-bit shift register. A 1 is shifted into the N-bit shift register for each high-priority use opportunity in which there has been an actual high-priority use. A 0 is shifted into the shift register for all other use opportunities, such that a count of 1s in the shift register provides a value for m. Optionally, hysteresis can be used about the threshold.
In a further preferred embodiment, a second shift register is provided. When a 0 is shifted in to the N-bit shift register and 1 is shifted out, the second register is shifted left, with a 0 being shifted in from the right. When a 1 is shifted in to the N-bit register and 0 is shifted out, the second register is shifted right, with a 1 being shifted in from the left. When a certain predesignated bit in the second shift register, associated with the threshold has a value of 1, m is deemed to exceed the threshold.
Similarly, where a plurality of options is available, the techniques of the present invention may be employed by associating a priority level with each option. A value is determined based on past behavior of at least one of the options. At least one high-priority option is selected if the determined value exceeds a threshold such that lower priority options are effectively inhibited. On the other hand, any option may be selected if the determined value does not exceed the threshold.
Finally, where multiple threads are executing, the techniques of the present invention may be employed to select a thread based on utilization of bandwidth, or based on a prioritization among the threads.
REFERENCES:
patent: 5421020 (1995-05-01), Levitan
patent: 5421022 (1995-05-01), McKeen et al.
patent: 5454117 (1995-09-01), Puziol et al.
patent: 5506989 (1996-04-01), Boldt et al.
patent: 5561776 (1996-10-01), Popescu et al.
patent: 5613083 (1997-03-01), Glew et al.
patent: 5625837 (1997-04-01), Popescu et al.
patent: 5627983 (1997-05-01), Popescu et al.
patent: 5754800 (1998-05-01), Lentz et al.
patent: 6035392 (2000-03-01), Liptay et al.
patent: 6073132 (2000-06-01), Gehman
patent: 6073159 (2000-06-01), Emer et al.
patent: 0 651 324 A1 (1995-05-01), None
patent: 0 738 962 A2 (1996-10-01), None
patent: 0 763 793 A2 (1997-03-01), None
Lee, D. and Baer, J-L., “Instruction Cache Fetch Policies for Speculative Execution,” Proc. of 22nd Annual International Symposium on Computer Architecture, pp. 357-367 (1995).
Gonzales, J., and Gonzalez, A., “Speculative Execution via Address Prediction and Data Prefetching,” Proc. of International Conference on Supercomputing 1997, ACM, NY, pp. 196-203 (1997).
Appelbe, et al., “Hoisting Branch Conditions—Improving Super-Scalar Processor Performance,” Proc. of Languages and Compilers for Parallel Computing, 8th International Workshop, LCPC, pp. 304-317, (1996).
Chang, P-Y. et al., “Improving Branch Prediction Accuracy by Reducing Pattern History Table Interference,” Proc. 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96) Boston, MA, pp. 48-57, (Oct. 20-23, 1996).
Chrysos George Z.
Snyder II Wilson P.
Compaq Computer Corporation
Hamilton Brook Smith & Reynolds P.C.
Kim Kenneth S.
LandOfFree
Dynamically disabling speculative prefetch when high... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Dynamically disabling speculative prefetch when high..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Dynamically disabling speculative prefetch when high... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2495499