Real mode translation look-aside buffer and method of operation

Electrical computers and digital processing systems: memory – Address formation – Address mapping

Reexamination Certificate


Details

U.S. Class: C711S156000
Type: Reexamination Certificate
Status: active
Patent Number: 06301647

ABSTRACT:

TECHNICAL FIELD OF THE INVENTION
The present invention is directed, in general, to microprocessors and, more specifically, to a cache memory based microprocessor that employs a translation look-aside buffer (TLB) capable of operating in real addressing mode.
BACKGROUND OF THE INVENTION
The ever-growing demand for high-performance computers requires that state-of-the-art microprocessors execute instructions in the minimum amount of time. Over the years, efforts to increase microprocessor speeds have followed different approaches. One approach is to increase the speed of the clock that drives the processor. As the clock rate increases, however, the processor's power consumption and temperature also increase. Increased power consumption raises electrical costs and depletes batteries in portable computers more rapidly, while high circuit temperatures may damage the processor. Furthermore, processor clock speed cannot increase beyond a threshold set by the physical speed at which signals can traverse the processor. Simply stated, there is a practical maximum to the clock speed attainable by conventional processors.
An alternate approach to improving processor speeds is to reduce the number of clock cycles required to perform a given instruction. Under this approach, instructions will execute faster and overall processor “throughput” will thereby increase, even if the clock speed remains the same. One technique for increasing processor throughput is pipelining, which calls for the processor to be divided into separate processing stages (collectively termed a “pipeline”). Instructions are processed in an “assembly line” fashion in the processing stages. Each processing stage is optimized to perform a particular processing function, thereby causing the processor as a whole to become faster.
“Superpipelining” extends the pipelining concept further by allowing the simultaneous processing of multiple instructions in the pipeline. Consider, for example, a processor in which each instruction executes in six stages, each stage requiring a single clock cycle to perform its function. Six separate instructions can be processed simultaneously in the pipeline, with the processing of one instruction completed during each clock cycle. Therefore, the instruction throughput of an N stage pipelined architecture is, in theory, N times greater than the throughput of a non-pipelined architecture capable of completing only one instruction every N clock cycles.
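To make the arithmetic concrete, the following sketch (illustrative only, not part of the patent) computes the cycle counts behind the N-times claim. The first instruction takes N cycles to fill the pipeline; thereafter one instruction completes per cycle:

```c
/* Cycle counts for M instructions on an N-stage machine. */
unsigned pipelined_cycles(unsigned n_stages, unsigned m_instrs) {
    /* N cycles to fill the pipeline, then one completion per cycle. */
    return n_stages + (m_instrs - 1);
}

unsigned nonpipelined_cycles(unsigned n_stages, unsigned m_instrs) {
    /* Each instruction occupies all N stages before the next begins. */
    return n_stages * m_instrs;
}
```

For large M, the ratio of the two counts approaches N, the theoretical speedup stated above.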
Another technique for increasing overall processor speed is “superscalar” processing. Superscalar processing calls for multiple instructions to be processed per clock cycle. Assuming that instructions are independent of one another (i.e., the execution of an instruction does not depend upon the execution of any other instruction), processor throughput is increased in proportion to the number of instructions processed per clock cycle (“degree of scalability”). If, for example, a particular processor architecture is superscalar to degree three (i.e., three instructions are processed during each clock cycle), the instruction throughput of the processor is theoretically tripled.
A cache memory is a small but very fast memory that holds a limited number of instructions and data for use by the processor. In a processor that implements a cache memory, one of the most frequently employed techniques for increasing overall throughput is to minimize both the cache miss rate and the cache access time. The lower the cache access time, the faster the processor can run. Likewise, the lower the cache miss rate, the less often the processor is stalled while requested data is retrieved from main memory, and the higher the processor throughput. There is a wealth of literature describing cache memories, and their general theory of operation is widely understood. This is particularly true of cache memories implemented in x86 microprocessor architectures.
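The trade-off described above is commonly summarized by the textbook average-memory-access-time formula (a standard metric, not taken from the patent):

```c
/* Average memory access time: lowering either the hit time or the
 * miss rate (or the miss penalty) lowers the average, which is why
 * both quantities are minimized in practice. */
double average_access_time(double hit_time,
                           double miss_rate,      /* fraction in [0, 1] */
                           double miss_penalty) { /* main-memory latency */
    return hit_time + miss_rate * miss_penalty;
}
```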
Many techniques have been employed to reduce the access time of cache memories. However, the cache access time is still limited by the rate at which data can be examined in, and retrieved from, the RAM circuits that are internal to a conventional cache memory. This is in part due to the rate at which address translation devices, such as the translation look-aside buffer (TLB), translate linear (or logical) memory addresses into physical memory addresses. If the TLB has a comparatively long access time for retrieving data, then the translation of the logical memory address into a physical address is comparatively slow. The slower this translation is, the slower the cache memory is in its overall operation.
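As a point of reference for the discussion that follows, here is a minimal behavioral sketch of a conventional direct-mapped TLB lookup. The entry count and the 4 KB page size are illustrative assumptions; real x86 TLBs vary in size and associativity:

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64          /* assumed size, for illustration */
#define PAGE_SHIFT  12          /* 4 KB pages */

typedef struct {
    bool     valid;
    uint32_t tag;               /* upper bits of the linear page number */
    uint32_t phys_page;         /* translated physical page number */
} tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];

/* Translate a linear address; returns true on a TLB hit. On a miss
 * the page tables must be walked and the entry refilled (not shown). */
bool tlb_translate(uint32_t linear, uint32_t *physical)
{
    uint32_t vpn   = linear >> PAGE_SHIFT;
    uint32_t index = vpn % TLB_ENTRIES;
    uint32_t tag   = vpn / TLB_ENTRIES;

    if (tlb[index].valid && tlb[index].tag == tag) {
        uint32_t offset = linear & ((1u << PAGE_SHIFT) - 1);
        *physical = (tlb[index].phys_page << PAGE_SHIFT) | offset;
        return true;
    }
    return false;
}
```

Every cycle spent in this lookup sits on the critical path of a physically-addressed cache access, which is why TLB latency bounds overall cache speed.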
A significant portion of the latency of a cache memory and its associated TLB lies in the complex switching and multiplexing networks interconnecting the main cache memory, the TLB, and other parts of the processor. In conventional x86 processors, the cache memory and its TLB receive addresses from a number of address-generating sources within the processor. Some of the addresses are generated when the processor is operating in real mode and do not require translation by the TLB. Other addresses are generated when the processor is operating in paging mode and must be translated in the TLB. Thus, there are frequently multiple paths interconnecting the same address-generating sources with the cache memory and/or the TLB in order to service both real mode and paging mode. This results in complex switching and multiplexing gate arrays that add additional delay to the time required to translate addresses and retrieve data from the cache memory.
Therefore, there is a need in the art for improved cache memories that maximize processor throughput. There is a further need in the art for improved cache memories having a reduced access time. In particular, there is a need for improved cache memories that minimize cache latencies related to switching circuitry used to service both real mode and paging mode.
SUMMARY OF THE INVENTION
The limitations inherent in the prior art described above are overcome by an improved address translation device providing physical addresses and adapted for use in an x86-compatible processor capable of operating in real mode and paging mode and having a physically-addressable cache. In one embodiment, the address translation device comprises: 1) a tag array for storing received untranslated addresses in selected ones of N tag entries in the tag array during real mode operations and paging mode operations; and 2) a data array for storing translated physical addresses corresponding to the untranslated addresses in selected ones of N data entries in the data array, wherein the untranslated addresses stored in the tag array during real mode operations are physical addresses equal to the corresponding translated physical addresses stored in the data array.
In one embodiment of the present invention, the untranslated addresses stored in the tag array during paging mode operations are linear addresses.
In another embodiment of the present invention, the address translation device further comprises a flag array for storing mode flags corresponding to the translated physical addresses in selected ones of N flag entries in the flag array.
In still another embodiment of the present invention, the mode flags indicate whether the corresponding translated physical addresses were stored in the data array during real mode operations.
In yet another embodiment of the present invention, the mode flags indicate whether the corresponding translated physical addresses were stored in the data array during paging mode operations.
In other embodiments of the present invention, the address translation device further comprises a region configuration array for storing region configuration bits corresponding to the translated physical addresses in selected ones of N region configuration entries in the region configuration array.
In further embodiments of the present invention, the address translation device is an L1 translation look-aside buffer providing physical addresses.
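Read together, the embodiments above suggest a TLB that is exercised identically in both modes: real-mode entries simply store the physical address in both the tag array and the data array, and a per-entry mode flag keeps real-mode and paging-mode entries distinct. The sketch below is one behavioral interpretation of that idea, not the patent's circuit; the entry count, direct-mapped organization, and page-number-level interface are assumptions:

```c
#include <stdint.h>
#include <stdbool.h>

#define N_ENTRIES 64            /* the patent's "N" entries; 64 is assumed */

typedef struct {
    bool     valid;
    uint32_t tag;       /* tag array: untranslated page number (linear in
                           paging mode, already-physical in real mode)    */
    uint32_t phys_page; /* data array: translated physical page number    */
    bool     real_mode; /* flag array: set if filled during real mode     */
} rm_tlb_entry;

static rm_tlb_entry rm_tlb[N_ENTRIES];

/* Fill an entry. In real mode the untranslated address is already a
 * physical address, so the same page number lands in both the tag
 * array and the data array, as the first embodiment above states. */
void rm_tlb_fill(uint32_t page, uint32_t phys_page, bool real_mode)
{
    rm_tlb_entry *e = &rm_tlb[page % N_ENTRIES];
    e->valid     = true;
    e->tag       = page / N_ENTRIES;
    e->phys_page = real_mode ? page : phys_page;
    e->real_mode = real_mode;
}

/* Lookup is identical in both modes, so a single address path into the
 * TLB can serve every address-generating source. The mode flag keeps a
 * real-mode entry from satisfying a paging-mode access and vice versa. */
bool rm_tlb_lookup(uint32_t page, bool real_mode, uint32_t *phys_page)
{
    rm_tlb_entry *e = &rm_tlb[page % N_ENTRIES];
    if (e->valid && e->tag == page / N_ENTRIES && e->real_mode == real_mode) {
        *phys_page = e->phys_page;
        return true;
    }
    return false;
}
```

Because lookups take the same path regardless of mode, the separate real-mode bypass wiring and the multiplexers that select between the two paths can be eliminated, which is the switching-related latency reduction the background section calls for.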
