Memory processing system and method for accessing memory...

Electrical computers and digital processing systems: memory – Storage accessing and control – Control technique

Reexamination Certificate

Details

C711S157000, C711S158000, C710S039000, C345S535000

active

06564304

ABSTRACT:

BACKGROUND
1. Field of the Invention
This invention relates to the field of memory control.
Portions of the disclosure of this patent document contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office file or records, but otherwise reserves all copyright rights whatsoever.
2. Background
Computer systems often require the storage and access of large amounts of data. One efficient solution for storing large amounts of data is to use a dynamic random access memory (DRAM) system. Some DRAM systems have multiple memory requesters seeking to access the memory, which can cause contention problems and degrade system performance. This is particularly true in graphics processing systems. The problems of such systems can be better understood by reviewing existing graphics computer and memory systems.
Computer systems are often used to generate and display graphics on a display. Display images are made up of thousands of tiny dots, where each dot is one of thousands or millions of colors. These dots are known as picture elements, or “pixels”. Each pixel has a color, with the color of each pixel being represented by a number value stored in the computer system.
A three dimensional (3D) display image, although displayed using a two dimensional (2D) array of pixels, may in fact be created by rendering of a plurality of graphical objects. Examples of graphical objects include points, lines, polygons, and three dimensional solid objects. Points, lines, and polygons represent rendering “primitives” which are the basis for most rendering instructions. More complex structures, such as three dimensional objects, are formed from a combination or mesh of such primitives. To display a particular scene, the visible primitives associated with the scene are drawn individually by determining those pixels that fall within the edges of the primitive, and obtaining the attributes of the primitive that correspond to each of those pixels. The obtained attributes are used to determine the displayed color values of applicable pixels.
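As an illustration of determining which pixels fall within the edges of a primitive, the following minimal sketch tests grid points against the three edges of a triangle. It uses the common edge-function (signed area) test, which is a stand-in chosen for illustration and is not described in this document. The names `edge` and `covered_pixels`, and the use of integer grid points as pixel positions, are assumptions.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: tells which side of edge a->b the point p lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covered_pixels(v0, v1, v2, width, height):
    """Return the (x, y) grid points inside or on the triangle v0, v1, v2."""
    inside = []
    for y in range(height):
        for x in range(width):
            e0 = edge(*v0, *v1, x, y)
            e1 = edge(*v1, *v2, x, y)
            e2 = edge(*v2, *v0, x, y)
            # Accept either vertex winding: all three tests must agree in sign.
            all_non_negative = e0 >= 0 and e1 >= 0 and e2 >= 0
            all_non_positive = e0 <= 0 and e1 <= 0 and e2 <= 0
            if all_non_negative or all_non_positive:
                inside.append((x, y))
    return inside


# Hypothetical triangle on a 5x5 pixel grid.
pixels = covered_pixels((0, 0), (4, 0), (0, 4), width=5, height=5)
```

Once the covered pixels are known, the attributes of the primitive (color, opacity, depth, and so on) would be evaluated at each of them.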
Sometimes, a three dimensional display image is formed from overlapping primitives or surfaces. A blending function based on an opacity value associated with each pixel of each primitive is used to blend the colors of overlapping surfaces or layers when the top surface is not completely opaque. The final displayed color of an individual pixel may thus be a blend of colors from multiple surfaces or layers.
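The blending described above can be sketched as a per-channel weighted sum. This is a minimal example assuming a single opacity value in the range 0 to 1 for the top surface and an opaque bottom surface; the function name `blend_over` and the floating point color representation are assumptions, not something this document specifies.

```python
def blend_over(top_rgb, top_opacity, bottom_rgb):
    """Blend a partially opaque top color over a bottom color, per channel."""
    if not 0.0 <= top_opacity <= 1.0:
        raise ValueError("opacity must be in [0, 1]")
    return tuple(
        top_opacity * t + (1.0 - top_opacity) * b
        for t, b in zip(top_rgb, bottom_rgb)
    )


# A half-transparent red surface over an opaque blue surface.
blended = blend_over((1.0, 0.0, 0.0), 0.5, (0.0, 0.0, 1.0))
```

With more than two layers, the same function could be applied repeatedly from the bottom layer upward.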
In some cases, graphical data is rendered by executing instructions from an application that is drawing data to a display. During image rendering, three dimensional data is processed into a two dimensional image suitable for display. The three dimensional image data represents attributes such as color, opacity, texture, depth, and perspective information. The draw commands from a program drawing to the display may include, for example, X and Y coordinates for the vertices of the primitive, as well as some attribute parameters for the primitive, and a drawing command. The execution of drawing commands to generate a display image is known as graphics processing.
A graphics processing system accesses graphics data from a memory system such as a DRAM. Often a graphics processing computer system includes multiple processing units sharing one memory system. These processing units may include, for example, a central processing unit (CPU) accessing instructions and data, an input/output (I/O) system, a 2D graphics processor, a 3D graphics processor, a display processor, and others. The 3D processor itself may include multiple sub-processors such as a processor to fetch 3D graphical drawing commands, a processor to fetch texture image data, a processor to fetch and write depth (Z) data, and a processor to fetch and write color data. This means that multiple memory accesses are being sent to the memory simultaneously. This multiple access can cause contention problems.
The goal of a memory system is to get the highest memory capacity and bandwidth at the lowest cost. However, the performance of a shared DRAM system can be severely degraded by competing memory request streams because of a number of factors, including page and bank switches, read and write context switches, and latency requirements.
Memory Words and Pages
The data stored in DRAM is organized as one- or two-dimensional tiles of image data referred to as memory “words”. A memory word is a logical container of data in a memory. For example, each memory word may contain eight to sixteen pixels of data (e.g., sixteen to thirty-two bytes).
The DRAM memory words are further organized into memory “pages” containing, for example, one to two kilobytes (Kbytes) of data. The pages are logical groups of memory words. A DRAM therefore consists of multiple memory pages, with each page consisting of multiple memory words. The memory words and pages are considered to have word and page “boundaries”. To read data from one memory word and then begin reading data from another memory word is to “cross the word boundary”. Similarly, reading data from one page and then reading data from another page is considered to be crossing a page boundary.
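The word and page boundaries can be made concrete with a small sketch. It assumes a simple linear byte address, 32-byte words, and 2-Kbyte pages; both sizes fall within the example ranges given above, but the linear mapping and the names `locate` and `crossing` are assumptions for illustration, since real layouts (including two-dimensional tiles) may differ.

```python
WORD_BYTES = 32    # assumed; the text gives sixteen to thirty-two bytes
PAGE_BYTES = 2048  # assumed; the text gives one to two kilobytes


def locate(byte_address):
    """Split a linear byte address into (page, word within page, byte within word)."""
    page = byte_address // PAGE_BYTES
    word = (byte_address % PAGE_BYTES) // WORD_BYTES
    offset = byte_address % WORD_BYTES
    return page, word, offset


def crossing(addr_a, addr_b):
    """Report which boundary, if any, lies between two byte addresses."""
    page_a, word_a, _ = locate(addr_a)
    page_b, word_b, _ = locate(addr_b)
    if page_a != page_b:
        return "page"
    if word_a != word_b:
        return "word"
    return "none"


same_word = crossing(0, 31)          # both bytes in word 0 of page 0
next_word = crossing(31, 32)         # word 0 to word 1, same page
next_page = crossing(2047, 2048)     # last byte of page 0 to first byte of page 1
```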
In DRAM memory, it is faster to retrieve data from a single memory word than to cross a word boundary. Similarly, it is faster to retrieve data from a single page than to cross a page boundary. This is because peak efficiency is achieved when transferring multiple data values, especially data values that are in adjacent memory locations. For example, for a burst transfer of data in adjacent memory locations, a DRAM may support a transfer rate of eight bytes per clock cycle. The same DRAM device may have a transfer rate of only one byte per nine clock cycles for arbitrary single byte transfers (e.g., those that cross boundaries). Thus, separate accesses to single bytes of data are less efficient than a single access of multiple consecutive bytes of data. Therefore, data in a DRAM memory is typically accessed (written to or read from) as a complete memory word.
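The two example rates above imply a large gap for the same amount of data. The following worked arithmetic uses only those two figures; the 32-byte transfer size is an assumption chosen to match the upper end of the memory word size given earlier, and the helper name `clocks_needed` is hypothetical.

```python
from fractions import Fraction

BURST_BYTES_PER_CLOCK = Fraction(8, 1)   # eight bytes per clock cycle
SINGLE_BYTES_PER_CLOCK = Fraction(1, 9)  # one byte per nine clock cycles


def clocks_needed(num_bytes, bytes_per_clock):
    """Clock cycles required to move num_bytes at the given rate."""
    return Fraction(num_bytes) / bytes_per_clock


# Moving one assumed 32-byte memory word each way.
burst_clocks = clocks_needed(32, BURST_BYTES_PER_CLOCK)
single_clocks = clocks_needed(32, SINGLE_BYTES_PER_CLOCK)
slowdown = single_clocks / burst_clocks
```

Under these figures the burst takes 4 clocks and the byte-by-byte transfer takes 288 clocks, a factor of 72.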
The performance cost to access a new memory word from DRAM is much greater than for accessing a data value within the same memory word. Similarly, the cost of accessing a data value from a new memory bank is much greater than from within the same page in the memory bank. Typically, a word in the same page of the same bank can be accessed in the next clock cycle, while accessing a new page can take around 10 extra clock cycles. Furthermore, a new page in a new bank can be accessed in parallel with an access to another bank, so that the 10 extra clock cycles to access a word in a new page in a new bank can be hidden during the access of other words in other pages in other banks.
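A rough cost model shows how the access order affects the total clock count. This is a sketch under simplifying assumptions: one clock per word, exactly 10 extra clocks whenever a bank must open a new page, and, when `overlap_banks` is set, an optimistic rule that the extra clocks are fully hidden if the previous access went to a different bank. The function name `access_clocks` and the hiding rule are assumptions, not a description of any particular DRAM controller.

```python
PAGE_MISS_EXTRA_CLOCKS = 10  # "around 10 extra clock cycles" in the text


def access_clocks(accesses, overlap_banks=False):
    """Estimate clocks for a sequence of (bank, page) word accesses."""
    open_page = {}        # bank -> page currently open in that bank
    previous_bank = None
    clocks = 0
    for bank, page in accesses:
        clocks += 1  # one clock to transfer the word itself
        if open_page.get(bank) != page:
            # Optimistic assumption: a page open in a different bank than the
            # previous access overlaps with that access and costs nothing extra.
            hidden = (
                overlap_banks
                and previous_bank is not None
                and bank != previous_bank
            )
            if not hidden:
                clocks += PAGE_MISS_EXTRA_CLOCKS
            open_page[bank] = page
        previous_bank = bank
    return clocks


same_page = access_clocks([(0, 5)] * 4)
thrashing = access_clocks([(0, 5), (0, 6), (0, 5), (0, 6)])
two_banks = access_clocks([(0, 5), (1, 6), (0, 5), (1, 6)], overlap_banks=True)
```

In this model, four words from one page cost 14 clocks, alternating between two pages of the same bank costs 44 clocks, and spreading the same two pages across two banks brings the cost back to 14 clocks.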
Read/Write Switches
Access penalties also occur when switching from reads to writes. It is more efficient to perform a number of reads without switching to a write operation, and vice versa. The cost in cycles to switch from a write operation to a read operation is significant. DRAM data pins are typically bidirectional, that is, they carry both read and write data, and DRAM access is pipelined, that is, reads occur over several clocks. Switching the DRAM from read access to write access therefore incurs several idle clocks to switch the data pin direction and the access pipeline direction.
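The benefit of grouping reads and writes can be sketched by counting direction switches in a request sequence. The penalty of 4 idle clocks per switch is a hypothetical value, since the text says only "several idle clocks", and the names `count_switches` and `idle_clocks` are assumptions.

```python
TURNAROUND_IDLE_CLOCKS = 4  # hypothetical; the text says only "several idle clocks"


def count_switches(ops):
    """Count read/write direction changes in a sequence of 'R' and 'W' operations."""
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev != cur)


def idle_clocks(ops):
    """Idle clocks spent turning the data bus around for the given sequence."""
    return count_switches(ops) * TURNAROUND_IDLE_CLOCKS


interleaved = "RWRWRWRW"  # four reads and four writes, alternating
grouped = "RRRRWWWW"      # the same operations, grouped by direction

interleaved_idle = idle_clocks(interleaved)
grouped_idle = idle_clocks(grouped)
```

With the assumed penalty, the alternating sequence wastes 28 idle clocks on seven switches, while the grouped sequence wastes 4 idle clocks on a single switch.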
Latency Requirements
Certain memory requesting processors have specific bandwidth and latency requirements. For example, CPU accesses and requests have low latency requirements and must be satisfied quickly for overall system performance. This is because the CPU typically reads memory on a cache miss, and typically suspends instruction execution when the instructions or data not in the cache are not available within a few clock cycles, and can only support a small number of outstanding memory read requests. Consequently, CPU performance is latency intolerant because CPU execution stops soon after an outstanding memory reque
