Method and apparatus for providing and maximizing concurrent...

Electrical computers and digital processing systems: processing – Processing architecture – Microprocessor or multichip or multimodule processor having...

Reexamination Certificate

Details

C345S503000, C345S532000

active

06434688

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to computer architecture, and more particularly, to memory-sharing architectures which include graphics capabilities.
2. State of the Art
As the density of solid state memories increases, oversized memories are being wastefully used for purposes which optimally require specialized memory configurations (e.g., a graphics refresh). One reason for this is that manufacturers attempt to produce memory sizes which will achieve a broad range of applicability and a high volume of production. The more popular, and thus more cost-effective, memories tend to be fabricated with square aspect ratios or with tall, thin aspect ratios (i.e., a large number of fixed length words) that are not readily suited to specialized uses.
Although uses which can exploit memories with these popular aspect ratios can be implemented in a relatively cost-effective manner, specialized uses which cannot exploit these aspect ratios can be proportionately more expensive to implement. The expense associated with implementing specialized uses assumes one of two forms: (1) the increased cost associated with purchasing a memory which does not conform to a readily available and widely used memory configuration; or (2) the increased cost associated with purchasing a readily available memory which is much larger than needed to implement a specialized use (e.g., a relatively square memory which must be tall enough to obtain a desired width, even though only a relatively small number of rows in the memory are needed for the purpose at hand).
The foregoing memory capacity problem is typically referred to as the memory granularity problem: expensive memory chips can be purchased and used efficiently, or inexpensive memory chips can be purchased and used inefficiently. This problem is especially significant in computer systems which implement graphics functions, since these systems typically include a dedicated, high speed display memory. Specialized display memories are usually required because refresh for the graphics display (e.g., for a 1280×1024 display) typically consumes virtually all of the available bandwidth of a typical dynamic random access memory (DRAM).
To update a video line on a high resolution graphics display, a graphics refresh optimally requires a memory having a short, wide aspect ratio. Display memories used as frame buffers for high resolution graphics displays have therefore become an increasingly large fraction of a system's overall cost due to the foregoing granularity problem. For display memories, even a two megabyte memory can be unnecessarily large, such that it cannot be effectively used. An exemplary display memory for a current high-end display of 1280×1024 pixels requires just over one megabyte of memory. Thus, almost one-half of a two megabyte display memory remains unused.
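The sizing argument above can be checked with simple arithmetic. The sketch below is illustrative only: the pixel depth of 8 bits per pixel is an assumption (the text does not state one), and the unused fraction it reports changes with that assumption.

```python
# Assumptions (not stated in the source): 8 bits per pixel.
# The two-megabyte part size and the 1280x1024 resolution come from the text.
WIDTH, HEIGHT = 1280, 1024
BITS_PER_PIXEL = 8                 # assumed pixel depth
MEMORY_BYTES = 2 * 1024 * 1024     # two-megabyte display memory


def frame_buffer_bytes(width, height, bits_per_pixel):
    """Return the number of bytes needed to hold one full frame."""
    return width * height * bits_per_pixel // 8


needed = frame_buffer_bytes(WIDTH, HEIGHT, BITS_PER_PIXEL)
unused = MEMORY_BYTES - needed

print(f"frame buffer: {needed} bytes ({needed / 2**20:.2f} MiB)")
print(f"unused: {unused} bytes ({unused / MEMORY_BYTES:.1%} of the part)")
```

Under this assumed depth the frame needs 1.25 MiB, which matches "just over one megabyte"; a substantial share of the two-megabyte part is left idle, and the exact share depends on the pixel depth chosen.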
For example, FIG. 1 illustrates a typical computer system 100 which includes graphics capabilities. The FIG. 1 computer system includes a central processing unit (CPU) 102, a graphics controller 104 and a system controller 106 all connected to a common bus 108 having a data portion 110 and an address portion 112.
The graphics controller 104 is connected to display memory 114 (e.g., random access memory, or RAM) by a memory bus having a memory address bus 116 and a memory data bus 118. A random access memory digital-to-analog converter (RAMDAC) 120 provides signals (e.g., analog RGB color signals) used to drive a graphics display.
The system controller is connected to system memory 122 by a separate memory address bus 124. A memory data bus 126 is connected directly between the common data bus 108 and the system memory. The system memory can also include a separate cache memory 128 connected to the common bus to provide a relatively high-speed portion for the system memory.
The graphics controller 104 mediates access of the CPU 102 to the display memory 114. For system memory transfers not involving direct memory access (DMA), the system controller 106 mediates access of the CPU 102 to system memory 122, and can include a cache controller for mediating CPU access to the cache memory 128.
However, the FIG. 1 configuration suffers significant drawbacks, including the granularity problem discussed above. The display memory 114 is limited to use in connection with the graphics controller and cannot be used for general system needs. Further, because separate memories are used for the main system and for the graphics, a higher pin count renders integration of the FIG. 1 computer system difficult. The use of separate controllers and memories for the main system and the graphics also results in significant duplication of bus interfaces, memory control and so forth, thus leading to increased cost. For example, the maximum memory required to handle worst case requirements for each of the system memory and the graphics memory must be separately satisfied, even though the computer system will likely never run an application that would require the maximum amount of graphics memory and main store memory simultaneously. In addition, transfers between the main memory and the graphics require that either the CPU or a DMA controller intervene, thus blocking use of the system bus.
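The worst-case provisioning point can be made concrete with a toy calculation. The workload figures below are hypothetical and chosen only for illustration; they do not come from the source. The sketch compares sizing two separate memories for their individual worst cases against sizing a single pool for the worst combined demand.

```python
# Hypothetical workloads as (system MB needed, graphics MB needed).
# These numbers are illustrative assumptions, not data from the source.
workloads = [(8, 1), (3, 2), (6, 1)]


def separate_provisioning(workloads):
    """Each memory must cover its own worst case independently."""
    worst_system = max(system for system, _ in workloads)
    worst_graphics = max(graphics for _, graphics in workloads)
    return worst_system + worst_graphics


def shared_provisioning(workloads):
    """A single pool only needs to cover the worst combined demand."""
    return max(system + graphics for system, graphics in workloads)


separate_total = separate_provisioning(workloads)
shared_total = shared_provisioning(workloads)
print(f"separate memories: {separate_total} MB, shared pool: {shared_total} MB")
```

Because no single hypothetical workload peaks in both dimensions at once, the separate design must install more memory than any workload ever uses in total.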
Attempts have been made to alleviate the foregoing drawbacks of the FIG. 1 system by integrating system memory with display memory. However, these attempts have reduced duplication of control features at the expense of system performance. These attempts have not adequately addressed the granularity problem.
Some attempts have been made, particularly in the area of portable and laptop systems, to unify display memory and system memory. For example, one approach to integrated display memory and system memory is illustrated in FIG. 2. However, approaches such as that illustrated in FIG. 2 suffer significant drawbacks. For example, refreshing of the display via the graphics controller requires that cycles be stolen from the main memory, rendering performance unpredictable. Further, these approaches use a time-sliced arbitration mode for allocating specific time slots among the system controller and the graphics controller, such that overall system performance is further degraded.
In other words, overall performance of the FIG. 2 system is limited by the bandwidth of the single memory block, and the high demands of the graphics refresh function alone introduce significant performance degradation. The allocation of memory bandwidth between display access and system access using fixed time slots only adds to performance degradation. Because the time slots must be capable of handling the worst case requirements for each of the system memory and display memory subsystems, the worst possible memory allocation is forced to be the normal case.
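The cost of fixed time-slot allocation can be sketched with a toy model. This is not the arbitration scheme of any system named in the text; the function name, the slot schedule, and the demand figures are all hypothetical. Each slot is reserved for one requester, and a slot whose owner has nothing pending is simply lost.

```python
def fixed_slot_utilization(schedule, demand):
    """Simulate fixed time-slot arbitration over one frame of slots.

    schedule: list naming the owner of each slot, in order.
    demand:   dict mapping owner -> number of pending memory accesses.
    Returns (served, wasted), where served maps owner -> accesses
    completed and wasted counts slots whose owner had nothing pending.
    """
    pending = dict(demand)
    served = {owner: 0 for owner in pending}
    wasted = 0
    for owner in schedule:
        if pending[owner] > 0:
            pending[owner] -= 1
            served[owner] += 1
        else:
            # The slot is reserved for an idle requester and cannot be
            # handed to the other one.
            wasted += 1
    return served, wasted


# Hypothetical frame: four slots each, sized for both worst cases.
schedule = ["display", "system"] * 4
served, wasted = fixed_slot_utilization(schedule, {"display": 4, "system": 1})
print(served, "wasted slots:", wasted)
```

In this run the system side needs only one access, yet three of its reserved slots go unused while the display side gains nothing from them, which is the sense in which the worst case becomes the normal case.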
Examples of computers using time-slice access to an integrated memory are the Commodore and the Amiga. The Apple II computer also used a single memory for system and display purposes. In addition, the recently-released Polar™ chip set of the present assignee, for portable and laptop systems, makes provision for integrated memory.
A different approach is described in a document entitled “64200 (Wingine™) High Performance ‘Windows™ Engine’”, available from Chips and Technologies, Inc. In one respect, Wingine is similar to the conventional computer architecture of FIG. 1 but with the addition of a separate path that enables the system controller to perform write operations to graphics memory. The graphics controller, meanwhile, performs screen refresh only. In another respect, Wingine may be viewed as a variation on previous integrated-memory architectures. Part of system memory is replaced with VRAM, thereby eliminating the bandwidth contention problem using a more expensive memory (VRAM is typically at least twice as expensive as DRAM). In the Wingine implementation, VRAM is not shared but is dedicated for use as graphics memory. Similarly, one version of an Alpha micro
