Rendering pipeline

Computer graphics processing and selective visual display system – Computer graphics processing – Three-dimension

Reexamination Certificate

: 1997-11-25
: 2004-02-24
: Jankus, Almis R. (Department: 2671)
: Computer graphics processing and selective visual display system
: Computer graphics processing
: Three-dimension

: C345S422000
: Reexamination Certificate
: active
: 06697063
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Technical Field
The invention relates to the rendering of graphics in a computer environment. More particularly, the invention relates to a rendering pipeline system that renders graphical primitives displayed in a computer environment.
2. Description of the Prior Art
Graphical representations and user interfaces are no longer an optional feature but rather a requirement for computer applications. There is a pressing need to produce high performance, high quality, and low cost 3D graphics rendering pipelines because of this demand.
Some geometry processing units (e.g. general-purpose host processors or specialized dedicated geometry engines) process geometries in model space into geometries in screen space. Screen space geometries are a collection of geometric primitives represented by screen space vertices and their connectivity information. A screen space vertex typically contains screen x, y, z coordinates, multiple sets of colors, and multiple sets of texture attributes (including the homogeneous components), and possibly vertex normals. Referring to FIG. 1, the connectivity information is conveyed using basic primitives such as points, lines, triangles 101, or strip 102, or fan 103 forms of these basic primitives.
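As a minimal sketch of how strip and fan connectivity can be expanded into independent triangles, the following Python example works on vertex indices only. The function names and the winding-order convention (swapping two vertices on every odd strip triangle) are assumptions made for illustration and are not taken from the patent.

```python
def strip_to_triangles(indices):
    """Expand a triangle strip into independent triangles (index triples)."""
    triangles = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i], indices[i + 1], indices[i + 2]
        # Assumed convention: swap two vertices on odd triangles so that
        # every triangle keeps the same winding order.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


def fan_to_triangles(indices):
    """Expand a triangle fan: every triangle shares the first vertex."""
    hub = indices[0]
    return [(hub, indices[i], indices[i + 1]) for i in range(1, len(indices) - 1)]


strip = strip_to_triangles([0, 1, 2, 3, 4])
fan = fan_to_triangles([0, 1, 2, 3, 4])
print(strip)  # [(0, 1, 2), (2, 1, 3), (2, 3, 4)]
print(fan)    # [(0, 1, 2), (0, 2, 3), (0, 3, 4)]
```

Five vertices yield three triangles in either form, which is why strips and fans are a compact way to convey connectivity.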
In a traditional architecture, raster or rasterization refers to the following process:
Given screen x and y positions as well as all other parameter values for all vertices of a primitive, perform parameter setup computation in the form of plane equations; scan convert the primitive into fragments based on screen x and y positions; compute parameter values at these fragment locations. Referring to FIG. 2, a traditional rendering pipeline is shown. Screen geometries 201 are rasterized 202. The shading process 203 is then performed on the graphics primitives. The z/alpha blending process 204 places the final output into the color/z frame buffer 205, which is destined for the video output 206. There is a serious concern with the memory bandwidth between the z/alpha-blending/pixel-op process 204 and the frame buffer in the memory 205. Consider z-buffering 100 Mpixels/s, assuming 4 bytes/pixel for RGBA color, 2 bytes/pixel for z, and 50% of the pixels actually being written into the frame buffer on average due to z-buffering. The memory bandwidth is computed as follows:
100 Mpixels/s*(2 bytes+50%*(4 bytes+2 bytes))/pixel=500 Mbytes/s
The equation assumes a hypothetical perfect prefetch of pixels from frame buffer memory into a local pixel cache without either page miss penalty or wasteful pixels.
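The estimate above can be reproduced with a few lines of Python. This is only a sketch of the stated arithmetic: the function and parameter names are illustrative, and the same idealized assumptions apply (perfect prefetch, no page-miss penalty, no wasted pixels).

```python
def zbuffer_bandwidth(pixel_rate, color_bytes=4, z_bytes=2, write_fraction=0.5):
    """Idealized frame buffer traffic in bytes/s for z-buffered rendering."""
    # Every pixel reads z; the fraction that passes the depth test
    # additionally writes color and z.
    bytes_per_pixel = z_bytes + write_fraction * (color_bytes + z_bytes)
    return pixel_rate * bytes_per_pixel


baseline = zbuffer_bandwidth(100e6)
print(baseline / 1e6)  # 500.0 (Mbytes/s)
```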
The actual memory bandwidth is substantially higher because the read-modify-write cycle required for z-buffering cannot be implemented efficiently without a complicated pipeline and long delay. Alpha blending increases the bandwidth requirement even further. The number increases dramatically if full-scene anti-aliasing is performed. For example, 4-subsample multi-sampling requires the frame buffer memory access bandwidth of the z/alpha-blending/pixel-op engine 204 to roughly quadruple, i.e. at least 2 Gbytes/s of memory bandwidth is required to do 4-subsample multi-sampling at 100 Mpixels/s. Full-scene anti-aliasing is extremely desirable for improving rendering quality; however, it is impractical to implement under a traditional rendering pipeline architecture unless either massive memory bandwidth is applied (e.g. through interleaving multiple processors/memories), which rapidly increases hardware cost, or pixel fill performance is compromised. Full-scene anti-aliasing also requires the frame buffer size to increase significantly, e.g. to quadruple in the case of 4-subsample multi-sampling.
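To make the scaling concrete, the sketch below multiplies the 500 Mbytes/s baseline and a frame buffer footprint by the number of subsamples. The linear scaling follows the "roughly quadruple" statement in the text; the 1280×1024 resolution and the 6 bytes per sample (4 for color, 2 for z) are assumptions chosen for illustration, and the function name is hypothetical.

```python
def multisample_cost(samples, base_bandwidth=500e6,
                     width=1280, height=1024, bytes_per_sample=6):
    """Approximate bandwidth (bytes/s) and frame buffer size (bytes)."""
    # Assumption: both quantities grow linearly with the subsample count.
    bandwidth = samples * base_bandwidth
    framebuffer_bytes = samples * width * height * bytes_per_sample
    return bandwidth, framebuffer_bytes


bw, fb = multisample_cost(4)
print(bw / 1e9)    # 2.0 (Gbytes/s)
print(fb / 2**20)  # 30.0 (MiB), versus 7.5 MiB without multi-sampling
```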
Another drawback of the traditional rendering pipeline is that all primitives, regardless of whether they are visible, are completely rasterized and their corresponding fragments are shaded. Considering a pixel fill rate of 400 Mpixels/s for non-anti-aliased geometries and assuming a screen resolution of 1280×1024 with a 30 Hz frame rate, the average depth complexity is about 10. Even with anti-aliasing, the average depth complexity is still between 6 and 7 for an average triangle size of 50 pixels. The traditional pipeline therefore wastes a large amount of time rasterizing and shading geometries that do not contribute to final pixel colors.
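The depth-complexity figure for the non-anti-aliased case follows from dividing the fill rate by the number of pixels displayed per second. A short Python check, with an illustrative function name:

```python
def average_depth_complexity(fill_rate, width, height, frame_rate):
    """Average number of fragments rendered per displayed pixel."""
    return fill_rate / (width * height * frame_rate)


dc = average_depth_complexity(400e6, 1280, 1024, 30)
print(round(dc, 2))  # 10.17
```

In other words, each visible pixel is on average rasterized and shaded about ten times under these assumptions, although only one of those fragments determines the final color of an opaque pixel.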
There are other approaches which attempt to resolve these problems. With respect to memory bandwidth, two solutions exist. One approach is to use a more specialized memory design by either placing sophisticated logic on Dynamic Random Access Memory (DRAM) (e.g. customized memory chips such as 3DRAM) or placing a large amount of DRAM on logic. While this can alleviate the memory bandwidth problem to a large extent, it is not currently cost-effective because it lacks economies of scale. In addition, the frame buffer size in the memory grows dramatically for full-scene anti-aliasing.
The other alternative is to cache the frame buffer on-chip, which is also called virtual buffering. Only a portion of the frame buffer can be cached at any time because on-chip memory is limited. One type of virtual buffering uses the on-chip memory as a general pixel cache, i.e. a window into the frame buffer memory. Pixel caching can take advantage of spatial coherence; however, the same location of the screen might be cached in and out of the on-chip memory many times during a frame. Therefore, it exploits very little intra-frame temporal coherence (in the form of depth complexity).
The only way to take advantage of intra-frame temporal coherence reliably is through screen space tiling (SST). First, all geometries are binned into tiles (also called screen subdivisions), which are based on screen locations. For example, with respect to FIG. 3, the screen 301 is partitioned into 16 square, disjoint tiles, numbered 1 302, 2 303, 3 304, up to 16 312. Four triangles a 313, b 314, c 315, and d 316 are binned as follows:
tile 5 306: a 313
tile 6 307: a 313, b 314, c 315
tile 7 308: c 315, d 316
tile 9 309: a 313
tile 10 310: a 313, b 314, c 315, d 316
tile 11 311: c 315, d 316
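The binning step can be sketched in Python as follows. This is a simplified illustration, not the patent's method: it bins each triangle by its bounding box, which is conservative (a triangle may be listed in a tile it does not actually touch), it assumes row-major tile numbering starting at 1, and the triangle coordinates are made up rather than taken from FIG. 3.

```python
from collections import defaultdict


def bin_triangles(triangles, screen_w, screen_h, tiles_x, tiles_y):
    """Map tile number -> names of triangles whose bounding box overlaps it."""
    tile_w = screen_w / tiles_x
    tile_h = screen_h / tiles_y
    bins = defaultdict(list)
    for name, verts in triangles.items():
        xs = [x for x, _ in verts]
        ys = [y for _, y in verts]
        # Tile range covered by the bounding box, clamped to the tile grid.
        col0 = max(0, int(min(xs) // tile_w))
        col1 = min(tiles_x - 1, int(max(xs) // tile_w))
        row0 = max(0, int(min(ys) // tile_h))
        row1 = min(tiles_y - 1, int(max(ys) // tile_h))
        for row in range(row0, row1 + 1):
            for col in range(col0, col1 + 1):
                bins[row * tiles_x + col + 1].append(name)
    return dict(bins)


# Hypothetical triangles on a 400x400 screen split into 4x4 tiles.
triangles = {
    "a": [(30, 150), (150, 120), (90, 280)],
    "b": [(120, 180), (180, 190), (150, 260)],
    "c": [(170, 130), (280, 150), (220, 270)],
    "d": [(210, 220), (290, 160), (280, 290)],
}
bins = bin_triangles(triangles, 400, 400, 4, 4)
for tile in sorted(bins):
    print(tile, bins[tile])
```

With these made-up coordinates, tiles 5, 6, 7, 9, 10, and 11 receive entries, similar in spirit to the listing above but not an exact reproduction of it. An exact triangle-versus-tile overlap test would remove the false positives that bounding boxes can introduce.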
Secondly, the screen tiles are swept through, processing a tile's worth of geometry at a time, using an on-chip tile frame buffer, producing the final pixel colors corresponding to the tile, and outputting them to the frame buffer. Here, external frame buffer access bandwidth is limited to the final pixel color output. There is no difference in external memory bandwidth between non-anti-aliased and full-scene anti-aliased rendering, and the memory footprint in the external frame buffer is identical in both cases. Effectively, no external depth-buffer memory bandwidth is consumed, and the depth buffer need not exist in external memory. The disadvantage is that extra screen space binning is introduced, which implies an extra frame of latency.
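The following Python sketch illustrates the tile sweep under strong simplifying assumptions: primitives are axis-aligned rectangles with constant depth and a color label, binning is replaced by clipping each rectangle to the current tile, and the "on-chip" buffers are ordinary lists. All names are hypothetical. The point it shows is that the external frame buffer receives exactly one write per pixel, whatever the depth complexity.

```python
def render_tiled(rects, width, height, tile):
    """rects: (x0, y0, x1, y1, depth, color) with half-open pixel ranges."""
    framebuffer = [[None] * width for _ in range(height)]
    external_writes = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            # Stand-ins for the on-chip tile depth and color buffers.
            tile_z = [[float("inf")] * tile for _ in range(tile)]
            tile_c = [["bg"] * tile for _ in range(tile)]
            for x0, y0, x1, y1, depth, color in rects:
                # Clip the rectangle to the current tile and depth-test it.
                for y in range(max(y0, ty), min(y1, ty + tile)):
                    for x in range(max(x0, tx), min(x1, tx + tile)):
                        if depth < tile_z[y - ty][x - tx]:
                            tile_z[y - ty][x - tx] = depth
                            tile_c[y - ty][x - tx] = color
            # Output only the final colors of this tile to external memory.
            for y in range(ty, min(ty + tile, height)):
                for x in range(tx, min(tx + tile, width)):
                    framebuffer[y][x] = tile_c[y - ty][x - tx]
                    external_writes += 1
    return framebuffer, external_writes


rects = [
    (0, 0, 8, 8, 0.9, "far"),
    (2, 2, 6, 6, 0.5, "mid"),
    (3, 3, 5, 5, 0.1, "near"),
]
fb, writes = render_tiled(rects, 8, 8, 4)
print(writes)                        # 64
print(fb[0][0], fb[2][2], fb[4][4])  # far mid near
```

All depth reads and writes stay in the per-tile buffers, so no depth data ever reaches the external frame buffer in this sketch.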
Two main approaches exist with respect to depth complexity. One requires that geometries be sorted front-to-back and rendered in that order, with no shading of invisible fragments.
The disadvantages of this first approach are: 1) spatial sorting needs to be performed off-line, and thus only works reliably for static scenes, while dynamic content dramatically reduces its effectiveness; 2) front-to-back sorting requires depth priorities to be adjusted per frame by the application programs, which places a significant burden on the host processors; and 3) front-to-back sorting tends to break other forms of coherence, such as texture access coherence or shading coherence. Without front-to-back sorting, one-pass shading-after-z for random applications gives some improvement over the traditional rendering pipeline; however, performance improvement is not assured.
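A small Python sketch shows why submission order matters for one-pass shading-after-z. It models a single pixel and counts how many fragments pass the depth test at the moment they arrive, and are therefore shaded. The function name and the depth values are made up for illustration.

```python
def shaded_fragment_count(depths):
    """Count fragments that pass the depth test in submission order."""
    nearest = float("inf")
    shaded = 0
    for z in depths:
        if z < nearest:  # visible so far, so it gets shaded
            nearest = z
            shaded += 1
    return shaded


fragments = [0.7, 0.3, 0.9, 0.5, 0.1]  # one pixel, depth complexity 5

front_to_back = shaded_fragment_count(sorted(fragments))
back_to_front = shaded_fragment_count(sorted(fragments, reverse=True))
unsorted_order = shaded_fragment_count(fragments)
print(front_to_back, back_to_front, unsorted_order)  # 1 5 3
```

Front-to-back order shades a single fragment, an arbitrary order shades some intermediate number, and back-to-front order shades all five, which is no better than the traditional pipeline.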
The other approach is deferred shading where: 1) primitives are fully rasterized and their fragments are depth-buffered with their surface attributes; and 2) the (partially) visible fragments left in the depth-buffer are shaded using the associated surface attri

Inventor: Zhu Ming Benjamin
Examiner: Jankus Almis R.
Corporate Assignee: NVIDIA U.S. Investment Company
Law Firm: Townsend and Townsend and Crew LLP
