ATI’s Radeon X1900 Video Card Series
Tue, Jan 24, 2006 - 6:00 AM
Improving On The X1800 Design
The X1800 was ATI’s first SM 3.0 part for the desktop and brought features such as Adaptive Anti-Aliasing, High Dynamic Range (HDR) rendering, and angle-independent Anisotropic Filtering. It was also the first card able to perform Anti-Aliasing and HDR together. The X1900 offers the same features as its predecessor but takes them to a new level: where the X1800 introduced those features, the X1900 delivers more frames per second when they are turned on. Let’s take a look at how ATI was able to accomplish this.
ATI X1900 GPU
ATI improved upon the Radeon X1800 by keeping much of the architecture the same while increasing the number of pixel shader processors from 16 to a total of 48 on the X1900. There are also now 48 discrete flow control units, compared to 16 in the R520, the X1800’s GPU. In terms of pixel shader operations per second, the X1800 has a theoretical limit of 60 billion, while the X1900 soundly trumps it with 166 billion! This brings things such as HDR and HDR+AA to levels of performance not seen before. It also means that enabling AA costs even less performance than before.
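To put those figures side by side, here is a minimal sketch that only divides the numbers quoted above. The variable names are made up for illustration, and nothing here comes from ATI beyond the four values in the text.

```python
# Figures quoted in the paragraph above (theoretical limits, not benchmarks).
x1800_shader_processors = 16
x1900_shader_processors = 48
x1800_shader_ops_billion = 60
x1900_shader_ops_billion = 166

# How many times more shader processors the X1900 has.
processor_ratio = x1900_shader_processors / x1800_shader_processors

# How many times more theoretical pixel shader operations per second.
throughput_ratio = x1900_shader_ops_billion / x1800_shader_ops_billion

print(f"Shader processors:      {processor_ratio:.2f}x")
print(f"Theoretical shader ops: {throughput_ratio:.2f}x")
```

The processor count works out to exactly 3x, while the quoted operations-per-second figures come to about 2.77x. The article does not say how those operation counts were derived, so the gap between the two ratios is left unexplained here.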
The take-home message here is that the ATI X1900 series has roughly triple the shader performance of the previous generation of video cards at nearly identical launch prices. With 48 pixel shader processors available, game developers are able to use longer shaders to generate rich, vibrant scenes, and since more calculations can be done per pixel, the X1900 brings consumers one step closer to Cinematic 3D! A pair of X1900 video cards in CrossFire will have 96 shader processors, which makes the 32 found in the X1800 CrossFire platform look dated only months after it was released! To get a better understanding of the shader architecture, let’s take a look at the X1800.
Now look at the ATI X1900 series shader architecture.
Comparing the two pictures, you can see the difference in architecture between the R520 and R580. The X1900’s shader engine is similar to that of the X1800, but simply bigger and able to handle more data flow. Basically, ATI took a 4-cylinder engine and bolted another bank onto it, making it a V-8. This is something many auto makers have done to reduce development costs, and now we are seeing something similar in the GPU industry.
Shadow Map Acceleration and Fetch4
Texture lookups are a common operation in 3D rendering. Shadow mapping is a widely used class of techniques that places particular importance on texture filtering. This method of rendering shadows works by first rendering the scene from the point of view of a shadow-casting light source. You don’t see the results displayed, but the depth information is stored in a special shadow map texture. The scene is then rendered from the standard point of view, and each pixel is checked against the shadow map to determine whether there is an object between that pixel and the light source. If there is, the pixel is in shadow and can be darkened.
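The two passes described above can be sketched in a few lines. This is a simplified illustration, not ATI’s or any game engine’s implementation: it assumes a one-dimensional shadow map and a light shining straight down, and the function names, the scene, and the bias value are all hypothetical.

```python
def build_shadow_map(surfaces, size):
    """Pass 1: 'render' from the light, keeping the nearest depth per texel."""
    shadow_map = [float("inf")] * size
    for texel, depth in surfaces:
        shadow_map[texel] = min(shadow_map[texel], depth)
    return shadow_map


def in_shadow(shadow_map, texel, depth, bias=1e-3):
    """Pass 2: shadowed if something nearer to the light covers this texel."""
    # The small bias keeps a surface from shadowing itself through rounding error.
    return depth - bias > shadow_map[texel]


# Hypothetical scene: a floor at depth 10 under texels 0..7,
# and a box whose top sits at depth 4 over texels 2..4.
floor = [(texel, 10.0) for texel in range(8)]
box = [(texel, 4.0) for texel in range(2, 5)]
shadow_map = build_shadow_map(floor + box, size=8)

floor_under_box = in_shadow(shadow_map, texel=3, depth=10.0)
floor_in_open = in_shadow(shadow_map, texel=6, depth=10.0)
box_top = in_shadow(shadow_map, texel=3, depth=4.0)
print(floor_under_box, floor_in_open, box_top)
```

Only the floor directly under the box is reported as shadowed. Because each test returns a plain yes or no, the shadow boundary in this sketch is a hard edge.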
A limitation of shadow maps is that they create less realistic, hard-edged shadows. Techniques to soften the shadows often filter the shadow map, which can be done by taking multiple samples and then combining them in a pixel shader. Using a larger number of samples can result in higher quality shadows, but also requires a large number of texture lookups, which can impact performance. Dynamic branching can be used to improve the performance of the shadow rendering by detecting pixels that lie on or near shadow edges. Only those pixels need the full set of samples; the rest can use a cheap lookup to determine whether they are in or out of shadow.
To further improve this technique, the X1900 includes a new texture sampling feature called Fetch4. It works by exploiting the fact that most textures are composed of four-component color values (Red, Green, Blue, and Alpha/transparency). The texture units are designed to sample and filter all four components from one texture address simultaneously. However, when looking up textures with single-component values, like shadow maps, Fetch4 instead allows four values from adjacent addresses to be sampled simultaneously. This effectively increases the texture sampling rate by a factor of 4. With Ultra-Threading and Fetch4, the X1900 can render the more realistic looking soft shadows at speeds approaching those of hard-edged shadow mapping techniques.
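To make the lookup savings concrete, here is a rough software model of the idea. It is a sketch under stated assumptions, not a description of the hardware: the ShadowMap class, its fetch1 and fetch4 methods, the 4x4 filter window, and the corner-probe edge test are all hypothetical, and how Fetch4 is actually exposed to shaders is not modeled.

```python
class ShadowMap:
    """Single-channel depth texture that counts texture lookups."""

    def __init__(self, texels):
        self.texels = texels
        self.lookups = 0

    def _read(self, x, y):
        # Clamp coordinates to the texture edge.
        x = min(max(x, 0), len(self.texels[0]) - 1)
        y = min(max(y, 0), len(self.texels) - 1)
        return self.texels[y][x]

    def fetch1(self, x, y):
        """Ordinary lookup: one depth value per lookup."""
        self.lookups += 1
        return [self._read(x, y)]

    def fetch4(self, x, y):
        """Fetch4-style lookup: a 2x2 block of adjacent depth values per lookup."""
        self.lookups += 1
        return [self._read(x, y), self._read(x + 1, y),
                self._read(x, y + 1), self._read(x + 1, y + 1)]


def soft_shadow(shadow_map, x, y, depth, use_fetch4, bias=1e-3):
    """Fraction of the 4x4 texel window at (x, y) that occludes the pixel."""
    step = 2 if use_fetch4 else 1
    fetch = shadow_map.fetch4 if use_fetch4 else shadow_map.fetch1
    samples = []
    for dy in range(0, 4, step):
        for dx in range(0, 4, step):
            samples.extend(fetch(x + dx, y + dy))
    occluded = sum(1 for s in samples if depth - bias > s)
    return occluded / len(samples)


def soft_shadow_branching(shadow_map, x, y, depth, bias=1e-3):
    """Probe the window corners first; filter fully only near a shadow edge."""
    corners = [(0, 0), (3, 0), (0, 3), (3, 3)]
    probe = [depth - bias > shadow_map.fetch1(x + dx, y + dy)[0]
             for dx, dy in corners]
    if all(probe):
        return 1.0  # all corners shadowed: treat the pixel as fully in shadow
    if not any(probe):
        return 0.0  # all corners lit: treat the pixel as fully lit
    return soft_shadow(shadow_map, x, y, depth, use_fetch4=True, bias=bias)


# Hypothetical 8x8 shadow map: an occluder (depth 4) over the left half,
# open floor (depth 10) on the right.
texels = [[4.0 if x < 4 else 10.0 for x in range(8)] for y in range(8)]

plain = ShadowMap(texels)
shade_plain = soft_shadow(plain, 2, 2, depth=10.0, use_fetch4=False)

gathered = ShadowMap(texels)
shade_gathered = soft_shadow(gathered, 2, 2, depth=10.0, use_fetch4=True)

edge = ShadowMap(texels)
shade_edge = soft_shadow_branching(edge, 2, 2, depth=10.0)

lit = ShadowMap(texels)
shade_lit = soft_shadow_branching(lit, 5, 2, depth=10.0)

print(shade_plain, plain.lookups)
print(shade_gathered, gathered.lookups)
print(shade_edge, edge.lookups)
print(shade_lit, lit.lookups)
```

In this model, both filtering paths give the edge pixel the same shadow value of 0.5, but the Fetch4-style path gets there in 4 lookups instead of 16. The branching version spends 4 probe lookups on every pixel and pays for the full filter only where the probes disagree. A corner probe can miss a thin occluder inside the window, so treat it as an approximation chosen for brevity.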