AMD’s new family of DirectX 10 graphics cards is built on three chips: the R600, RV630, and RV610. We first reviewed the R600 at launch back in May as the Radeon HD 2900 XT and covered the basics of AMD’s DX10 technology then. The two new chips we’re looking at today are based on the R600, using the same internal logic but scaled back with less internal parallelism to make them smaller, cheaper cores.
The RV610 and RV630 share the R600’s unified shader architecture, which dynamically deploys on-chip computational resources to address the most pressing graphics problem at hand, whether that is pixel shading or vertex processing and manipulation. AMD’s new five-wide execution unit is the basic building block of this shader engine. Each of the five ALUs in this superscalar unit can execute a separate instruction, leading AMD to count them as five “stream processors.” This makes RV610 and RV630 more efficient than the outgoing Radeon X1300 and X1650 cards. Because they are DirectX 10-compliant, these mid-range parts can also perform a number of new tricks not present in the X1300 and X1650, including streaming data out of the shader core, which enables the geometry shader capability required by the DX10 specification.
| | HD 2400 Pro | HD 2400 XT | GeForce 8600 GTS | HD 2600 XT | HD 2900 XT |
|---|---|---|---|---|---|
| ROPs | 4 x2 | 4 x2 | 8 | 4 x2 | 16 x2 |
| Transistors | 180 M | 180 M | 289 M | 390 M | 700 M |
| Memory Size | 256 MB | 256 MB | 256 MB | 256 MB | 512 MB |
| Memory Bus Width | 64-bit | 64-bit | 128-bit | 128-bit | 512-bit |
| Core Clock | 525 MHz | 700 MHz | 675 MHz | 800 MHz | 742 MHz |
| Memory Clock | 400 MHz | 800 MHz | 1000 MHz | 1100 MHz | 825 MHz |
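To put the bus widths and memory clocks above into perspective, here is a rough sketch of how they translate into peak memory bandwidth for the Radeon cards. It assumes double-data-rate memory (two transfers per clock), which the table itself does not state, and the function name is ours, so treat the results as theoretical ceilings rather than measured figures.

```python
# Theoretical peak memory bandwidth from the spec table's bus width and memory clock.
# Assumption (not stated in the table): double-data-rate memory, two transfers per clock.
TRANSFERS_PER_CLOCK = 2

# (memory bus width in bits, memory clock in MHz), taken from the table above
cards = {
    "HD 2400 Pro": (64, 400),
    "HD 2400 XT": (64, 800),
    "HD 2600 XT": (128, 1100),
    "HD 2900 XT": (512, 825),
}


def peak_bandwidth_gbs(bus_width_bits, memory_clock_mhz):
    """Return peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * TRANSFERS_PER_CLOCK
    return bytes_per_transfer * transfers_per_second / 1e9


if __name__ == "__main__":
    for name, (bus_width, clock) in cards.items():
        print(f"{name}: {peak_bandwidth_gbs(bus_width, clock):.1f} GB/s")
```

Under that assumption, the gap between the 64-bit HD 2400 cards and the 512-bit HD 2900 XT comes almost entirely from bus width, since the memory clocks are in the same general range.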
Beyond the 3D graphics features, these new GPUs pack in some HD video playback capabilities that are not present in the HD 2900 XT. AMD touts the new video playback features as Avivo HD. The most prominent of them is dubbed UVD, for universal video decoder. UVD handles key portions of the decoding process for high-definition codecs like H.264 and VC-1, lowering CPU usage during playback of HD DVD and Blu-ray movies. These lower-end Radeon HDs also feature hardware acceleration of deinterlacing, vertical and horizontal scaling, and color correction. With considerably more power at its disposal, the HD 2900 XT handles these jobs in its shader core instead.
You may want to display your games or movies on a gloriously mammoth display, and the Radeon HD family has you covered there as well. All HD-family GPUs have dual-link DVI display outputs with support for HDCP, that wonderful bit of copy protection your video card must support if you want to play HD movies on a digital display. AMD has embedded the HDCP crypto keys directly in these GPUs, simplifying card design by removing the need for a separate crypto ROM. If your big screen has an HDMI connection, the Radeon HD series can use it via an HDMI adapter that plugs into a DVI port.
The original HD 2900 XT has four SIMD units, each of which has 16 five-wide execution units, for a total of 320 stream processors. As you can see in the middle of the diagram above, the HD 2600 XT has three SIMD units, and each of those has only eight execution units onboard. That adds up to 120 stream processors.
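The stream processor totals follow directly from AMD’s counting method: SIMD units times execution units per SIMD times the five ALUs in each execution unit. A quick sketch of that arithmetic, with names of our own choosing:

```python
# AMD counts each ALU in its five-wide execution unit as one "stream processor".
ALUS_PER_EXECUTION_UNIT = 5

# (SIMD units, execution units per SIMD), as described in the text
layouts = {
    "HD 2900 XT": (4, 16),
    "HD 2600 XT": (3, 8),
}


def stream_processors(simd_units, execution_units_per_simd):
    """Total stream processors for a given shader core layout."""
    return simd_units * execution_units_per_simd * ALUS_PER_EXECUTION_UNIT


if __name__ == "__main__":
    for name, (simds, units) in layouts.items():
        print(f"{name}: {stream_processors(simds, units)} stream processors")
```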
AMD has also scaled down the texturing and pixel output capabilities by reducing the number of texture processing units to two and leaving only a single render back-end. This means that the HD 2600 XT can filter eight texels and output four pixels per clock. That is a sizable deficit next to the GeForce 8600, which has nearly double the per-clock texturing and render back-end capacity. The memory bus on the HD 2600 XT is 128 bits wide, and while that is much narrower than the HD 2900 XT’s, it’s comparable to the rest of the cards in this price range. Even with all that cut, the HD 2600 packs in 390 million transistors, though that is still a far cry from the 700 million in the HD 2900 XT.
To bring down the cost of the HD 2400 series, AMD trimmed the chip to just two shader SIMDs with four execution units each, or 40 stream processors. AMD left only one texture unit and one render back-end, so the chip can filter four texels and write out four pixels per clock, and it talks to memory over a 64-bit bus. This little guy packs in only 180 million transistors! Though it may be the smallest of the group, it still retains the full DX10 feature set, although multisampled AA is capped at 4x. You can, however, set up custom tent filters that can improve performance and image quality.
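For a rough sense of what those per-clock numbers mean, the sketch below multiplies them by each card’s core clock from the spec table. The four-texels-per-texture-unit and four-pixels-per-back-end figures are inferred from the counts quoted in the text, and the results are theoretical peaks, not measured throughput.

```python
# Per-clock throughput inferred from the text: two texture units filter eight
# texels, and one render back-end writes four pixels.
TEXELS_PER_TEXTURE_UNIT = 4
PIXELS_PER_RENDER_BACKEND = 4

# (texture units, render back-ends, core clock in MHz)
cards = {
    "HD 2400 Pro": (1, 1, 525),
    "HD 2400 XT": (1, 1, 700),
    "HD 2600 XT": (2, 1, 800),
}


def per_clock(texture_units, render_backends):
    """Return (texels filtered, pixels written) per clock."""
    return (texture_units * TEXELS_PER_TEXTURE_UNIT,
            render_backends * PIXELS_PER_RENDER_BACKEND)


def peak_rates(texture_units, render_backends, core_clock_mhz):
    """Return theoretical peak (Mtexels/s, Mpixels/s) at the given core clock."""
    texels, pixels = per_clock(texture_units, render_backends)
    return texels * core_clock_mhz, pixels * core_clock_mhz


if __name__ == "__main__":
    for name, spec in cards.items():
        mtexels, mpixels = peak_rates(*spec)
        print(f"{name}: {mtexels} Mtexels/s, {mpixels} Mpixels/s")
```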
Now that we’ve had a chance to get to know the technology, let’s see how these cards perform!